Skip to content

Instantly share code, notes, and snippets.

@balouf
Last active May 17, 2022 20:42
Show Gist options
  • Select an option

  • Save balouf/e78f05a15c36a55aefe18ef5157ec4d9 to your computer and use it in GitHub Desktop.

Select an option

Save balouf/e78f05a15c36a55aefe18ef5157ec4d9 to your computer and use it in GitHub Desktop.
Introduction to submodular functions
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"id": "b430c492",
"cell_type": "markdown",
"source": "# Introduction to submodular functions"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Fabien Mathieu\n\n<img src=\"https://www.swapcard.com/static/LogoFINAL-8700cd1c4ccb0b24abfdaf26e5b67b4f.svg\">"
},
{
"metadata": {
"cell_style": "center",
"slideshow": {
"slide_type": "slide"
}
},
"id": "0207aa7b",
"cell_type": "markdown",
"source": "## Original problem"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "dac746bb",
"cell_type": "markdown",
"source": "# Cashing Checks"
},
{
"metadata": {},
"id": "52982b06",
"cell_type": "markdown",
"source": "Location of Bank Accounts to Optimize Float: an Analytic Study of Exact and Approximate Algorithms (Management Science, 1977).\n\nhttps://pubsonline.informs.org/doi/pdf/10.1287/mnsc.23.8.789"
},
{
"metadata": {},
"id": "3532c68e",
"cell_type": "markdown",
"source": "*The number of days required to clear a check drawn on a bank in city $j$ depends on the city $i$ in which the check is cashed. (...) We will discuss the problem of optimally allocating bank account to maximize clearing times.*"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "88b1d4af",
"cell_type": "markdown",
"source": "# Cashing Checks: What Would You Do?"
},
{
"metadata": {},
"id": "f02943fc",
"cell_type": "markdown",
"source": "<center><img src=\"https://www.whereig.com/usa/maps/usa-states-us-map.jpg\" align=\"center\"/></center>"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "8ec021f5",
"cell_type": "markdown",
"source": "# Cashing Checks: Toy Model"
},
{
"metadata": {
"cell_style": "split"
},
"id": "51297636",
"cell_type": "markdown",
"source": "Consider a simple abstraction of the problem:\n- A ring of 12 cities\n- 3 bank accounts at most\n- Delay = ring-distance\n- Where to open the banks?\n- $0$ account $\\rightarrow$ $0$ delay"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:27.479549Z",
"end_time": "2022-05-17T20:27:29.560278Z"
},
"cell_style": "split",
"hide_input": true,
"init_cell": true,
"trusted": true
},
"id": "b7dae624",
"cell_type": "code",
"source": "import numpy as np\nn = 12\nlocations = 1+np.arange(n)\nimport stochastic_matching as sm\nfrom stochastic_matching.display import VIS_OPTIONS\nVIS_OPTIONS['interaction']['navigationButtons'] = False\n# VIS_OPTIONS['nodes'] = {'font': {'face': \"Domestic Manners\", 'size': 80}}\n# VIS_OPTIONS['edges'] = {'font': {'face': \"Domestic Manners\", 'size': 25}}\nVIS_OPTIONS['width'] = 350\nVIS_OPTIONS['height'] = 350\n\ncircle = sm.Cycle(n)\ncircle.names = [str(l) for l in locations]\nnodes_info = [{'color': {'background': 'white'}} for i in range(n)]\ncircle.show_graph(nodes_info=nodes_info)",
"execution_count": 1,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"5d582162-bcc8-449a-a847-ea8bf9f0f2d1\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"white\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"white\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"white\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"white\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"white\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('5d582162-bcc8-449a-a847-ea8bf9f0f2d1');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.562912Z",
"end_time": "2022-05-17T20:27:29.575410Z"
},
"init_cell": true,
"slideshow": {
"slide_type": "skip"
},
"trusted": true
},
"id": "a470010f",
"cell_type": "code",
"source": "def select(i, candidates):\n if i in candidates:\n return 'red'\n else:\n return 'white'\n \ndef total_delay(candidates):\n k = len(candidates)\n if k==0:\n return 0\n candidates = np.array(candidates)\n diff = np.abs(np.ones( (k, 1), dtype=int ) @ locations.reshape(1, n) \n - candidates.reshape(k, 1) @ np.ones((1, n) , dtype=int))\n dist = np.minimum(diff, n-diff)\n return np.sum(np.max(dist, axis=0))\n\n \n\ndef display(candidates):\n nodes_info = [{'color': {'background': select(i, candidates)}} for i in range(1, 13)]\n circle.show_graph(nodes_info=nodes_info)\n print(f\"Total delay: {total_delay(candidates)} days.\")",
"execution_count": 2,
"outputs": []
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "ef394088",
"cell_type": "markdown",
"source": "# (One) optimal solution"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.577236Z",
"end_time": "2022-05-17T20:27:29.606476Z"
},
"init_cell": true,
"trusted": true
},
"id": "de122aab",
"cell_type": "code",
"source": "display([1, 5, 9])",
"execution_count": 3,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"eda70496-8052-4eb6-8fb3-468b43499b74\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"red\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"red\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"white\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"red\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"white\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('eda70496-8052-4eb6-8fb3-468b43499b74');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
},
{
"output_type": "stream",
"text": "Total delay: 60 days.\n",
"name": "stdout"
}
]
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "2946b5ba",
"cell_type": "markdown",
"source": "# (One) greedy solution"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.609218Z",
"end_time": "2022-05-17T20:27:29.621888Z"
},
"cell_style": "split",
"init_cell": true,
"trusted": true
},
"id": "702088ba",
"cell_type": "code",
"source": "display([1])",
"execution_count": 4,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"5b6d45f8-6476-4da7-b0d7-04d24bef30bc\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"red\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"white\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"white\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"white\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"white\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('5b6d45f8-6476-4da7-b0d7-04d24bef30bc');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
},
{
"output_type": "stream",
"text": "Total delay: 36 days.\n",
"name": "stdout"
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.624253Z",
"end_time": "2022-05-17T20:27:29.637376Z"
},
"cell_style": "split",
"init_cell": true,
"trusted": true
},
"id": "87cf30ea",
"cell_type": "code",
"source": "display([1, 7])",
"execution_count": 5,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"2e6d26e6-b4dc-4581-b951-f741870edf52\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"red\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"white\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"red\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"white\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"white\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('2e6d26e6-b4dc-4581-b951-f741870edf52');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
},
{
"output_type": "stream",
"text": "Total delay: 54 days.\n",
"name": "stdout"
}
]
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "8d6cd160",
"cell_type": "markdown",
"source": "# (One) greedy solution"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.639832Z",
"end_time": "2022-05-17T20:27:29.652718Z"
},
"cell_style": "split",
"init_cell": true,
"trusted": true
},
"id": "d9947ca1",
"cell_type": "code",
"source": "display([1, 7])",
"execution_count": 6,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"fe6005eb-782a-4c49-bc19-485e3ce64f34\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"red\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"white\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"red\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"white\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"white\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('fe6005eb-782a-4c49-bc19-485e3ce64f34');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
},
{
"output_type": "stream",
"text": "Total delay: 54 days.\n",
"name": "stdout"
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:27:29.654061Z",
"end_time": "2022-05-17T20:27:29.668626Z"
},
"cell_style": "split",
"init_cell": true,
"trusted": true
},
"id": "3c0ae845",
"cell_type": "code",
"source": "display([1, 7, 10])",
"execution_count": 7,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<IPython.core.display.HTML object>",
"text/html": "\n<div id=\"39028ab3-4e43-42b6-8984-98c3fd4250df\"></div>\n<script>\nrequire.config({\n paths: {\n vis: 'https://unpkg.com/vis-network/standalone/umd/vis-network.min'\n }\n});\nrequire(['vis'], function(vis){\nvar nodes = [{\"id\": 0, \"label\": \"1\", \"title\": \"0: 1\", \"color\": {\"background\": \"red\"}}, {\"id\": 1, \"label\": \"2\", \"title\": \"1: 2\", \"color\": {\"background\": \"white\"}}, {\"id\": 2, \"label\": \"3\", \"title\": \"2: 3\", \"color\": {\"background\": \"white\"}}, {\"id\": 3, \"label\": \"4\", \"title\": \"3: 4\", \"color\": {\"background\": \"white\"}}, {\"id\": 4, \"label\": \"5\", \"title\": \"4: 5\", \"color\": {\"background\": \"white\"}}, {\"id\": 5, \"label\": \"6\", \"title\": \"5: 6\", \"color\": {\"background\": \"white\"}}, {\"id\": 6, \"label\": \"7\", \"title\": \"6: 7\", \"color\": {\"background\": \"red\"}}, {\"id\": 7, \"label\": \"8\", \"title\": \"7: 8\", \"color\": {\"background\": \"white\"}}, {\"id\": 8, \"label\": \"9\", \"title\": \"8: 9\", \"color\": {\"background\": \"white\"}}, {\"id\": 9, \"label\": \"10\", \"title\": \"9: 10\", \"color\": {\"background\": \"red\"}}, {\"id\": 10, \"label\": \"11\", \"title\": \"10: 11\", \"color\": {\"background\": \"white\"}}, {\"id\": 11, \"label\": \"12\", \"title\": \"11: 12\", \"color\": {\"background\": \"white\"}}];\nvar edges = [{\"title\": \"0: (1, 2)\", \"label\": \"\", \"from\": 0, \"to\": 1}, {\"title\": \"1: (1, 12)\", \"label\": \"\", \"from\": 0, \"to\": 11}, {\"title\": \"2: (2, 3)\", \"label\": \"\", \"from\": 1, \"to\": 2}, {\"title\": \"3: (3, 4)\", \"label\": \"\", \"from\": 2, \"to\": 3}, {\"title\": \"4: (4, 5)\", \"label\": \"\", \"from\": 3, \"to\": 4}, {\"title\": \"5: (5, 6)\", \"label\": \"\", \"from\": 4, \"to\": 5}, {\"title\": \"6: (6, 7)\", \"label\": \"\", \"from\": 5, \"to\": 6}, {\"title\": \"7: (7, 8)\", \"label\": \"\", \"from\": 6, \"to\": 7}, {\"title\": \"8: (8, 9)\", \"label\": \"\", \"from\": 7, \"to\": 8}, {\"title\": \"9: (9, 10)\", \"label\": \"\", \"from\": 8, \"to\": 9}, {\"title\": \"10: (10, 11)\", \"label\": \"\", \"from\": 9, \"to\": 10}, {\"title\": \"11: (11, 12)\", \"label\": \"\", \"from\": 10, \"to\": 11}];\nvar data= {\n nodes: nodes,\n edges: edges,\n};\nvar options = {\"interaction\": {\"navigationButtons\": false}, \"width\": 350, \"height\": 350};\nvar container = document.getElementById('39028ab3-4e43-42b6-8984-98c3fd4250df');\nvar network = new vis.Network(container, data, options);\nnetwork.fit({\n maxZoomLevel: 1000});\n});\n</script>\n"
},
"metadata": {}
},
{
"output_type": "stream",
"text": "Total delay: 59 days.\n",
"name": "stdout"
}
]
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "4f4a587b",
"cell_type": "markdown",
"source": "# 59<60"
},
{
"metadata": {},
"id": "9da88f1e",
"cell_type": "markdown",
"source": "- Greedy is less efficient but much easier to compute\n- In the toy model, the loss is acceptable\n- Is it true in general?\n- General answer with submodular functions"
},
{
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"id": "e4ba4686",
"cell_type": "markdown",
"source": "## Examples of submodular problems"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "d5d15930",
"cell_type": "markdown",
"source": "# Diversity / max-cover"
},
{
"metadata": {
"cell_style": "split"
},
"id": "0fdd459f",
"cell_type": "markdown",
"source": "Max-cover:\n- A set $\\Omega$ and a set $A$ of subsets of $\\Omega$\n- For given $k$, find $\\underset{B \\subset A, |B|\\leq k}{\\operatorname{argmax}}|\\cup_{b\\in B}b|$"
},
{
"metadata": {
"cell_style": "split",
"slideshow": {
"slide_type": "fragment"
}
},
"id": "c7b57d69",
"cell_type": "markdown",
"source": "Diversity:\n- Set of documents that contain some information\n- For given $k$, find $k$ documents with maximal total information\n- e.g. give me 3 documents that will maximize my knowledge on submodularity"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "d06b8f79",
"cell_type": "markdown",
"source": "# Clustering"
},
{
"metadata": {},
"id": "4688f229",
"cell_type": "markdown",
"source": "- A set $X$ of points in some space\n- For given $k$, find $\\underset{Y \\subset X, |Y|\\leq k}{\\operatorname{argmin}}\\sum_{x\\in X} d(x, Y)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "f9a6cc26",
"cell_type": "markdown",
"source": "# Early adoption (independent cascade model)"
},
{
"metadata": {},
"id": "dbd53a13",
"cell_type": "markdown",
"source": "- Oriented graph $G=(V, E)$ with edge weights $0\\leq p_{i, j}\\leq 1$\n- Some nodes may become contaminated\n- When node $i$ gets contaminated, it contaminates neighbor $j$ with probability $p_{i, j}$\n- One-shot: each edge is tried at most once\n- Goal: find $k$ nodes to contaminate that maximize the expected contamination"
},
{
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"id": "b35e755b",
"cell_type": "markdown",
"source": "## Definition"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "52f64f35",
"cell_type": "markdown",
"source": "# Definition"
},
{
"metadata": {},
"id": "7ec19adb",
"cell_type": "markdown",
"source": "We consider a set function $f$ on a *finite* set $\\Omega$, $|\\Omega|=n$."
},
{
"metadata": {},
"id": "e41bff7f",
"cell_type": "markdown",
"source": "Intuition to capture:\n- The gain of choosing a new item decreases with the number of items already picked\n- Alternative intuition: no combo (e.g. pair of shoes)"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "182f933f",
"cell_type": "markdown",
"source": "# Definition"
},
{
"metadata": {
"slideshow": {
"slide_type": ""
}
},
"id": "09296b8b",
"cell_type": "markdown",
"source": "Formal, equivalent, definitions of submodularity: \n1. $\\forall S \\subset T, x\\in \\bar{T}$, we have $$f(S\\cup \\{x\\})-f(S)\\geq f(T\\cup \\{x\\})-f(T)$$\n1. $\\forall S, T$, we have $$f(S)+f(T)\\geq f(S\\cup T)+f(S\\cap T)$$\n1. $\\forall S$, $\\forall x_1, x_2 \\in \\bar{S}$, $x_1\\neq x_2$, we have $$f(S\\cup \\{x_1\\})+f(S\\cup \\{x_2\\})\\geq f(S\\cup \\{x_1,x_2\\})+f(S)$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "304cd0a9",
"cell_type": "markdown",
"source": "# Equivalence: 2$\\rightarrow$3"
},
{
"metadata": {
"hide_input": false
},
"id": "bfd21080",
"cell_type": "markdown",
"source": "- Start: $f(X)+f(Y)\\geq f(X\\cup Y)+f(X\\cap Y)$\n- Goal: $f(S\\cup \\{x_1\\})+f(S\\cup \\{x_2\\})\\geq f(S\\cup \\{x_1,x_2\\})+f(S)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "ebad341a",
"cell_type": "markdown",
"source": "$X = S\\cup\\{x_1\\}$, $Y=S\\cup\\{x_2\\}$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "725eddd5",
"cell_type": "markdown",
"source": "$\\rightarrow f(S\\cup \\{x_1\\})+f(S\\cup \\{x_2\\})\\geq f(S\\cup \\{x_1,x_2\\})+f(S)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "89f39c33",
"cell_type": "markdown",
"source": "# Equivalence: 3$\\rightarrow$1"
},
{
"metadata": {},
"id": "bdb80b15",
"cell_type": "markdown",
"source": "- Start: $f(S\\cup \\{x_1\\})+f(S\\cup \\{x_2\\})\\geq f(S\\cup \\{x_1,x_2\\})+f(S)$\n- Goal: $f(S\\cup \\{x\\})-f(S)\\geq f(T\\cup \\{x\\})-f(T)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "ba495115",
"cell_type": "markdown",
"source": "Let $S\\subsetneq T \\subset \\Omega$, $\\{x_1, x_2, ..., x_i\\}=T\\setminus S$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "9a93dd85",
"cell_type": "markdown",
"source": "$$\n\\begin{split}\nf(S\\cup \\{x\\}) - f(S) &\\geq f(S\\cup \\{x, x_1\\}) - f(S\\cup \\{x_1\\})\\\\\n &\\geq f(S\\cup \\{x, x_1, x_2\\}) - f(S\\cup \\{x_1, x_2\\})\\geq...\\\\\n &\\geq f(T\\cup \\{x\\}) - f(T)\n\\end{split}$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "4c64bc4e",
"cell_type": "markdown",
"source": "# Equivalence: 1$\\rightarrow$2"
},
{
"metadata": {},
"id": "400775c1",
"cell_type": "markdown",
"source": "- Start: $f(S\\cup \\{x\\})-f(S)\\geq f(T\\cup \\{x\\})-f(T)$\n- Goal: $f(S)+f(T)\\geq f(S\\cup T)+f(S\\cap T)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "23f4c014",
"cell_type": "markdown",
"source": "Let $\\{x_1, x_2, ..., x_i\\}=T\\setminus S$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "a6af9267",
"cell_type": "markdown",
"source": "$$\n\\begin{split}\nf( (S \\cap T) \\cup \\{x_1\\}) - f( S \\cap T) &\\geq f( S \\cup \\{x_1\\}) - f(S)\\\\\n&\\ldots \\\\\nf(T) - f(T\\setminus \\{x_i\\}) &\\geq f(S\\cup T) - f((S\\cup T)\\setminus \\{x_i\\})\n\\end{split}\n$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "2efce7ed",
"cell_type": "markdown",
"source": "$\\rightarrow f(T) - f(S\\cap T) \\geq f(S\\cup T) - f(S)$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "176fe290",
"cell_type": "markdown",
"source": "# Additional definitions"
},
{
"metadata": {},
"id": "c793b8d8",
"cell_type": "markdown",
"source": "- Non-negative: $\\forall S \\subset \\Omega, f(S)\\geq 0$.\n- Monotone: $\\forall S \\subset T \\subset \\Omega, f(S)\\leq f(T)$\n- Symmetric: $\\forall S \\subset \\Omega, f(S) = f(\\Omega \\setminus S)$\n- Subadditive: $\\forall S, T s.t. S\\cap T=\\emptyset, f(S)+F(T) \\geq f(S\\cup T)$\n\nAll submodular examples we gave are (or can be rewritten as) non-negative monotone but this is not always the case."
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "a8b0c1ba",
"cell_type": "markdown",
"source": "# Side step: max-cut"
},
{
"metadata": {},
"id": "e80c8e5c",
"cell_type": "markdown",
"source": "- Undirected graph $G=(V, E)$\n- For $S\\subset V, f(S) = |\\{(u, v) \\in E \\text{ s.t. } u\\in S, v\\notin S\\}|$ (*cut* of $S$)\n- Symmetric: $f(S)=f(\\Omega\\setminus S)$\n- Non-monotone: if $(u, v)\\in E$, $f(\\{u\\})=\\operatorname{deg}(u)>f(\\Omega)=0$\n- Submodular"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "09c7f791",
"cell_type": "markdown",
"source": "# Proof of cut submodularity"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "e01dbc2e",
"cell_type": "markdown",
"source": "Let $S \\subset T \\subset V, u\\in V\\setminus T$"
},
{
"metadata": {
"cell_style": "split",
"slideshow": {
"slide_type": "fragment"
}
},
"id": "32bdf2b7",
"cell_type": "markdown",
"source": "Add $u$ to $S$:\n- Loose between $u$ and $S$\n- Gain between $u$ and $T\\setminus S$\n- Gain between $u$ and $\\Omega \\setminus T$\n\n"
},
{
"metadata": {
"cell_style": "split",
"slideshow": {
"slide_type": "fragment"
}
},
"id": "249cb3c5",
"cell_type": "markdown",
"source": "Add $u$ to $T$:\n- Loose between $u$ and $S$\n- Loose between $u$ and $T\\setminus S$\n- Gain between $u$ and $\\Omega \\setminus T$\n\n"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "9a478034",
"cell_type": "markdown",
"source": "Net loss going from $S$ to $T$: twice the edges between $u$ and $T\\setminus S \\geq 0$"
},
{
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"id": "b885f225",
"cell_type": "markdown",
"source": "## Approximation under cardinality constraints"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "1d75e08c",
"cell_type": "markdown",
"source": "# Approximation under cardinality constraints"
},
{
"metadata": {},
"id": "b5a32615",
"cell_type": "markdown",
"source": "- $f$: non-negative monotone submodular function\n- Goal: maximize $f(S)$ under constraint $|S|\\leq k$\n- Let $S^*$ denote an optimal solution\n- Greedy algorithm:\n - $S_0\\leftarrow \\emptyset$\n - $S_{i+1}\\leftarrow S_i\\cup \\{x\\}$, where $x$ maximizes $f(S_i\\cup \\{x\\})$\n- Theorem: $f(S_k)\\geq (1-\\frac{1}{e})f(S^*)$\n- Remark (no proof here): unless $P=NP$, we cannot do better in polynomial time"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "9ed4be26",
"cell_type": "markdown",
"source": "# Remark"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-05-17T20:34:23.628307Z",
"end_time": "2022-05-17T20:34:23.639709Z"
},
"trusted": true
},
"id": "25c169c1",
"cell_type": "code",
"source": "1-np.exp(-1)",
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 8,
"data": {
"text/plain": "0.6321205588285577"
},
"metadata": {}
}
]
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "c7e905ac",
"cell_type": "markdown",
"source": "# Proof"
},
{
"metadata": {},
"id": "c1572378",
"cell_type": "markdown",
"source": "We start with a lemma: for any $S$, we have:\n\n$$\\max_{x\\in \\Omega}f(S\\cup\\{x\\})-f(S) \\geq \\frac{1}{k}(f(S^*) - f(S))$$\n\nIntuition: we have some guarantee on what we gain at each step of a greedy algorithm."
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "e643c7cf",
"cell_type": "markdown",
"source": "# Lemma proof"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "e3c089fb",
"cell_type": "markdown",
"source": "Let $\\{x_1, \\ldots, x_p\\}$ denote $S^*\\setminus S $ ($p\\leq k$).\n\n$$\n\\begin{split}\nf(S^*) &\\leq f(S^*\\cup S)\\\\\n&= f(S) + \\sum_{i=1}^p \\left( f(S\\cup \\{x_1, \\ldots, x_i\\}) - f(S\\cup \\{x_1, \\ldots, x_{i-1}\\}) \\right) \\\\\n&\\leq f(S) + \\sum_{i=1}^p \\left( f(S\\cup \\{x_i\\}) - f(S) \\right)\\\\\n&\\leq f(S) + \\sum_{i=1}^p \\max_{x\\in \\Omega}(f(S\\cup \\{x\\}) - f(S))\\\\ &\\leq f(S) + k \\max_{x\\in \\Omega}(f(S\\cup \\{x\\}) - f(S))\n\\end{split}\n$$\n"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "b3f07438",
"cell_type": "markdown",
"source": "# Proof"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "df739974",
"cell_type": "markdown",
"source": "- We plug the lemma on $S_k$:\n$$f(S_k) \\geq \\frac 1 k f(S^*) + (1-\\frac 1 k )f(S_{k-1})$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"id": "b95e954f",
"cell_type": "markdown",
"source": "- We dive in:\n$$f(S_k)\\geq \\frac 1 k f(S^*) \\left(\\sum_{i=0}^{k-1} (1-\\frac 1 k)^i\\right) + (1-\\frac 1 k)^k f(\\emptyset)$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "b2357d3f",
"cell_type": "markdown",
"source": "# Proof"
},
{
"metadata": {
"slideshow": {
"slide_type": ""
}
},
"id": "aae0c308",
"cell_type": "markdown",
"source": "- We clean (hint: $1-x\\leq e^{-x}$):\n$$\n\\begin{split}\nf(S^*) &\\geq \\frac 1 k f(S^*) \\frac{1- (1-\\frac 1 k )^k}{1 - (1-\\frac 1 k )} + (1-\\frac 1 k)^k f(\\emptyset)\\\\ \n&\\geq \\frac 1 k f(S^*) \\frac{1- (1-\\frac 1 k )^k}{1 - (1-\\frac 1 k )} \\\\\n&= (1 - (1-\\frac 1 k)^k)f(S^*) \\\\\n&\\geq (1-\\frac 1 e)f(S^*)\n\\end{split}\n$$"
},
{
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"id": "8933f9a4",
"cell_type": "markdown",
"source": "# Conclusion "
},
{
"metadata": {},
"id": "bc59f12e",
"cell_type": "markdown",
"source": "Tons of variants:\n- Non-monotone\n- More complex constraints (e.g. knapsack)\n- Wide range of approximations (1/3, 1/2, ...)\n- Take-away: submodular $\\rightarrow$ greedy algorithms perform well\n- The bound is tight in *theory*, loose in *practice*"
},
{
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"id": "3194779f",
"cell_type": "markdown",
"source": "# Sources"
},
{
"metadata": {},
"id": "3d3f6994",
"cell_type": "markdown",
"source": "- https://en.wikipedia.org/wiki/Submodular_set_function\n- https://www.cs.cornell.edu/home/kleinber/kdd03-inf.pdf\n- https://swoh.web.engr.illinois.edu/courses/ie512/handout/submodular.pdf\n- https://people.orie.cornell.edu/dpw/orie6334/lecture23.pdf\n- https://www.cs.toronto.edu/~eidan/papers/submod-max.pdf\n- https://pubsonline.informs.org/doi/pdf/10.1287/mnsc.23.8.789\n- https://people.seas.harvard.edu/~yaron/AM221-S16/lecture_notes/AM221_lecture20.pdf\n- https://courses.engr.illinois.edu/cs598csc/sp2011/Lectures/lecture_6.pdf"
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"name": "python3",
"display_name": "Python 3 (ipykernel)",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.9.7",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"base_numbering": 1,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"gist": {
"id": "e78f05a15c36a55aefe18ef5157ec4d9",
"data": {
"description": "Introduction to submodular functions",
"public": true
}
},
"_draft": {
"nbviewer_url": "https://gist.github.com/e78f05a15c36a55aefe18ef5157ec4d9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment