@rwcitek
Last active November 2, 2025 04:03
riemann-zeta-zeros.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyORFPEuZSdNW1f41wRz2/kW",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/rwcitek/304aed3f2188f0dbf86dacc0085b0959/riemann-zeta-zeros.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# Fetching data from lmfdb.org\n",
"\n"
],
"metadata": {
"id": "Ar2PP0NDezJu"
}
},
{
"cell_type": "markdown",
"source": [
"## Setup\n"
],
"metadata": {
"id": "khTzki2FhU7G"
}
},
{
"cell_type": "code",
"source": [
"%%capture\n",
"%%bash\n",
"apt-get update\n",
"apt-get install -y elinks\n"
],
"metadata": {
"id": "F4qPgvr0hSDP"
},
"execution_count": 1,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## Browsing the site"
],
"metadata": {
"id": "sHFK79ozhcLo"
}
},
{
"cell_type": "markdown",
"source": [
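"Without a `human=1` cookie, the site returns a gate page instead of the requested content. In a browser, the page's JavaScript sets the cookie and then redirects back to the originally requested page; curl does not run JavaScript, so it only receives the gate page. Notice the cookie being set at line 7 of the output.\n"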
],
"metadata": {
"id": "Vuj70yDqe8M5"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"curl -L -s https://beta.lmfdb.org/data/riemann-zeta-zeros/ |\n",
"cat -n\n"
],
"metadata": {
"id": "Lmr4kdkDc0JW",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "dedaeef1-ab08-48f4-eb96-87ceab0d3f62"
},
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" 1\t<!DOCTYPE html>\n",
" 2\t<html>\n",
" 3\t<meta charset=\"UTF-8\">\n",
" 4\t<noscript><p>Enable JavaScript, then reload.</p></noscript>\n",
" 5\t<script>\n",
" 6\t // Set cookie (expires in 1 day)\n",
" 7\t document.cookie = \"human=1; path=/; max-age=86400\";\n",
" 8\t // Get the 'gateorig' parameter from URL\n",
" 9\t var params = new URLSearchParams(window.location.search);\n",
" 10\t if (params.has('gateorig')) {\n",
" 11\t var orig = params.get('gateorig');\n",
" 12\t params.delete('gateorig');\n",
" 13\t } else {\n",
" 14\t var orig = '/';\n",
" 15\t }\n",
" 16\t params = params.toString();\n",
" 17\t if (params) {\n",
" 18\t orig = orig + \"?\" + params;\n",
" 19\t }\n",
" 20\t console.log(\"leaving gate\");\n",
" 21\t // Redirect back to original page\n",
" 22\t window.location.href = orig;\n",
" 23\t</script>\n",
" 24\t</html>\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"Because curl does not run the JavaScript, set the cookie manually with `-b`."
],
"metadata": {
"id": "iurHiWv6fT7g"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"curl -b \"human=1\" -L -s https://beta.lmfdb.org/data/riemann-zeta-zeros/ |\n",
"head -20\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Jcp5O3ZVepYe",
"outputId": "87f4b024-dcc3-4160-eafc-38c9e2a69250"
},
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">\n",
"<html>\n",
" <head>\n",
" <title>Index of /data/riemann-zeta-zeros</title>\n",
" </head>\n",
" <body>\n",
"<h1>Index of /data/riemann-zeta-zeros</h1>\n",
" <table>\n",
" <tr><th valign=\"top\"><img src=\"/icons/blank.gif\" alt=\"[ICO]\"></th><th><a href=\"?C=N;O=D\">Name</a></th><th><a href=\"?C=M;O=A\">Last modified</a></th><th><a href=\"?C=S;O=A\">Size</a></th><th><a href=\"?C=D;O=A\">Description</a></th></tr>\n",
" <tr><th colspan=\"5\"><hr></th></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/back.gif\" alt=\"[PARENTDIR]\"></td><td><a href=\"/data/\">Parent Directory</a></td><td>&nbsp;</td><td align=\"right\"> - </td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/text.gif\" alt=\"[TXT]\"></td><td><a href=\"md5.txt\">md5.txt</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\">792K</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_14.dat\">zeros_14.dat</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\"> 57K</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_5000.dat\">zeros_5000.dat</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\">328K</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_26000.dat\">zeros_26000.dat</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\">4.1M</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_236000.dat\">zeros_236000.dat</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\">4.5M</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_446000.dat\">zeros_446000.dat</a></td><td align=\"right\">2016-06-01 15:59 </td><td align=\"right\"> 51M</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_2546000.dat\">zeros_2546000.dat</a></td><td align=\"right\">2016-06-01 15:58 </td><td align=\"right\"> 55M</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_4646000.dat\">zeros_4646000.dat</a></td><td align=\"right\">2016-06-01 16:01 </td><td align=\"right\"> 57M</td><td>&nbsp;</td></tr>\n",
"<tr><td valign=\"top\"><img src=\"/icons/unknown.gif\" alt=\"[ ]\"></td><td><a href=\"zeros_6746000.dat\">zeros_6746000.dat</a></td><td align=\"right\">2016-06-01 16:04 </td><td align=\"right\"> 58M</td><td>&nbsp;</td></tr>\n"
]
}
]
},
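{
"cell_type": "markdown",
"source": [
"The same request can also be made from Python. This is a minimal sketch using only the standard library; `BASE_URL`, `gated_request`, and `fetch` are illustrative names that are not part of the original notebook, and the sketch assumes the site keeps accepting the `human=1` cookie.\n",
"\n",
"```python\n",
"import urllib.request\n",
"\n",
"# Directory listing used throughout this notebook.\n",
"BASE_URL = 'https://beta.lmfdb.org/data/riemann-zeta-zeros/'\n",
"\n",
"\n",
"def gated_request(url, cookie='human=1'):\n",
"    # Build a request carrying the cookie that the gate page would set.\n",
"    return urllib.request.Request(url, headers={'Cookie': cookie})\n",
"\n",
"\n",
"def fetch(url, timeout=30):\n",
"    # Return the response body as bytes; urllib follows redirects by default.\n",
"    with urllib.request.urlopen(gated_request(url), timeout=timeout) as resp:\n",
"        return resp.read()\n",
"```\n",
"\n",
"Calling `fetch(BASE_URL)` is intended to mirror the `curl -b` call and return the directory listing rather than the gate page.\n"
],
"metadata": {}
},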
{
"cell_type": "markdown",
"source": [
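"Render the HTML as plain text using `elinks`. Despite its name, `rendered.html` holds the text dump, which ends with a numbered list of link targets."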
],
"metadata": {
"id": "ZinP-cfPk0NU"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"curl -b \"human=1\" -L -s https://beta.lmfdb.org/data/riemann-zeta-zeros/ |\n",
"elinks --dump > rendered.html\n",
"wc rendered.html\n"
],
"metadata": {
"id": "Myee8sRkhj8G",
"outputId": "0b5d9216-3e21-4221-8c56-cb243727ea14",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" 43761 116682 2440255 rendered.html\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
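"Extract the file names. Because the page was read from standard input, `elinks` appears to resolve the relative links against `/dev/stdin`, so each link target in the dump contains `//dev/`; filtering on that string selects the link list.\n"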
],
"metadata": {
"id": "DvJFsygPk3hl"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"cat rendered.html |\n",
"grep -F '//dev/' |\n",
"grep -o 'zeros_[0-9]*[.]dat' > data.files.txt\n",
"wc -l data.files.txt\n"
],
"metadata": {
"id": "jlmcbEtchj45",
"outputId": "76a58097-f7af-4ee8-87fb-db462a117361",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"14580 data.files.txt\n"
]
}
]
},
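{
"cell_type": "markdown",
"source": [
"As an alternative to dumping the page with `elinks`, the file names can be read from the `href` attributes of the raw listing. A minimal sketch, assuming the listing keeps the Apache index layout shown earlier; `DatLinkParser` and `list_dat_files` are hypothetical names introduced here.\n",
"\n",
"```python\n",
"import re\n",
"from html.parser import HTMLParser\n",
"\n",
"\n",
"class DatLinkParser(HTMLParser):\n",
"    # Collect href values that look like zeros_<number>.dat.\n",
"    def __init__(self):\n",
"        super().__init__()\n",
"        self.files = []\n",
"\n",
"    def handle_starttag(self, tag, attrs):\n",
"        if tag != 'a':\n",
"            return\n",
"        href = dict(attrs).get('href') or ''\n",
"        if re.fullmatch('zeros_[0-9]+[.]dat', href):\n",
"            self.files.append(href)\n",
"\n",
"\n",
"def list_dat_files(html):\n",
"    # Return the matching file names in the order they appear in the page.\n",
"    parser = DatLinkParser()\n",
"    parser.feed(html)\n",
"    return parser.files\n",
"```\n",
"\n",
"Sort links such as `?C=N;O=D` and other files such as `md5.txt` do not match the pattern, so only the data files are returned.\n"
],
"metadata": {}
},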
{
"cell_type": "markdown",
"source": [
"The cookie is also required to retrieve the individual files. Check the response headers of the first file with `-I`, then download it with `-O`.\n"
],
"metadata": {
"id": "fK7HZ1OGfglQ"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"data=$( head -1 data.files.txt )\n",
"curl -b \"human=1\" -L -s -I https://beta.lmfdb.org/data/riemann-zeta-zeros/${data}\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "s33T-j0-dTB2",
"outputId": "f2ae7c8b-5485-4935-bee7-d61bb6f19c1d"
},
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"HTTP/1.1 200 OK\r\n",
"Date: Sun, 02 Nov 2025 03:48:58 GMT\r\n",
"Server: Apache/2.4.52 (Ubuntu)\r\n",
"Last-Modified: Wed, 01 Jun 2016 19:58:06 GMT\r\n",
"ETag: \"e5b0-5343ce871af80\"\r\n",
"Accept-Ranges: bytes\r\n",
"Content-Length: 58800\r\n",
"Cache-Control: max-age=600\r\n",
"Expires: Sun, 02 Nov 2025 03:58:58 GMT\r\n",
"Vary: User-Agent\r\n",
"\r\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"data=$( head -1 data.files.txt )\n",
"curl -b \"human=1\" -L -O https://beta.lmfdb.org/data/riemann-zeta-zeros/${data}\n"
],
"metadata": {
"id": "k_D8Vwaag2Qz",
"outputId": "432133ec-dcfc-4d9a-fd8c-b43848016b66",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 58800 100 58800 0 0 198k 0 --:--:-- --:--:-- --:--:-- 198k\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"ls -la zeros*\n"
],
"metadata": {
"id": "VbEF_NSIg5H_",
"outputId": "92456acd-b42a-4630-8711-325a74fb093d",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 8,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"-rw-r--r-- 1 root root 58800 Nov 2 03:48 zeros_14.dat\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
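"Fetch the files. `--time-cond ${data}` compares against the local copy's modification time, so files that are already up to date are not downloaded again. Change or remove the `head -3` to fetch fewer, more, or all files.\n"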
],
"metadata": {
"id": "lI6TD1T2larK"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"cat data.files.txt |\n",
"head -3 |\n",
"while read -r data ; do\n",
" echo \"== ${data}\"\n",
" curl -b \"human=1\" -L -s --time-cond \"${data}\" -O \"https://beta.lmfdb.org/data/riemann-zeta-zeros/${data}\"\n",
"done\n"
],
"metadata": {
"id": "JuOdIzaqhBof",
"outputId": "3b93215d-b7c0-429d-c652-d0e91d684f79",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"== zeros_14.dat\n",
"== zeros_5000.dat\n",
"== zeros_26000.dat\n"
]
}
]
},
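{
"cell_type": "markdown",
"source": [
"When `--time-cond` is given a file name, curl sends an `If-Modified-Since` header derived from that file's modification time. A sketch of building the same header in Python, using only the standard library; `if_modified_since` is a hypothetical helper and not part of the original notebook.\n",
"\n",
"```python\n",
"import os\n",
"from email.utils import formatdate\n",
"\n",
"\n",
"def if_modified_since(path):\n",
"    # Return a header dict based on the local file's modification time,\n",
"    # or an empty dict when there is no local copy yet.\n",
"    if not os.path.exists(path):\n",
"        return {}\n",
"    mtime = os.path.getmtime(path)\n",
"    return {'If-Modified-Since': formatdate(mtime, usegmt=True)}\n",
"```\n",
"\n",
"A server that honours the header can answer `304 Not Modified` instead of sending the file again.\n"
],
"metadata": {}
},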
{
"cell_type": "code",
"source": [
"%%bash\n",
"ls -la zeros*\n"
],
"metadata": {
"id": "1jsue0c_l8K9",
"outputId": "0a4f764a-4227-4170-eccd-fd172ab79ec6",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"-rw-r--r-- 1 root root 58800 Nov 2 03:48 zeros_14.dat\n",
"-rw-r--r-- 1 root root 4264205 Nov 2 03:49 zeros_26000.dat\n",
"-rw-r--r-- 1 root root 335780 Nov 2 03:48 zeros_5000.dat\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"An alternative version that fetches 20 files, 4 at a time. Change the argument to `-P` to run fewer or more downloads in parallel, and change or remove the `head -20` to fetch fewer, more, or all files.\n"
],
"metadata": {
"id": "CRyagLEBoyrJ"
}
},
{
"cell_type": "code",
"source": [
"%%bash\n",
"cat data.files.txt |\n",
"head -20 |\n",
"xargs -P 4 -I{} -t \\\n",
" curl -b \"human=1\" -L -s --time-cond {} -O https://beta.lmfdb.org/data/riemann-zeta-zeros/{}\n"
],
"metadata": {
"id": "0steDTomop8n",
"outputId": "7dae72a2-d70e-4ee4-eca1-a3944c9513e7",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"curl -b 'human=1' -L -s --time-cond zeros_14.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_14.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_5000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_5000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_26000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_26000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_236000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_236000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_446000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_446000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_2546000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_2546000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_4646000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_4646000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_6746000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_6746000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_8846000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_8846000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_10946000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_10946000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_13046000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_13046000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_15146000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_15146000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_17246000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_17246000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_19346000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_19346000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_21446000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_21446000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_23546000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_23546000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_25646000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_25646000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_27746000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_27746000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_29846000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_29846000.dat\n",
"curl -b 'human=1' -L -s --time-cond zeros_31946000.dat -O https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_31946000.dat\n"
]
}
]
},
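{
"cell_type": "markdown",
"source": [
"The `xargs -P 4` pattern corresponds to a small worker pool. A sketch with `concurrent.futures`; `fetch_all` and its `fetch_one` callback are hypothetical names, and `fetch_one` stands in for whatever function downloads a single file.\n",
"\n",
"```python\n",
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"\n",
"def fetch_all(names, fetch_one, workers=4):\n",
"    # Run fetch_one on every name with at most `workers` calls in flight,\n",
"    # and return a dict mapping each name to its result.\n",
"    with ThreadPoolExecutor(max_workers=workers) as pool:\n",
"        return dict(zip(names, pool.map(fetch_one, names)))\n",
"```\n",
"\n",
"Threads suit this job because the work is dominated by waiting on the network; `workers` plays the role of the `-P` argument.\n"
],
"metadata": {}
},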
{
"cell_type": "code",
"source": [
"%%bash\n",
"ls -la zeros*\n"
],
"metadata": {
"id": "wzlcphLIrDIo",
"outputId": "ef00618c-d2d8-4da6-d3c8-e1db0dbf8e06",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": 12,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"-rw-r--r-- 1 root root 62863725 Nov 2 03:49 zeros_10946000.dat\n",
"-rw-r--r-- 1 root root 63566244 Nov 2 03:49 zeros_13046000.dat\n",
"-rw-r--r-- 1 root root 58800 Nov 2 03:48 zeros_14.dat\n",
"-rw-r--r-- 1 root root 64170537 Nov 2 03:49 zeros_15146000.dat\n",
"-rw-r--r-- 1 root root 64700937 Nov 2 03:49 zeros_17246000.dat\n",
"-rw-r--r-- 1 root root 65173500 Nov 2 03:49 zeros_19346000.dat\n",
"-rw-r--r-- 1 root root 65599672 Nov 2 03:49 zeros_21446000.dat\n",
"-rw-r--r-- 1 root root 65987651 Nov 2 03:49 zeros_23546000.dat\n",
"-rw-r--r-- 1 root root 4732868 Nov 2 03:49 zeros_236000.dat\n",
"-rw-r--r-- 1 root root 57630106 Nov 2 03:49 zeros_2546000.dat\n",
"-rw-r--r-- 1 root root 66343896 Nov 2 03:49 zeros_25646000.dat\n",
"-rw-r--r-- 1 root root 4264205 Nov 2 03:49 zeros_26000.dat\n",
"-rw-r--r-- 1 root root 66673011 Nov 2 03:49 zeros_27746000.dat\n",
"-rw-r--r-- 1 root root 66978979 Nov 2 03:49 zeros_29846000.dat\n",
"-rw-r--r-- 1 root root 67264823 Nov 2 03:49 zeros_31946000.dat\n",
"-rw-r--r-- 1 root root 53396943 Nov 2 03:49 zeros_446000.dat\n",
"-rw-r--r-- 1 root root 59608278 Nov 2 03:49 zeros_4646000.dat\n",
"-rw-r--r-- 1 root root 335780 Nov 2 03:48 zeros_5000.dat\n",
"-rw-r--r-- 1 root root 60983574 Nov 2 03:49 zeros_6746000.dat\n",
"-rw-r--r-- 1 root root 62024952 Nov 2 03:49 zeros_8846000.dat\n"
]
}
]
}
]
}