@hamoid
Last active January 25, 2026 21:37
Downloads a just-the-docs website and converts it into a PDF file for offline reading
#!/bin/bash
filename="openrndr-guide"
domain="guide.openrndr.org"
path="" # /some/folder/ if the guide is not located at /
mkdir -p /tmp/manual
cd /tmp/manual || exit
# curl downloads the index page of the website
# grep extracts the <nav> ... </nav> section
# sed(1) injects a line break in front of every URL and adds the full domain
# sed(2) deletes from each line the " character and everything that follows, leaving the clean URL
# tail deletes the first line, which contains a lonely <nav> tag
urlstr=$(curl -s "https://$domain$path" | grep -o -E '<nav .*</nav>' | sed "s/href=\"\//href=\"\nhttps:\/\/$domain\//g" | sed "s/\".*//g" | tail -n +2)
# convert a long string into an array
urls=($urlstr)
# count how many items in the array
length=${#urls[@]}
echo "Found $length URLs"
# one by one create NNNN.pdf files from each URL
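for (( i=0; i<length; i++ ));
do
  # show a 1-based page counter; file names stay 0-based and zero-padded
  echo "# Page $((i + 1)) of $length"
  padded=$(printf "%04d" "$i")
  # quote the arguments so unusual characters in a URL are not split or globbed
  wkhtmltopdf "${urls[$i]}" "$padded.pdf"
done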
date=$(date +"%F")
# finally join all the PDF files into one
pdfunite *.pdf "/tmp/$filename-$date.pdf"
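To see what the grep/sed/tail pipeline produces without touching the network, it can be run against a small hand-written snippet. This is only a sketch: `extract_urls` is a hypothetical helper name, the sample HTML is made up, and the `\n` in the sed replacement assumes GNU sed.

```shell
#!/bin/bash
# Hypothetical helper: the same grep/sed/tail pipeline as the script,
# but reading HTML from stdin instead of curl. Assumes GNU sed.
extract_urls() {
  local domain="$1"
  grep -o -E '<nav .*</nav>' \
    | sed "s/href=\"\//href=\"\nhttps:\/\/$domain\//g" \
    | sed "s/\".*//g" \
    | tail -n +2
}

# Made-up sample; a real just-the-docs index page is much longer.
sample='<nav class="site-nav"><a href="/intro/">Intro</a><a href="/drawing/shapes.html">Shapes</a></nav>'

printf '%s\n' "$sample" | extract_urls "guide.openrndr.org"
```

With GNU sed this should print one absolute URL per link, here https://guide.openrndr.org/intro/ and https://guide.openrndr.org/drawing/shapes.html. Only links whose href starts with / are picked up, and the whole <nav> element has to sit on a single line of the HTML for grep to match it.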
@hamoid
Author

hamoid commented Jun 16, 2022

Currently tied to guide.openrndr.org but can be adapted for other websites.

An example of the produced PDF can be downloaded from https://github.com/openrndr/openrndr-guide/blob/pdf/openrndr-guide.pdf

Not publishing quality, but probably train- or airplane-reading quality :-)

@hamoid
Author

hamoid commented Jan 24, 2024

Dependencies: wkhtmltopdf, curl, sed, pdfunite, grep
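A quick way to check for these before running the script might look like the sketch below. `missing_tools` is a hypothetical helper, not part of the gist; on Debian/Ubuntu, pdfunite is shipped in the poppler-utils package.

```shell
#!/bin/bash
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools wkhtmltopdf curl sed pdfunite grep)
if [ -n "$missing" ]; then
  # unquoted on purpose so the names end up on one line
  echo "Missing dependencies:" $missing >&2
  # a real script would probably stop here with: exit 1
fi
```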

@Boris0791

Boris0791 commented May 16, 2024

Hello Hamoid, thank you for your script. Is it possible to adapt it to this website: https://bp.veeam.com/sp/ ?
I can't work out how to adapt the URL. Could you send me an example?

@hamoid
Author

hamoid commented May 16, 2024

Hi @Boris0791 ! I updated the script. It works with the following configuration:

filename="veeam-guide"
domain="bp.veeam.com"
path="/sp/" # /some/folder/ if the guide is not located at /
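For reference, this is roughly how the three settings are used by the script, as far as I can tell from reading it. `start_url` and `output_file` are hypothetical helper names used only for this illustration.

```shell
#!/bin/bash
# Hypothetical helpers that mirror how the script combines its settings.

# The index page that curl downloads.
start_url() {
  local domain="$1" path="$2"
  echo "https://$domain$path"
}

# The final PDF written by pdfunite; the date is passed in as YYYY-MM-DD.
output_file() {
  local filename="$1" date="$2"
  echo "/tmp/$filename-$date.pdf"
}

start_url "bp.veeam.com" "/sp/"
output_file "veeam-guide" "$(date +"%F")"
```

Note that path is only used for the index page. The page URLs themselves are built from the href values found in the navigation, prefixed with the domain.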

@Boris0791

Hi @hamoid :-) perfect, thank you !

@nicolay-r

@hamoid, that works flawlessly, thank you!

It is also easily adapted for HTTP Basic Auth.

login="YOUR_LOGIN"
password="YOUR_PASSWORD"
# Fetching URLs
curl -s -u "$login:$password" # ....
# Fetching Content
wkhtmltopdf --username "$login" --password "$password" # ...
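One possible way to make the auth optional is to collect the extra arguments in bash arrays, so the same script works with and without credentials. This is a sketch, not something from the original script: `build_auth_args` and the environment variable names `GUIDE_LOGIN` and `GUIDE_PASSWORD` are made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: fill two global arrays with the auth arguments
# for curl and wkhtmltopdf, or leave them empty when no login is given.
build_auth_args() {
  local login="$1" password="$2"
  curl_auth=()
  wk_auth=()
  if [ -n "$login" ]; then
    curl_auth=(-u "$login:$password")
    wk_auth=(--username "$login" --password "$password")
  fi
}

# Assumption: credentials are passed in through these made-up variables.
build_auth_args "${GUIDE_LOGIN:-}" "${GUIDE_PASSWORD:-}"

# In the script, the two commands would then become:
#   curl -s "${curl_auth[@]}" "https://$domain$path" | ...
#   wkhtmltopdf "${wk_auth[@]}" "${urls[$i]}" "$padded.pdf"
```

Reading the credentials from the environment keeps them out of the script file, though they can still show up in the process list while wkhtmltopdf runs.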

@hamoid
Author

hamoid commented Jan 25, 2026

I'm happy to hear it worked well for you :-)

Thanks for the Auth tip! Great addition. No need to, but feel free to fork this gist and add Auth to it :-)

Cheers!
