@dderevjanik
Created January 20, 2017 00:12
[Python] Scraping a website in one line
import requests, re
# Crawl every listing page, fetch each offer, and append the extracted fields to jamal.txt.
[[ [open('jamal.txt', 'a', encoding='utf-8').write("\n".join(re.findall(r, txt, re.I|re.S|re.U)) + "\n") for r in ['b>ID:.*?b>(.*?)<b', 'b>.*?zverejnenia:.*?span.*?>(.*?)</span', 'b>Lokalita:.*?a.*?>(.*?)</', 'b>Poz.cia(?:(.*?))</div', 'b>Spo.*?:.*?<a.*?">(.*?)</a']] for txt in ((requests.get('http://www.PROC.sk/' + offer, headers={'User-agent': 'Mozilla/5.0'}).text) for offer in page)] for page in (re.findall('itemscope.*?href="(.*?)"', requests.get('http://www.PROC.sk/praca/?page_num=' + str(n), headers={'User-agent': 'Mozilla/5.0'}).text) for n in range(int(re.findall('page_num=(.*?)"', requests.get('http://www.PROC.sk/praca/', headers={'User-agent': 'Mozilla/5.0'}).text)[-2])))]
# Thanks to Python generator expressions. Without them, I wouldn't be able to write this ugly code.
# mini-contest
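
For anyone trying to follow what the one-liner does, here is a rough multi-line sketch of the same steps. The regular expressions and URL paths are copied from the one-liner. Everything else is an assumption made for illustration: the function names (`extract_offer_links`, `extract_fields`, `scrape`, `fetch_with_requests`), the field labels, and the `fetch`, `base_url` and `out_path` parameters. It also differs in two small ways: it appends to the output file and it skips fields that produced no match.

```python
import re

# Patterns copied from the one-liner; the dictionary keys are labels
# invented here for readability (assumption, not from the original).
FIELD_PATTERNS = {
    "id": r'b>ID:.*?b>(.*?)<b',
    "published": r'b>.*?zverejnenia:.*?span.*?>(.*?)</span',
    "location": r'b>Lokalita:.*?a.*?>(.*?)</',
    "position": r'b>Poz.cia(?:(.*?))</div',
    "company": r'b>Spo.*?:.*?<a.*?">(.*?)</a',
}
FLAGS = re.I | re.S | re.U


def extract_offer_links(listing_html):
    """Return the href of every offer found on one listing page."""
    return re.findall(r'itemscope.*?href="(.*?)"', listing_html)


def extract_fields(offer_html):
    """Return {label: list of matches} for one offer page."""
    return {
        label: re.findall(pattern, offer_html, FLAGS)
        for label, pattern in FIELD_PATTERNS.items()
    }


def scrape(fetch, base_url, out_path):
    """Walk all listing pages and append every matched field to out_path.

    `fetch` is any callable that maps a URL to the page text, so the
    parsing logic can be exercised without network access.
    """
    first_page = fetch(base_url + 'praca/')
    # Same heuristic as the one-liner: the second-to-last page_num link
    # is taken as the number of pages.
    page_count = int(re.findall(r'page_num=(.*?)"', first_page)[-2])
    with open(out_path, 'a', encoding='utf-8') as out:
        for n in range(page_count):
            listing = fetch(base_url + 'praca/?page_num=' + str(n))
            for link in extract_offer_links(listing):
                fields = extract_fields(fetch(base_url + link))
                for values in fields.values():
                    if values:  # skip fields with no match
                        out.write("\n".join(values) + "\n")


def fetch_with_requests(url):
    """Fetch a page with the same User-agent header as the one-liner."""
    import requests  # third-party; imported lazily so parsing works without it
    return requests.get(url, headers={'User-agent': 'Mozilla/5.0'}).text


# Hypothetical usage, mirroring the original URLs and file name:
# scrape(fetch_with_requests, 'http://www.PROC.sk/', 'jamal.txt')
```

Passing `fetch` in as a parameter is a design choice of this sketch: it keeps the regex logic testable against saved HTML instead of the live site.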