@matthewrobertbell
Created April 14, 2013 13:48
lists gonna list
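import collections

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

# Read the non-empty lines of urls.txt, one URL per line.
with open('urls.txt') as f:
    urls = [l.strip() for l in f if l.strip()]

# Group the URLs by domain.
data = collections.defaultdict(set)
for url in urls:
    domain = urlparse(url).netloc
    data[domain].add(url)

# Each pass takes one URL from every remaining domain, so no list
# contains two URLs from the same domain.
while data:
    current_list = []
    for k in list(data.keys()):  # copy the keys: entries are deleted below
        current_list.append(data[k].pop())
        if not data[k]:
            del data[k]
    print(current_list)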
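
The same grouping can be sketched as a generator, which makes it easy to try without a urls.txt file. The function name `batches_by_domain` and the sample URLs are hypothetical, chosen for illustration, and the sketch assumes Python 3. Because URLs are stored in sets, duplicate URLs collapse into one and the order within each list is arbitrary.

```python
import collections
from urllib.parse import urlparse


def batches_by_domain(urls):
    """Yield lists of URLs in which each domain appears at most once.

    Hypothetical helper: not part of the original gist.
    """
    by_domain = collections.defaultdict(set)
    for url in urls:
        by_domain[urlparse(url).netloc].add(url)

    while by_domain:
        batch = []
        # Iterate over a copy of the keys, since exhausted domains are deleted.
        for domain in list(by_domain):
            batch.append(by_domain[domain].pop())
            if not by_domain[domain]:
                del by_domain[domain]
        yield batch


if __name__ == "__main__":
    # Made-up sample input.
    sample = [
        "http://a.example/1",
        "http://a.example/2",
        "http://b.example/1",
    ]
    for batch in batches_by_domain(sample):
        print(batch)
```

The number of lists produced equals the largest number of distinct URLs held by any single domain, and later lists get shorter as smaller domains run out.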