OPML to HTML: Parsing a list of feeds
I’m a feed junkie. If you are searching for a simple method to convert OPML to HTML, chances are that you are too…
OPML as defined by Dave Winer way back in 2000:
OPML is an XML-based format that allows exchange of outline-structured information between applications running on different operating systems and environments.
More specifically, OPML is the format used by most RSS aggregators to spit out lists of feed subscriptions. That list usually contains the name of the feed, the feed’s URL, and the site’s URL.
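For illustration, a subscription list might look something like the sample below. The feed names and example.com/example.org URLs are made up, and the attribute names (title, xmlUrl, htmlUrl) are the ones aggregators commonly write, though not every exporter includes all of them. A minimal Python 3 sketch that pulls out the three pieces (list_feeds is just a hypothetical helper name):

```python
from xml.dom.minidom import parseString

# Made-up sample data; real exports vary in which attributes they include.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>Sample subscriptions</title></head>
  <body>
    <outline text="Example Blog" title="Example Blog" type="rss"
             xmlUrl="http://example.com/feed.xml" htmlUrl="http://example.com/"/>
    <outline text="Another Site" title="Another Site" type="rss"
             xmlUrl="http://example.org/rss" htmlUrl="http://example.org/"/>
  </body>
</opml>"""

def list_feeds(opml_text):
    """Return (name, feed URL, site URL) tuples, one per outline element."""
    dom = parseString(opml_text)
    feeds = []
    for node in dom.getElementsByTagName('outline'):
        # getAttribute returns an empty string when the attribute is missing.
        feeds.append((node.getAttribute('title'),
                      node.getAttribute('xmlUrl'),
                      node.getAttribute('htmlUrl')))
    return feeds

for name, feed_url, site_url in list_feeds(SAMPLE_OPML):
    print(name, feed_url, site_url)
```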
On one of my little sites, I wanted to keep a quick list of links showing the feeds I subscribed to on that particular subject. As I add feeds to my aggregator at a pretty good clip, updating the site by hand would be a tedious process. So, rather than use the tools available in my aggregator, I went ahead and wrote a little Python script to grab the OPML and convert it into an HTML stub.
# opml2html.py - sample code for converting OPML to HTML (Python 3)
from xml.dom.minidom import parse, parseString
from urllib.request import urlopen

# dom1 = parse('mah_links.opml')  # parse an XML file by name - uncomment if you want to draw from a file
dom1 = parseString(urlopen('http://share.opml.org/opml/top100.opml').read())  # use this to parse a feed
links = dom1.getElementsByTagName('outline')  # every <outline> element, at any depth

# The with block closes (and flushes) the file when the loop is done
with open('links.html', 'w', encoding='utf-8') as f:
    for link in links:
        # Link to the site's URL, using the feed's title as the link text
        linktext = '<a href="' + link.getAttribute('htmlUrl') + '">'
        linktext += link.getAttribute('title') + '</a><br />\n'
        print(linktext)
        f.write(linktext)
Pretty simple, eh? I included the method to parse a live feed too just for reference. By the way, if you’re looking for lists of feeds, Winer’s share.opml.org is great.
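One caveat with a script this simple: getElementsByTagName('outline') also returns folder outlines, which have no htmlUrl, and any & or < in a title or URL goes into the HTML unescaped. A slightly more defensive variant might look like the sketch below. It assumes Python 3, and the function name opml_to_html and the sample data are hypothetical, not part of the original script.

```python
from html import escape
from xml.dom.minidom import parseString

def opml_to_html(opml_text):
    """Build one <a> line per outline element that has an htmlUrl attribute."""
    dom = parseString(opml_text)
    lines = []
    for node in dom.getElementsByTagName('outline'):
        url = node.getAttribute('htmlUrl')
        if not url:
            # Folder/category outlines carry no site URL, so skip them.
            continue
        # Fall back to 'text', then to the URL, when 'title' is missing.
        label = node.getAttribute('title') or node.getAttribute('text') or url
        lines.append('<a href="%s">%s</a><br />\n'
                     % (escape(url, quote=True), escape(label)))
    return ''.join(lines)

# Made-up sample: one folder outline wrapping a single feed with no 'title'.
sample = ('<opml version="2.0"><body>'
          '<outline text="Tech">'
          '<outline text="Q&amp;A Blog" xmlUrl="http://example.com/feed" '
          'htmlUrl="http://example.com/?a=1&amp;b=2"/>'
          '</outline></body></opml>')
print(opml_to_html(sample))
```

The folder outline is dropped, and the ampersands in both the link text and the URL come out escaped as &amp;amp; in the generated HTML.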

[...] first one. Unfortunately by I can’t remember what the first one was. What I’m trying is OPML to HTML: Parsing a list of feeds then using that list to make a Google Custom Search Engine. Now to see if actually [...]
Thank you for the code, it worked beautifully for my little project.
Sweet! I wrote a similar search routine in my quick and dirty feed extractor.
http://www.fieldguidetoprogrammers.com/blog/python/feedextractor-a-quick-and-dirty-python-script-to-grab-lots-of-feeds-from-web-pages/