Over the weekend I decided to try out Planet Venus, a popular blog aggregation system. The end result is pretty neat and I really like how it works (though I must say it's the first blog aggregator I have ever looked into).
The documentation very sensibly insists on running all the unit tests before doing anything, which is where I encountered a couple of issues. One was a malformed HTML docs page, which I ignored; the other looked a bit more serious: a parsing error.
======================================================================
ERROR: test_content_tag_soup (tests.test_reconstitute.ReconstituteTest)
----------------------------------------------------------------------
<...snip traceback...>
HTMLParseError: malformed start tag, at line 1, column 14
What didn't make a difference
- Running the tests using Python 2.4, 2.5, and 2.6
- Reinstalling python-libxml2 (already installed, reinstall inspired by Venus devel mailing list archives)
- Reinstalling python-beautifulsoup
- Installing python-libxslt1, python-xml, python-lxml
When I debugged the exception a bit more closely, it looked like the usual BeautifulSoup issues that started occurring after the move to HTMLParser for Python 3.0 compatibility.
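For anyone wanting to check whether their install is in the affected release line, here is a minimal sketch. It assumes the HTMLParser-based releases are the 3.1.x series and that the version string comes from BeautifulSoup.__version__. The helper name in_htmlparser_series is made up for this example and is not part of Venus or BeautifulSoup.

```python
def in_htmlparser_series(version):
    """Return True if `version` belongs to the BeautifulSoup 3.1.x line.

    Assumption: 3.1.x is the release line built on HTMLParser.
    """
    parts = version.split(".")
    return parts[:2] == ["3", "1"]


# With BeautifulSoup 3.x installed, the string would come from
# BeautifulSoup.__version__; literals keep this sketch standalone.
print(in_htmlparser_series("3.1.0.1"))
print(in_htmlparser_series("3.0.8"))
```

This only narrows down the suspect version. It does not prove that a given HTMLParseError comes from BeautifulSoup.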
What worked, eventually
Warning: it's a hack :/ Some solutions suggest removing the python-beautifulsoup package. Unfortunately I am unable to try this out, as many things on my desktop depend on it.
This workaround, posted on the mailing list for a similar problem, did work for me though. It involves looking for "import BeautifulSoup" in planet/vendor/feedparser.py and disabling the statement, so that the except branch always runs and BeautifulSoup ends up set to None:
    try:
        raise Exception  # import BeautifulSoup
    except:
        BeautifulSoup = None
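The hack works because the import is treated as optional: code that sees None can skip the BeautifulSoup-dependent step. Below is a rough, standalone sketch of that pattern. The names optional_import, force_missing and extract_extras are hypothetical and only illustrate the idea; this is not the actual feedparser code.

```python
import importlib


def optional_import(name, force_missing=False):
    """Import `name` if possible; return None when it is absent or disabled."""
    if force_missing:
        # Same effect as the `raise Exception` hack: pretend the import failed.
        return None
    try:
        return importlib.import_module(name)
    except Exception:
        return None


def extract_extras(html, soup_module):
    """Hypothetical consumer that guards on the optional module."""
    if soup_module is None:
        return None  # feature silently skipped
    return soup_module.BeautifulSoup(html)


soup = optional_import("BeautifulSoup", force_missing=True)
print(extract_extras("<p>hello</p>", soup))  # prints None
```

The upside of a switch like force_missing is that the vendored file would not need hand-editing again after an upgrade, but that is a design suggestion on my part, not something Venus provides.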