- You can override fromEncoding in the constructor, which is very useful for non-Roman, non-standard web pages.
- Versatile find/findAll lookups by tag name and by attributes.
- Developer-friendly syntactic sugar: the Tag implements the interfaces of a string, a list, a dict and a callable, so there are many ways to access the data as you wish. The drawback of this approach is that a typo is only caught at run time instead of at compile time.
- Easy to deploy: there is only one BeautifulSoup.py file.
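To make the syntactic sugar point (and its drawback) concrete, here is a minimal toy sketch. `ToyTag` and its `find_all` method are hypothetical names I made up for illustration; this is not BeautifulSoup's actual Tag code, only an assumption about how one object can act like a string, a list, a dict and a callable at once.

```python
class ToyTag:
    """Hypothetical stand-in for a parsed tag; not BeautifulSoup's real Tag."""

    def __init__(self, name, attrs=None, children=None, text=""):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = list(children or [])
        self.text = text

    def find_all(self, name):
        """Return all descendants with the given tag name, depth first."""
        found = []
        for child in self.children:
            if child.name == name:
                found.append(child)
            found.extend(child.find_all(name))
        return found

    def __getitem__(self, key):    # dict-style access: tag["href"]
        return self.attrs[key]

    def __iter__(self):            # list-style access: for child in tag
        return iter(self.children)

    def __call__(self, name):      # callable access: tag("a")
        return self.find_all(name)

    def __getattr__(self, name):   # attribute-style access: tag.title
        # Only reached when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        matches = self.find_all(name)
        return matches[0] if matches else None

    def __str__(self):             # string-style access: str(tag)
        return self.text


link = ToyTag("a", attrs={"href": "http://example.com/"}, text="Example")
title = ToyTag("title", text="Hello")
doc = ToyTag("html", children=[
    ToyTag("head", children=[title]),
    ToyTag("body", children=[link]),
])

print(str(doc.title))        # attribute access, then string conversion
print(doc("a")[0]["href"])   # call, then list index, then dict access
print(doc.titel)             # typo: no error here, just None
```

In this sketch the misspelled `doc.titel` quietly evaluates to `None`, and the mistake only surfaces later, when something like `doc.titel.text` raises an AttributeError at run time.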
Some things I don’t like:
- The API does not support lazy, incremental parsing of a stream or file object. Laziness is always cherished for pipelining.
- Why the name BeautifulSoup? I have mistyped it as "soap" more than ten times.
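For the streaming point, this is roughly the kind of lazy pipeline I have in mind. It is only a sketch built on Python's standard `html.parser` module rather than on BeautifulSoup, and the names `LinkCollector` and `iter_links` are hypothetical. It reads a file object in small chunks and yields links as soon as they are parsed.

```python
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags as chunks are fed in."""

    def __init__(self):
        super().__init__()
        self.pending = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.pending.append(href)


def iter_links(fileobj, chunk_size=64):
    """Lazily yield hrefs from a text file object, one chunk at a time."""
    parser = LinkCollector()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        # Hand over whatever this chunk completed, then forget it.
        while parser.pending:
            yield parser.pending.pop(0)
    parser.close()
    while parser.pending:
        yield parser.pending.pop(0)


html = io.StringIO('<p><a href="/one">1</a> text <a href="/two">2</a></p>')
links = list(iter_links(html, chunk_size=8))
print(links)
```

Because `iter_links` is a generator, a downstream stage can start working on the first link before the rest of the document has been read.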