Recent versions of Safari show a "Reader" button in the address bar on certain web pages. Clicking it displays a text-only version of the article on the page, stripped of ads and any other content that is not part of the article. I would like to create a web app that does something similar when the user enters the URL of an online article (a New York Times article, for instance).
Does anyone have a guess as to whether this Safari feature is implemented in:
- A complex way, e.g. "grepping" through the article and following some algorithm to guess which tags to extract, etc.
- A simple way, e.g. accessing some sort of RSS or Atom feed that provides only the article text. From what I can tell, though, most of these feeds provide only short descriptions of articles and links to them, rather than the full text.
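To make the "complex way" concrete, here is a rough sketch of the kind of heuristic I have in mind. It is purely my own guess and not anything I know Safari to do: it credits each `<p>` to its nearest enclosing container element and keeps the container with the most paragraph text. The names `ParagraphScorer` and `extract_article` are made up for illustration, and the list of container tags is an assumption.

```python
from html.parser import HTMLParser

# Assumption: these tags are the likely wrappers around article text.
CONTAINERS = {"article", "main", "section", "div", "td", "body"}
SKIPPED = {"script", "style"}


class ParagraphScorer(HTMLParser):
    """Collects <p> text and credits it to the nearest open container."""

    def __init__(self):
        super().__init__()
        self.stack = []       # ids of the container elements currently open
        self.paragraphs = {}  # container id -> list of paragraph strings
        self.next_id = 0
        self.buffer = None    # text pieces of the <p> being read, else None
        self.skip_depth = 0   # greater than 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif tag in CONTAINERS:
            self._flush()
            self.stack.append(self.next_id)
            self.next_id += 1
        elif tag == "p":
            self._flush()     # a new <p> implicitly closes an open one
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "p":
            self._flush()
        elif tag in CONTAINERS and self.stack:
            self._flush()
            self.stack.pop()

    def handle_data(self, data):
        if self.buffer is not None and self.skip_depth == 0:
            self.buffer.append(data)

    def _flush(self):
        """Store the paragraph being read under the innermost container."""
        if self.buffer is None:
            return
        text = " ".join("".join(self.buffer).split())
        self.buffer = None
        if text and self.stack:
            self.paragraphs.setdefault(self.stack[-1], []).append(text)


def extract_article(html):
    """Return the paragraphs of the container holding the most <p> text."""
    scorer = ParagraphScorer()
    scorer.feed(html)
    scorer.close()
    scorer._flush()
    if not scorer.paragraphs:
        return ""
    best = max(scorer.paragraphs.values(),
               key=lambda ps: sum(len(p) for p in ps))
    return "\n\n".join(best)
```

Calling `extract_article(page_source)` on a page's HTML would return the winning block's paragraphs separated by blank lines. I would expect a real implementation to need much more than this (link density, class names like "comment" or "sidebar", and so on), which is partly why I am asking.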
Any thoughts?