Robot Exclusion For Feed Readers?
Scoble reported on the guy who doesn't want people to subscribe to his feeds through Bloglines. Scoble is also tracking the responses from the blog community.
My first thought was "God, what a stupid idea". But my second thought was "OK, maybe feed readers should respect robots.txt, like search engines do". I'm not the only one: Niall Kennedy already wrote about it, better than I could have done.
Niall mentions that:
"YahooFeedSeeker, the feed engine behind My Yahoo!, is currently the only aggregator requesting my robots.txt file. Mikel Maron's The World as a Blog requests my robots.txt before including my content in his application."
Interesting, since I didn't know that. So maybe I should update my robots.txt, which currently excludes all search robots from the feed directory... It doesn't make any sense to have your feed indexed by Google, does it?
However, the main problem with feeds and robots.txt is that feeds don't have a fixed location. For example, the feeds for comments on my posts are located in the directory [Permalink]/feed. There is no way of expressing this in robots.txt, since wildcards are not allowed and a Disallow value is only matched as a path prefix. That makes it impossible to define a rule like this:
User-agent: Bloglines
Disallow: /*/feed/
If this were possible, not only would the guy mentioned above be satisfied, but I would be too, since I'd have a simple way of preventing my feeds from being indexed by search engines.
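To make the difference concrete, here is a minimal Python sketch of the two matching behaviours. It is only an illustration: the function names, the sample permalink path, and the idea that "*" should match any run of characters are my own assumptions, not something the robots.txt standard or any particular crawler defines.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Plain robots.txt semantics: a Disallow value is a literal path prefix."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Hypothetical extension: '*' in the rule matches any run of characters."""
    # Escape the literal pieces and join them with '.*' where the '*' was.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path, so this stays prefix-style.
    return re.match(pattern, path) is not None


rule = "/*/feed/"
comment_feed = "/2005/01/some-post/feed/"  # made-up permalink, for illustration

print(prefix_match(rule, comment_feed))    # the literal '*' never matches
print(wildcard_match(rule, comment_feed))  # the wildcard reading does
```

With prefix matching the rule is compared character by character, so the "*" only ever matches a literal asterisk in the URL. Under the wildcard reading, the same rule would cover every [Permalink]/feed directory at once.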
Update: The link to Niall's article got lost somehow, added it.
