Robots.txt vs. FOIA

Scripting News: Kicking Ass, the DNC weblog, on robots.txt disabling of caches on White House pages about Iraq. Interesting point. Now would be an appopriate[sic] time to ask the Democrats if they will have a different policy should a Democrat be elected to the White House in 2004.

It is also time to look at how robots.txt should be used on government sites. Should web crawlers respect robots.txt files on government sites? The information is covered by the Freedom of Information Act. Perhaps bloggers should file requests for earlier copies of the documents, as they stood before the pages were changed.
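For readers unfamiliar with what "respecting" robots.txt means in practice, here is a minimal sketch of the check a well-behaved crawler makes before fetching a page. It uses Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user agent, and the `example.gov` URLs are hypothetical and chosen only for illustration; they are not the actual White House file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only (not the real White House file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/iraq/
Disallow: /search/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.gov/news/iraq/page.html",
        "https://www.example.gov/news/other.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The check is purely voluntary: nothing on the server enforces it, which is why the question of whether crawlers and archives should honor it on government sites is a policy question rather than a technical one.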

By contrast, the Colorado State robots.txt file includes comments explaining why each directory is excluded.

Should robots respect robots.txt files on government servers?


Comments

Oct 27, 2003

LOC

by Anonymous

Keeping a continual archive of government sites ought to be tasked to the Library of Congress. These need to be complete archives, robots.txt-be-damned.

-Ross (karchner.com/update/)

Oct 27, 2003

What changes?

by Anonymous

What documents have been changed? There is no evidence at all that any documents have been changed. This robots.txt change is likely a poor effort to deal with an out-of-control robot (note that many of the directories that are disallowed do not even exist -- they simply added /iraq/ to every existing directory. Likely an out-of-control robot was going through the site adding /iraq/ to every directory).

Oct 27, 2003

The changes

by Joshua Brauer

The site said at one time that the war with Iraq is over. Now it says major combat operations are over.
