
Wayback Machine

Wikicars, a place to share your automotive knowledge
Revision as of 07:10, 14 December 2006 by Red marquis (talk | contribs)
[Image: The logo of the Internet Archive]

The Internet Archive (IA) is a non-profit organization dedicated to maintaining an archive of Web and multimedia resources. Located at the Presidio in San Francisco, California, this archive includes "snapshots of the World Wide Web" (archived copies of pages, taken at various points in time), software, movies, books, and audio recordings (including recordings of live concerts from bands that allow it). To ensure the stability and endurance of the archive, a complete copy is also maintained at Bibliotheca Alexandrina. The Archive makes the collections available at no cost to researchers, historians, and scholars.

History

[Image: Internet Archive headquarters]

The Archive was founded by Brewster Kahle in 1996.

According to its website:

Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive's mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars. The Archive collaborates with institutions including the Library of Congress and the Smithsonian.

Because of its goal of preserving human knowledge and artifacts, and making its collection available to all, proponents of the archive have likened it to the Library of Alexandria.

Wayback Machine

The Archive also maintains the "Wayback Machine", which draws on content from Alexa Internet. This service lets users see archived versions of web pages, which the Archive calls a "three dimensional index".

Snapshots in the Wayback Machine's archive are made available gradually: it can take six to twelve months for an archived snapshot to appear. Librarians and scholars who want to archive material permanently and cite an archived version immediately can use the Archive-It system instead. As of 2006 the Wayback Machine contained almost two petabytes of data and was growing at a rate of 20 terabytes per month, a two-thirds increase over the growth rate of 12 terabytes per month reported in 2003. The amount of data added each month eclipses the amount of text contained in the world's largest libraries, including the Library of Congress.
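The growth figures quoted above can be checked with a short calculation. This is only an illustrative sketch: the function name `relative_increase` is made up for this example, and the two rates are the ones stated in the text (12 and 20 terabytes per month).

```python
from fractions import Fraction


def relative_increase(old_rate, new_rate):
    """Return the fractional increase from old_rate to new_rate."""
    return Fraction(new_rate - old_rate, old_rate)


# Rates taken from the article text, in terabytes per month.
RATE_2003 = 12
RATE_2006 = 20

# (20 - 12) / 12 reduces to 2/3, matching the "two thirds" in the text.
increase = relative_increase(RATE_2003, RATE_2006)
print(increase)
```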

The name "Wayback Machine" is a reference to a segment from The Rocky and Bullwinkle Show in which Mr. Peabody, a bow tie-wearing dog with a professorial air, and his human assistant, Sherman, use a time machine called the "WABAC machine" to witness famous events in history.

Media collections

Most of the Archive's movies, books, and recordings are in the public domain or licensed under a Creative Commons License. The audio section largely includes music from independent artists, as well as more established artists and musical ensembles with permissive rules regarding the recording of their concerts (e.g. The Grateful Dead, String Cheese Incident, Toad the Wet Sprocket, O.A.R., 311, and Fugazi).

Open Library

The Internet Archive operates the Open Library, where a small but steadily growing number of scanned public domain books are made available in an easily browsable and printable format.

Moving image collection

Aside from feature films, the Moving Image collection includes newsreels; classic cartoons; pro- and anti-war propaganda; Skip Elsheimer's "A.V. Geeks" collection; and ephemeral material from Prelinger Archives, such as advertising, educational, and industrial films and amateur and home movie collections.

The Brick Films collection contains stop-motion animation filmed with LEGO bricks, some of which are 'remakes' of feature films. The Election 2004 collection is a non-partisan public resource for sharing video materials related to the 2004 United States Presidential Election. The Independent News collection includes sub-collections such as the Internet Archive's World At War competition from 2001, in which contestants created short films demonstrating "why access to history matters." Among the most-downloaded video files are eyewitness recordings of the devastating 2004 tsunami.

Controversies

Scientology sites

In late 2002, the Internet Archive removed various sites critical of Scientology from the Wayback Machine. The error message stated that this was in response to a "request by the site owner". However, it was later clarified that lawyers from the Church of Scientology had demanded the removal, on unknown legal grounds, and that the actual site owners did not want their material removed.

Archived web pages as evidence

In an October 2004 case, "Telewizja Polska SA v. Echostar Satellite", a litigant attempted to use the Wayback Machine archives as a source of admissible evidence, perhaps for the first time. Telewizja Polska is the provider of TVP Polonia, and EchoStar operates the Dish Network. Before the trial proceedings, EchoStar indicated that it intended to offer Wayback Machine snapshots as proof of the past content of Telewizja Polska's website. Telewizja Polska brought a motion in limine to suppress the snapshots on the grounds of hearsay and unauthenticated source, but Magistrate Judge Arlander Keys rejected the hearsay argument and denied the motion. At the trial itself, however, District Court Judge Ronald Guzman overruled Magistrate Judge Keys' findings and held that neither the affidavit of the Internet Archive employee nor the underlying pages (i.e. the Telewizja Polska website) were admissible as evidence. Judge Guzman reasoned that the employee's affidavit contained both hearsay and inconclusive supporting statements, and that the purported webpage printouts were not self-authenticating.

Grateful Dead

In November 2005, free downloads of Grateful Dead concerts were removed from the site. John Perry Barlow identified Bob Weir, Mickey Hart, and Bill Kreutzmann as the instigators of the change, according to a New York Times article. Phil Lesh commented on the change in a November 30, 2005 posting to his personal website:

It was brought to my attention that all of the Grateful Dead shows were taken down from Archive.org right before Thanksgiving. I was not part of this decision making process and was not notified that the shows were to be pulled. I do feel that the music is the Grateful Dead's legacy and I hope that one way or another all of it is available for those who want it.

A November 30 forum post from Brewster Kahle summarized what appeared to be the compromise reached among the band members: audience recordings could be downloaded or streamed, but soundboard recordings were to be available for streaming only. The concerts have since been re-added.

Issues with cybersquatters

The Internet Archive's policy is to remove sites that disallow indexing by bots (through the use of robots.txt). As a result, a number of websites have become inaccessible through the Wayback Machine over the years: when a cybersquatter takes over a domain and places a robots.txt file that disallows indexing of the site, the earlier owner's archived pages become unavailable as well. This is detrimental to researchers looking for information that was available in the past. The administrators say they are working on a system that will allow access to the earlier material while excluding content from after the point at which the domain changed hands.
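The robots.txt mechanism described above can be illustrated with a minimal sketch using Python's standard `urllib.robotparser` module. It only shows how a blanket `Disallow: /` rule reads to a well-behaved crawler; it is not the Archive's actual exclusion logic. The crawler name `example-archive-bot`, the helper `is_blocked`, and the `example.com` URL are all hypothetical names chosen for this example.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, user_agent, url):
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# What a new domain owner might publish to shut out all crawlers.
ROBOTS_BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# An empty Disallow rule permits everything.
ROBOTS_ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Hypothetical crawler name and page, for illustration only.
AGENT = "example-archive-bot"
PAGE = "http://example.com/old-page.html"

print(is_blocked(ROBOTS_BLOCK_ALL, AGENT, PAGE))
print(is_blocked(ROBOTS_ALLOW_ALL, AGENT, PAGE))
```

Because the rule is read from the domain's current robots.txt, it says nothing about who owned the domain when a page was archived, which is why a change of ownership can hide older material.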
