Internet Archive News

updates about archive.org

Want to help build a distributed web?

Isn’t the web distributed now? Not really. Let me illustrate: ever IM a friend who is near you, “Hey, wanna see a cool video? Check out this URL”? Then they download the same video you just downloaded, from the original server, even though it might be a long way away, rather than from your machine. This is slow, expensive, wasteful, and, well, dumb.

What if, with no browser or server config other than maybe downloading a plug-in:

  • All bigger files come from the folks near you or from the original server, whichever is faster?
  • The website still gets to keep download counts and keep its content up to date?
  • Websites get reduced bandwidth bills, and superstar user satisfaction because downloads are faster than YouTube?
  • Web users, even in remote countries, get that “I am sitting on a gig-e network in Palo Alto” feel?
  • Less money goes to monopoly phone companies?

Is this a real problem?  Yes:

  • The Internet Archive serves 2 million people each day. Egyptians and Japanese are two of our most popular user communities.
  • They download the same files over and over. There is someone with the file who is closer to them than we are.
  • The 20 gigabits/sec of bandwidth costs us a fortune.
  • Others want to serve video, but don’t because of the cost.
  • Others host on YouTube, Amazon, or archive.org but would rather not.

Would be great, right?   What it takes:

  •   A browser plug-in at first; eventually, get the browsers to do it natively.
  •   When a user clicks, the browser starts downloading from the site (the site then gets the download credit).
  •   The website serves a unique hash for the file and the length of the file in the header, and then serves the file as normal (archive.org and other sites do this already).
  •   The browser looks up the hashcode in a “trackerless p2p” system; I think BitTorrent can be used for this.
  •   If others have the file via p2p, the browser gets it from those users as well, so it is not slower than getting it from the website alone.
  •   After the browser downloads the file, it offers it to others via p2p.
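
To help debug the idea, here is a rough Python sketch of the steps above. It is only an illustration under stated assumptions, not a design: the header name `X-Content-Hash`, the `PeerIndex` class (an in-memory stand-in for a trackerless p2p lookup), and the `download` function are all made up for this example. It also simplifies by trying peers first and falling back to the site, rather than pulling from both at once.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Fetcher = Callable[[], bytes]  # anything that returns the file's bytes


def file_headers(data: bytes) -> Dict[str, str]:
    """Headers the site sends with the file. "X-Content-Hash" is a hypothetical name."""
    return {
        "X-Content-Hash": "sha1:" + hashlib.sha1(data).hexdigest(),
        "Content-Length": str(len(data)),
    }


class PeerIndex:
    """In-memory stand-in (an assumption) for a trackerless p2p hash lookup."""

    def __init__(self) -> None:
        self._peers: Dict[str, List[Fetcher]] = {}

    def announce(self, content_hash: str, fetch: Fetcher) -> None:
        # A peer offers the file it holds under this hash.
        self._peers.setdefault(content_hash, []).append(fetch)

    def lookup(self, content_hash: str) -> List[Fetcher]:
        return list(self._peers.get(content_hash, []))


def verify(data: bytes, headers: Dict[str, str]) -> bool:
    """Check the bytes against the hash and length the site published."""
    algo, _, digest = headers["X-Content-Hash"].partition(":")
    return (
        len(data) == int(headers["Content-Length"])
        and hashlib.new(algo, data).hexdigest() == digest
    )


def download(origin: Fetcher, headers: Dict[str, str], index: PeerIndex) -> Tuple[bytes, str]:
    """Fetch from a nearby peer if one has a valid copy, else from the site."""
    content_hash = headers["X-Content-Hash"]
    data, source = None, "origin"
    for peer_fetch in index.lookup(content_hash):
        candidate = peer_fetch()
        if verify(candidate, headers):  # never trust a peer blindly
            data, source = candidate, "peer"
            break
    if data is None:
        data = origin()
        if not verify(data, headers):
            raise ValueError("origin copy does not match its own headers")
    # After downloading, offer the file to others.
    index.announce(content_hash, lambda: data)
    return data, source
```

In this sketch the site is still contacted for the headers on every click, so it keeps its download count, and the published hash lets the browser reject a corrupted or malicious copy from a peer.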

What do we get?

  •   Less expense for website owners and operators, while keeping them in control and in the loop
  •   Faster and less expensive for users
  •   More sites taking control of their own stuff (don’t need to give your files to remote organizations)
  •   Being far from the server is not as much of a penalty

Who can help?

  •   people who can help debug the idea (and maybe it is already done…)
  •   browser plug-in programmers
  •   folks knowledgeable about super-distributed, trackerless p2p hashcode lookup
  •   the Internet Archive will seed all of its files for this system.
  •   we need enthusiasm, a cool logo/mascot, and coffee.

Please comment on this post as a first round to see if we can debug the idea and get critical mass.

-brewster

Originally posted on The Internet Archive Blog by brewster.
Written by internetarchive

February 15, 2012 at 5:50 pm

Posted in News