Internet Archive News

updates about archive.org

Archive for the ‘internet archive’ Category

Time travel through millions of historic Open Library images

The BBC has an article about Kalev Leetaru’s project to extract images from millions of Open Library pages.

You can read about how it works…

The Internet Archive had used an optical character recognition (OCR) program to analyse each of its 600 million scanned pages in order to convert the image of each word into searchable text. As part of the process, the software recognised which parts of a page were pictures in order to discard them.

Mr Leetaru’s code used this information to go back to the original scans, extract the regions the OCR program had ignored, and then save each one as a separate file in the Jpeg picture format. The software also copied the caption for each image and the text from the paragraphs immediately preceding and following it in the book. Each Jpeg and its associated text was then posted to a new Flickr page, allowing the public to hunt through the vast catalogue using the site’s search tool.

“I think one of the greatest things people will do is time travel through the images,” Mr Leetaru said.

… or just check out some of the results. Images plus citations plus metadata! We couldn’t be happier. Free to use with no restrictions.

Image from page 301 of "The New England magazine" (1887)

Image from page 788 of "St. Nicholas [serial]" (1873)

Image from page 210 of "Farmington, Connecticut, the village of beautiful homes" (1906)

Image from page 1121 of "The Saturday evening post" (1839)

Image from page 368 of "New England; a human interest geographical reader" (1917)

Image from page 902 of "Canadian grocer July-December 1896" (1889)

Image from page 249 of "Gleanings in bee culture" (1874)

Image from page 411 of "The Canadian druggist" (1889)
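
The process described in the BBC excerpt above can be sketched roughly as follows. This is a minimal illustration, not Mr Leetaru's actual code: the `Block` structure, its `kind` labels, and the rule that a caption is the block directly after a picture are all assumptions made for this example. It only shows the bookkeeping step of picking out the regions an OCR layout pass marked as pictures and gathering the surrounding text. Cropping the scan and saving the Jpeg are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One page region, as a hypothetical OCR layout pass might report it."""
    kind: str                        # assumed labels: "text", "caption", "picture"
    bbox: Tuple[int, int, int, int]  # left, top, right, bottom in pixels
    text: str = ""                   # recognised text; empty for pictures


@dataclass
class ExtractedImage:
    bbox: Tuple[int, int, int, int]  # region to crop from the original scan
    caption: str
    text_before: str
    text_after: str


def _nearest_text(blocks: List[Block], start: int, step: int) -> str:
    """Walk from `start` in direction `step` to the closest "text" block."""
    i = start + step
    while 0 <= i < len(blocks):
        if blocks[i].kind == "text":
            return blocks[i].text
        i += step
    return ""


def extract_images(blocks: List[Block]) -> List[ExtractedImage]:
    """Collect each picture region with its caption and neighbouring paragraphs."""
    results = []
    for i, block in enumerate(blocks):
        if block.kind != "picture":
            continue
        # Assumption: a caption, if present, is the block right after the picture.
        caption = ""
        if i + 1 < len(blocks) and blocks[i + 1].kind == "caption":
            caption = blocks[i + 1].text
        results.append(
            ExtractedImage(
                bbox=block.bbox,
                caption=caption,
                text_before=_nearest_text(blocks, i, -1),
                text_after=_nearest_text(blocks, i, +1),
            )
        )
    return results
```

Each `ExtractedImage` carries what the excerpt says was posted alongside every picture: the region to cut out, its caption, and the paragraphs on either side.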

Originally posted on The Open Library Blog by Jessamyn West.

Written by internetarchive

August 29, 2014 at 10:52 pm

Posted in internet archive

Wikimania London!

Internet Archive at Wikimedia

The Internet Archive had a booth at Wikimania in London. The booth was in the Community Village section of the conference. We hope you stopped by and said hello, grabbed a sticker or a handout, and learned a bit more about our book scanning projects and told us what you were up to. If you’d like to pick up digital copies of our handouts, PDFs are here.

We also went to a lot of programs that were really worthwhile. The free/open culture vibe was palpable and exciting, with 2500+ people all getting together to find ways to share more content in more ways. We also picked up a few other documents that might be interesting to other folks.

If you like working on Wikipedia but are often flustered by paywalls, you should know about the Wikipedia Library, which has a project to help editors access reliable sources. The Wikipedia Loves Libraries project is gearing up for a month of wiki-workshops and edit-a-thons at libraries around Open Access Week in October/November.

Originally posted on The Open Library Blog by Jessamyn West.

Written by internetarchive

August 12, 2014 at 8:09 pm

Posted in internet archive

Bitcoin and the Internet Archive Swag Store

San Francisco Weekly said we are the best Bitcoin Evangelists in their Best Of section. Fun.

We now accept bitcoin at our Archive swag store. We continue to offer bitcoins to our employees as salary, eat sushi for bitcoin next door, supported bitcoin as well as we could at our credit union, have a cool honor-based bitcoin ATM (please come and use it), accept bitcoin at movies, and graciously accept bitcoins as donations to keep our servers humming. (We get a few bits every day, thank you!)

Go Bitcoin!

Sushi for Bitcoins

Originally posted on The Internet Archive Blog by brewster.

Written by internetarchive

May 8, 2014 at 4:25 pm

Posted in internet archive

Announcing: A Brave New Feature for TV News V2.1

The new TV News Archive, launched just over one month ago, was updated today with the addition of a super new feature: Search Inside shows.

It sounds simple enough for those familiar with the ubiquitous keyboard shortcut Ctrl+F…but it turns out that’s actually only 10% of you! So why use this feature when you’re browsing the TV News Archive of 500,000+ US TV News Shows? Several reasons:

1) More Better Context – The TV News search inside feature enables users to discover a word or combination of words within a show by highlighting the desired term in every segment where it occurs in a show. Furthermore, for every 1-minute segment where a term occurs, all accompanying closed captioning text is surfaced!
2) Less Background Noise – Columns of 1-minute segments that don’t contain a “search inside” term collapse so you can find exactly what you need faster.
3) Remedies the “Refer Problem” – About 80% of the time a user is referred to a TV News show page from a third party search engine, the user’s original search term doesn’t carry over. In other words, you land on a show page with zero terms highlighted, and that’s annoying. While we can’t exactly solve this problem, we can prescribe medication for the pain: “search inside.”
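
The behaviour described in points 1 and 2 can be sketched roughly like this. It is only an illustration, not the TV News Archive's actual implementation: it assumes a show is a plain list of one-minute caption strings, and the function name `search_inside`, the `**term**` highlight markers and the result layout are all made up for the example.

```python
import re
from typing import Dict, List


def search_inside(segments: List[str], term: str) -> List[Dict[str, object]]:
    """Highlight `term` in each one-minute segment and collapse the rest.

    `segments` is an assumed representation: one closed-caption string
    per minute of the show, in order.
    """
    if not term:
        raise ValueError("search term must not be empty")

    # Case-insensitive literal match; re.escape keeps punctuation literal.
    pattern = re.compile(re.escape(term), re.IGNORECASE)

    results = []
    for minute, captions in enumerate(segments):
        # subn returns the rewritten text plus the number of matches found.
        highlighted, hits = pattern.subn(lambda m: f"**{m.group(0)}**", captions)
        results.append(
            {
                "minute": minute,
                "collapsed": hits == 0,                 # no match: hide this column
                "text": highlighted if hits else "",    # match: surface all captions
            }
        )
    return results
```

Segments that contain the term keep their full caption text with the term marked, while the others are flagged as collapsed so a page could hide them.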

So now you know; go try it out for yourself! Here are just a couple of amazing projects made possible by TV News. Get inspired and show us how this tool helps you.

Why Cable TV Is Dying and Twitter is Winning | André-Pierre du Plessis, Columbia Graduate School of Journalism

Tiny Numbers | Bodo Winter, UC Merced Cognitive Sciences 

— the  team

Originally posted on The Internet Archive Blog by kristen.

Written by internetarchive

May 5, 2014 at 6:49 pm

Posted in internet archive

Introducing the New TV News Archive

Happy April Fool’s Day! We couldn’t think of a better day to launch the fully redesigned TV News Archive.

This research library, originally released in September 2012, is a free service provided as a way to enhance the capabilities of journalists, scholars, teachers, librarians, civic organizations and other engaged citizens. It repurposes closed captioning to enable users to search, quote and borrow from the Internet Archive’s collection of 500,000+ US TV news broadcasts aired since 2009.

The new interface has been designed to give users better access to this collection, and to provide new tools that enable users to share short clips from any broadcast and track play and share statistics of those clips over time.

Here’s a quick overview of the site’s features; we hope they serve you well.

 

Search transcripts of US TV news shows aired since 2009

  • Search with topical terms to return shows with corresponding transcripts. Remember, you are searching the words spoken in the show.
  • Use the advanced search tool (click its icon) to specify a network or show name, or sort your search results.
  • Refer to the “info” panel throughout the site for details about your search results, related topics and other stats.

Scan and view show segments

  • Shows are presented in 60-second segments, each with a video and corresponding transcript text.
  • Scroll left and right to scan through segments of a show; search terms are highlighted in transcript text.
  • To search within a show's transcript text, try Ctrl + F (Cmd + F on Mac) to search inside the page. (Scrollable transcripts are coming soon!)

 

Share and embed short clips (aka quotes) from a show

  • Shareable quotes are limited to 60 seconds. Refine your quote selection by clicking the “Edit” button and dragging the handles.
  • Click a social media button (or click the embed button twice) to finalize and share your quote.
  • Your quote will be assigned a permalink. You can always come back to see it!

 

Track popularity of show quotes shared over time

  • Quotes with a unique start and stop time within a show will be tracked to see how often they are re-shared or played.
  • View a specific quote by saving or sharing its unique permalink, or you can browse quotes from shows on the TV News Archive site by looking for the corresponding icon.

 

Borrow full shows on DVD

  • Borrow shows (click the borrow icon on any show detail page) from the Internet Archive library on a DVD-ROM for 30 days for a $25 processing fee.
  • Internet Archive does not sell or license this content. Please note that this is a copyrighted work and performance, copying, or sale, whether or not for profit, by the recipient is not authorized.

 

 

Originally posted on The Internet Archive Blog by kristen.

Written by internetarchive

April 1, 2014 at 12:45 pm

Posted in internet archive

Magazine in movie “WarGames” is discovered using an Internet Archive Collection

 

 

An intrepid researcher wanted to figure out what magazine was used in the movie WarGames and, using the Internet Archive collection, found it was Creative Computing (which was a key magazine for me in the ’70s, when I sold personal computers during the pre-Apple ][, kit days).

Reading the gory details of this hunt is fun: http://mw.rat.bz/wgmag/

 

Originally posted on The Internet Archive Blog by brewster.

Written by internetarchive

December 19, 2013 at 5:51 pm

Posted in internet archive

Birthday of the Defensive Patent License: Friday, Nov 15, 4:30-8:00 in SF

 

Please join us to celebrate the birthday of the
Defensive Patent License (“DPL”)!

Short Program and Birthday Party
Friday, November 15, 2013

Panel Discussion 4:30-6:00 PM
Reception 6:00-8:00 PM

Internet Archive
300 Funston Ave, San Francisco, 94118

Click here for more information

Click here to RSVP

DPL Launch Conference
Friday, February 28, 2014

Brower Center
2150 Allston Way, Berkeley, 94704

Originally posted on The Internet Archive Blog by brewster.

Written by internetarchive

November 13, 2013 at 8:01 pm

Posted in internet archive