Smithsonian Institution Archives

The Bigger Picture: Visual Archives and the Smithsonian

Google Halts an Archiving Project

by Marvin Heiferman on June 8, 2011

Newspapers, by Quinn Cowper, Creative Commons: BY-NC-ND 2.0.

On May 20th, a flurry of reports took note of Google’s decision to halt its ambitious efforts to digitize the contents of newspaper archives and make them available online at no cost. Walking away from a program it began in 2006 to make two hundred years’ worth of articles public and searchable, and after scanning millions of pages from over two thousand currently operating and defunct newspapers, Google signaled it was switching gears to focus on helping publishers monetize their deep wells of content. The company issued a statement saying that while users could still search previously digitized content, “we don’t plan to introduce any further features or functionality to the Google News Archives and we are no longer accepting new microfilm or digital files for processing.” Instead, Google will harness its corporate and technological heft to support OnePass, a payment platform instituted back in February. According to the Boston Phoenix, an alternative newspaper and News Archive project partner, and as reported by Search Engine Land, Google said it would be concentrating on “newer projects that help the industry” and on a platform that “enables publishers to sell content and subscriptions directly from their own sites.”

Example of how newspapers are scanned, here on a MediaScan 880C duplex Newspaper Scanner, Courtesy of Newspaper Scanning Systems YouTube channel.

Google maintained that it wasn’t abandoning its earlier goal because of copyright issues, although those certainly generated controversy and threw a monkey wrench into, for example, Google’s earlier and equally ambitious plan to scan the contents of books in library collections worldwide. The UK’s The Guardian has also suggested that this turn of events may have been prompted by Apple’s rival payment plans with a number of newspapers.

An article posted online by The Atlantic quotes Carly Carioli, an editor at the alternative weekly The Boston Phoenix, who explained:

“News Archive was generally a good deal for newspapers—especially smaller ones like ours, who couldn't afford the tens or hundreds of thousands of dollars it would have cost to digitally scan and index our archives—and a decent bet for Google. It threaded a loophole for newspapers, who, in putting pre-internet archives online, generally would have had to sort out tricky rights issues with freelancers—but were thought to have escaped those obligations due to the method with which Google posted the archives. (Instead of posting the articles as pure text, Google posted searchable image files of the actual newspaper pages.) Google reportedly used its Maps technology to decipher the scrawl of ancient newsprint and microfilm; but newspapers are infamously more difficult to index than books, thanks to layout complexities such as columns and jumps, which require humans or intense algorithmic juju to decode. Here's two wild guesses: the process may have turned out to be harder than Google anticipated. Or it may have turned out that the resulting pages drew far fewer eyeballs than anyone expected.”

Or, the decision might be seen as part of the ongoing issue that publishers, museums, and educational institutions are all now facing: how to balance a mandate or desire to make content available against the costs of actually doing that.

Categories: What Gets Saved
Tags: Web/Tech, Archive, Digitization

Comments (3)

I'm sad to hear that Google is giving up on this effort. Google usually does things that other companies see as economically infeasible, and they succeed at it. Maybe once archive.org scans all the books in existence, they'll move on to newspapers.

Eric Peters June 8, 2011 at 10:44 am

It truly is difficult to balance the desire to make content available against the costs of actually doing it. It is a task that must be done by somebody. But who? I would suggest that we ask the IMF to fund the project or take money away from other projects like underground CO2 storage or residential area wind farms.

Russell Davison June 8, 2011 at 10:17 am

Even as digital storage options become more plentiful and less expensive, the costs incurred in collecting, processing, scanning and tagging materials may still create financial hurdles. Does that mean that what gets saved is destined to be whatever pays for itself, makes a profit, or strikes a benefactor's specific fancy?

Marvin Heiferman June 8, 2011 at 4:04 pm

Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.
