We're looking for a programmer to integrate more book records into the site. The work involves analyzing, merging, and manipulating large datasets so they can be shared with a wide audience.
The Internet Archive is a non-profit digital library committed to preserving the world's digital cultural artifacts. Used by over 6 million people, this resource is becoming part of how the Internet works. Our job is to put the best humanity has to offer within reach of students, educators and the general public. Find out more about our organization and web archive at www.archive.org.
Open Library is an open source software project started by the Internet Archive to build a site with one web page for every book ever published. The site uses a new type of Semantic Wiki that preserves the structured data that already exists for books. Leveraging millions of library and publisher bibliographic records, we have already created a technology demo, available at http://demo.openlibrary.org, and we're looking for a data importer to help us grow the site to the next level.
You will help the current team of programmers import data in MARC, ONIX, and other formats, crawl and parse information from the web, and integrate and deduplicate the records we receive from different sources.
REQUIREMENTS:
We are located in the Presidio of San Francisco, with parking and public transportation available.
The Internet Archive is an equal opportunity employer. We provide medical and dental benefits. Please send your resume and cover letter to ntannenbaum@archive.org with the subject line “Programmer- Semantic Web”. The Internet Archive thanks all applicants for their interest but advises that only those selected for an interview will be contacted. No phone calls, please.