Making a digital library

Abstract
The CORE (Chemical Online Retrieval Experiment) project is a library of primary journal articles in chemistry. Any library has an inside and an outside; in this article we describe the inside of the library and the methods for building the system and accumulating the database. A later article will describe the outside (user experiences). Among electronic-library projects, the CORE project is unusual in that it has both ASCII derived from typesetting and image data for all its pages, and among experimental electronic-library projects, it is unusually large. We describe here (a) the processes of scanning and analyzing about 400,000 pages of primary journal material, (b) the conversion of a similar amount of textual database material, (c) the linking of these two data sources, and (d) the indexing of the text material.

This publication has 8 references indexed in Scilit: