Launching in autumn 2011, the British Newspaper Archive will make millions of pages of historical newspapers available online for the first time – unlocking a treasure trove of material for historians, researchers, genealogists, students and anyone interested in when, where and how our ancestors lived, or in key periods of history.
The project is a partnership between the British Library and brightsolid online publishing, and aims to digitise up to 40 million pages from the UK national newspaper collection over the next ten years.
The text contained within the scanned newspapers will be fully searchable online by date, title and keywords – transforming access to material previously only available in print or on microfilm.
Scanning is undertaken using five Zeutschel A0 scanners, which create very high quality digital images at 400 dpi in 24-bit colour. Some of the newspapers already scanned have produced single-page image files as large as 400 MB! This is due to the very large physical size of the original newspaper pages, particularly around the turn of the nineteenth century.
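To see how a single page can reach that size, here is a rough back-of-the-envelope calculation in Python. It assumes the 400 MB figure refers to the uncompressed image, and the 24 x 35 inch page size is purely an illustrative guess at a large broadsheet, not a measurement from the collection.

```python
def uncompressed_size_bytes(width_in, height_in, dpi=400, bits_per_pixel=24):
    """Estimate the raw (uncompressed) size of a scanned page in bytes."""
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical page dimensions in inches (an assumption for illustration only).
size = uncompressed_size_bytes(24, 35)
print(f"{size / 1e6:.0f} MB")  # prints "403 MB"
```

At 400 dpi each square inch holds 160,000 pixels, and 24-bit colour needs three bytes per pixel, so every square inch of newsprint costs roughly 0.48 MB before any compression.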
The scanned page images are then converted to JPEG 2000 format for archiving. The image files are also run through an optical character recognition (OCR) process, which creates the electronic text. This process involves segmenting each page into classified zones to aid searching. Finally, the OCR output is indexed in a large database and presented to users via the new website, due to launch later this year.
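As a loose illustration of the indexing step, the Python sketch below builds a tiny in-memory inverted index over OCR text zones and searches it by keyword, title and date. It is not the archive's actual system: the `Zone` fields, the zone types, the function names and the sample data are all hypothetical, and a production service would use a dedicated search database rather than a dictionary.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """One classified zone of OCR text from a page (hypothetical schema)."""
    title: str      # newspaper title
    date: str       # issue date, ISO format
    page: int
    zone_type: str  # e.g. "article" or "advertisement" (assumed categories)
    text: str


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(zones):
    """Map each word to the set of zone positions that contain it."""
    index = defaultdict(set)
    for i, zone in enumerate(zones):
        for word in tokenize(zone.text):
            index[word].add(i)
    return index


def search(index, zones, query, title=None, date=None):
    """Return zones containing every query word, optionally filtered."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return [
        zones[i]
        for i in sorted(hits)
        if (title is None or zones[i].title == title)
        and (date is None or zones[i].date == date)
    ]


# Invented sample data, for illustration only.
zones = [
    Zone("Example Gazette", "1851-05-01", 1, "article", "Great Exhibition opens in London"),
    Zone("Example Gazette", "1851-05-01", 4, "advertisement", "Fine teas sold in London"),
    Zone("Sample Mercury", "1851-05-02", 2, "article", "Crowds attend the Exhibition"),
]
index = build_index(zones)
results = search(index, zones, "exhibition", title="Example Gazette")
print([(z.page, z.zone_type) for z in results])  # prints [(1, 'article')]
```

Indexing zones rather than whole pages means a match can point to the specific article or advertisement that contains the words, which is one plausible reason for segmenting pages before indexing.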
More than 1 million pages of pre-1900 newspapers will be available at launch, building to 4 million digitised pages over the next two years.
Register now to be the first to know when the newspapers will be available to search online: www.britishnewspaperarchive.co.uk