Delpher magazines: KB dataset
Which magazines used to be illustrated and which were not? What did the first sports magazines write about? Which recipes were popular in Dutch magazines in the past? Our Delpher magazine dataset comprises over 400,000 digitized magazines from 1800 to 2000. Some of the files can be freely accessed and downloaded.
What is in Delpher magazines?
What will you find in the Delpher magazine dataset? And how do you use the dataset? This dataset is a great resource for research into historical magazines. But it is also useful for broader-based history and culture projects. The dataset gives a peek behind the scenes of what was going on in society and academia between 1800 and 2000.
The collection consists of over 400,000 magazines from the 19th and 20th centuries. In the dataset, you will find address books, almanacs, sports journals, early scientific magazines, public-interest magazines, youth magazines and much more. The dataset consists of scans of the printed pages, together with OCR text (which still contains recognition errors) and word coordinates that record where each word is printed on the page. There is a searchable version of every magazine.
Descriptive and structural metadata are also available. Descriptive metadata contain bibliographic details, such as the author, title or date of the edition. Structural metadata provide information about the structure of the file, such as the number of pages, type area and paragraphs. Magazines are added to the dataset on a regular basis.
How is the information presented?
The following files are available for every magazine edition:
- descriptive metadata (Dublin Core in XML)
- structural metadata (MPEG21-DIDL)
- document (PDF)
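As a rough illustration of working with the descriptive metadata, the sketch below collects Dublin Core fields from an XML record using Python's standard library. The sample record is invented for this example, and the wrapper element `record` is an assumption; real Delpher records may be nested inside other structures and may carry additional namespaces. Only the Dublin Core element namespace itself is standard.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record for illustration only; the <record> wrapper and the
# field values are assumptions, not actual Delpher data.
SAMPLE_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Voorbeeldtijdschrift</dc:title>
  <dc:date>1895-03-01</dc:date>
  <dc:language>nl</dc:language>
</record>
"""


def dublin_core_fields(xml_text):
    """Return a dict mapping each Dublin Core element name to a list of values."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    fields = {}
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace}localname".
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


if __name__ == "__main__":
    for name, values in dublin_core_fields(SAMPLE_RECORD).items():
        print(name, values)
```

Values are kept in lists because a Dublin Core element may be repeated within one record, for example when an edition has several creators.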
The following files are available for every page that has been scanned:
- the image (JPEG 2000)
- the text (OCR in XML)
- the coordinates of every word on a page (ALTO)
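To show how the word coordinates might be used, here is a minimal sketch that reads the `String` elements of an ALTO file and returns each word with its position. The attributes `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` come from the ALTO schema; the sample document is invented, and the ALTO version, namespace and measurement unit of the actual Delpher files are assumptions that should be checked against a real file.

```python
import xml.etree.ElementTree as ET

# Invented sample for illustration; real Delpher ALTO files may use a
# different ALTO version, namespace, or measurement unit.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Delpher" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="50"/>
            <SP/>
            <String CONTENT="tijdschriften" HPOS="420" VPOS="200" WIDTH="500" HEIGHT="50"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag):
    """Strip the "{namespace}" prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def extract_words(alto_xml):
    """Return one dict per word with its text and bounding box."""
    root = ET.fromstring(alto_xml)
    words = []
    for element in root.iter():
        if local_name(element.tag) != "String":
            continue
        words.append({
            "text": element.get("CONTENT", ""),
            "x": float(element.get("HPOS", 0)),
            "y": float(element.get("VPOS", 0)),
            "width": float(element.get("WIDTH", 0)),
            "height": float(element.get("HEIGHT", 0)),
        })
    return words


if __name__ == "__main__":
    for word in extract_words(SAMPLE_ALTO):
        print(word)
```

Matching on the local element name rather than a fixed namespace is a deliberate choice here, since the namespace differs between ALTO versions.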
Conditions for re-use
The data in Delpher magazines is partially accessible to all. The KB wants to make as much information as possible freely available to all, but this is not possible for magazines that are still protected by copyright.
Depending on the copyright status, use of the magazine collection falls under one of two regimes. Magazines that were first published more than 140 years ago belong to the public domain. They are no longer subject to copyright. More recent magazines are sometimes still protected by copyright, but academics, researchers, lecturers or journalists can ask to use them for research purposes.
There are two APIs: a metadata harvest API on the basis of OAI-PMH, and a search API on the basis of SRU. Manuals for these APIs can be supplied once legal access has been granted via @email. Please note: users must have some experience of programming.
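For readers unfamiliar with the two protocols, the sketch below builds request URLs of the kind OAI-PMH and SRU expect. The parameter names (`verb`, `metadataPrefix`, `set`, `resumptionToken` for OAI-PMH; `operation`, `version`, `query`, `startRecord`, `maximumRecords` for SRU) come from the protocol specifications. The base URLs, the set name and the query are hypothetical placeholders; the actual endpoints and supported values are described in the manuals supplied by the KB.

```python
from urllib.parse import urlencode

# Hypothetical base URLs for illustration; the real endpoints are given in
# the API manuals.
OAI_BASE = "https://example.org/oai"
SRU_BASE = "https://example.org/sru"


def oai_list_records_url(metadata_prefix, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    Per the OAI-PMH specification, resumptionToken is an exclusive argument:
    when it is given, no other argument besides the verb is sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return OAI_BASE + "?" + urlencode(params)


def sru_search_url(cql_query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",  # assumed SRU version; check the manual
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return SRU_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # "magazines" is a hypothetical set name; "oai_dc" is the Dublin Core
    # prefix that every OAI-PMH repository must support.
    print(oai_list_records_url("oai_dc", set_spec="magazines"))
    print(sru_search_url("recept AND taart", maximum_records=5))
```

A harvest would typically start with the first form of the OAI-PMH URL and then repeat the request with each returned resumption token until none is returned; the SRU URL is paged instead by increasing `startRecord`.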
We can sometimes offer customised solutions. Ask your question via @email.
Contact and feedback
We are interested to know who uses our magazines, and how they are used. So please send an e-mail with your contact details and a brief explanation of what you intend to do with the data to @email. Feedback is always welcome. If you provide us with your personal details, we will keep you informed about any relevant developments, such as changes to the dataset or the release of new datasets.