Digital Library for Dutch Literature (DBNL) dataset
Which novels were popular in the past? How were male and female characters portrayed? Can you measure literature? The DBNL dataset has over 5 million digitised pages of Dutch language and literature. Some of the files can be freely accessed and downloaded.
What can you find in the DBNL dataset?
The DBNL dataset has a wealth of material on Dutch and Flemish language and literature dating from the Middle Ages to the present day. It also contains examples of Limburg, Frisian, Surinamese and South African literature.
The dataset comprises digitised texts, manually corrected by an editor, with the relevant metadata. The collection includes Middle Dutch literature, as well as classic novels. In addition, the dataset covers Dutch literary magazines, such as De Gids and De Revisor.
How is the information presented?
The following files are available for each book or magazine volume in DBNL:
- the full text (php)
- a TEI XML file
- a .txt file
Often also available:
- a searchable PDF
- an epub
- a PDF with scans of the book or magazine
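As an illustration of working with the TEI XML files, the minimal Python sketch below pulls the running text out of a TEI document. It assumes only that the body of the work sits inside a text element, as is usual in TEI. The function names local_name and tei_plain_text are hypothetical, and the exact markup used in DBNL files should be checked against a real download.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def tei_plain_text(xml_string):
    """Return the text inside the first <text> element, whitespace-normalised.

    Assumption: the document follows the usual TEI layout, with a header
    followed by a <text> element. Matching on the local name means the
    sketch works with or without an XML namespace.
    """
    root = ET.fromstring(xml_string)
    for element in root.iter():
        if isinstance(element.tag, str) and local_name(element.tag) == "text":
            chunks = (chunk.strip() for chunk in element.itertext())
            return " ".join(chunk for chunk in chunks if chunk)
    return ""
```

Matching on the local tag name rather than a fixed namespace is a deliberate choice here, because it is not stated which TEI version the files follow.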
For texts that are definitely no longer protected by copyright (the "public domain"), zip files containing the XML and .txt files are available, together with a CSV file of metadata. More recent material cannot be offered as a bulk dataset because of copyright, but can be downloaded per title as an XML or .txt file.
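A downloaded zip file and the metadata CSV could be read along the following lines. This is a sketch under stated assumptions: the CSV is taken to be comma-separated with a header row, the .txt files are taken to be UTF-8 encoded, and the function names read_metadata and iter_txt_files are hypothetical. No column names are assumed, so inspect the header of the real file before relying on specific fields.

```python
import csv
import io
import zipfile


def read_metadata(csv_text):
    """Parse metadata CSV text into a list of dicts keyed by the header row.

    Assumption: comma-separated values with a header line.
    """
    return list(csv.DictReader(io.StringIO(csv_text)))


def iter_txt_files(zip_source, encoding="utf-8"):
    """Yield (file name, text) for every .txt member of a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumption: the text files are UTF-8 encoded.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(".txt"):
                yield name, archive.read(name).decode(encoding)
```

Reading members straight from the archive avoids unpacking everything to disk, which helps when only part of a large collection is needed.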
Conditions for re-use
The data in the DBNL dataset are partially accessible to all. The KB wants to make as much information as possible freely available to all. This is not possible for texts that are protected by copyright.
Depending on the copyright status, use of this dataset falls under one of two regimes. Work by a maker who has been dead for over 70 years is in the public domain and is freely available to everyone. Many of these publications are available as zip files. Other work is still covered by copyright, but can be used for research purposes on request by, for example, academics, researchers, lecturers or journalists. You can apply for access via @email.
We can sometimes offer customised solutions. Ask your question via @email.
Contact and feedback
We are interested to know who uses our texts, and how they are used. So please send an e-mail with your contact details and a brief explanation of what you intend to do with the data to @email. Feedback is always welcome. If you provide us with your personal details, we will keep you informed about any relevant developments, such as changes to the dataset or the release of new datasets.
