Digital Library for Dutch Literature (DBNL) dataset

Which novels were popular in the past? How were male and female characters portrayed? Can you measure literature? The DBNL dataset has over 5 million digitised pages of Dutch language and literature. Some of the files can be freely accessed and downloaded.

What can you find in the DBNL dataset?

The DBNL dataset has a wealth of material on Dutch and Flemish language and literature dating from the Middle Ages to the present day. It also contains examples of Limburg, Frisian, Surinamese and South African literature.

The dataset comprises digitised texts, manually corrected by an editor, together with the relevant metadata. The collection includes Middle Dutch literature as well as classic novels. In addition, the dataset covers Dutch literary magazines, such as De Gids and De Revisor.

How is the information presented?

For every book or magazine, there is:

  • the (corrected) text (TEI XML)
  • a searchable PDF (based on the XML)

Sometimes, there is also:

  • an ePub
  • a PDF of the original scan of the work
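
To give an idea of how the TEI XML text could be processed, here is a minimal Python sketch that collects the paragraphs of a document. It assumes the standard TEI elements text and p; the sample document is invented for illustration and is not taken from the DBNL. Because we do not state here whether the files declare an XML namespace, the code ignores namespaces altogether.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def tei_paragraphs(xml_source: str) -> list[str]:
    """Return the plain text of every <p> inside the first TEI <text> element."""
    root = ET.fromstring(xml_source)
    paragraphs = []
    for element in root.iter():
        if local_name(element.tag) != "text":
            continue
        for child in element.iter():
            if local_name(child.tag) == "p":
                # itertext() also collects text nested in inline markup such as <hi>.
                content = " ".join("".join(child.itertext()).split())
                if content:
                    paragraphs.append(content)
        break  # only the first <text> element is processed
    return paragraphs


# Hypothetical miniature document, for illustration only.
SAMPLE = """<TEI.2>
  <teiHeader><fileDesc><titleStmt><title>Voorbeeld</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <p>Eerste <hi rend="i">alinea</hi>.</p>
    <p>Tweede alinea.</p>
  </body></text>
</TEI.2>"""

print(tei_paragraphs(SAMPLE))
```

For a downloaded file, the same function could be given the file's contents as a string. Real documents are likely to need extra handling, for example for headings, verse lines or notes.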

A ZIP file containing TXT files, plus a CSV file with metadata, is available for texts that have been automatically checked to confirm that they are no longer protected by copyright.
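
As an illustration of how such a download might be combined, the Python sketch below reads the text files from a ZIP archive and links them to rows of a metadata table. The file names, the column names (text_id, title, year), the UTF-8 encoding and the idea that each file name matches an identifier in the CSV are all assumptions made for this example; check them against the files you actually receive.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def read_texts(zip_file) -> dict[str, str]:
    """Map each .txt member's file stem to its decoded contents."""
    texts = {}
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            if path.suffix.lower() == ".txt":
                # UTF-8 is an assumption; adjust if the real files differ.
                texts[path.stem] = archive.read(name).decode("utf-8")
    return texts


def read_metadata(csv_file, id_column: str) -> dict[str, dict[str, str]]:
    """Index the metadata rows by the (assumed) identifier column."""
    return {row[id_column]: row for row in csv.DictReader(csv_file)}


# Build a tiny stand-in archive and metadata table in memory (hypothetical data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example01.txt", "Het was een heldere dag.")
    archive.writestr("example02.txt", "Er was eens een koning.")
buffer.seek(0)
metadata_csv = io.StringIO(
    "text_id,title,year\n"
    "example01,Voorbeeld een,1850\n"
    "example02,Voorbeeld twee,1872\n"
)

texts = read_texts(buffer)
metadata = read_metadata(metadata_csv, id_column="text_id")

# Print one summary line per text: identifier, title, year and a rough word count.
for text_id, text in sorted(texts.items()):
    row = metadata.get(text_id, {})
    print(text_id, row.get("title"), row.get("year"), len(text.split()), "words")
```

With real data, read_texts could be given the path of the downloaded ZIP file and read_metadata an opened CSV file instead of the in-memory stand-ins.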

Conditions for re-use

The data in the DBNL dataset are partially accessible to all. The KB wants to make as much information as possible freely available to everyone. This is not possible for texts that are protected by copyright.

Depending on the copyright status, use of this dataset falls under one of two regimes. Work by a maker who has been dead for over 70 years is in the public domain. This work is freely available to everyone, and many of these publications are available as ZIP files. Other work is still protected by copyright, but can be used for research purposes on request by, for example, academics, researchers, lecturers or journalists. You can apply for access via @email.

We can sometimes offer customised solutions. Ask your question via @email.

Contact and feedback

We are interested to know who uses our texts, and how they are used. So please send an e-mail with your contact details and a brief explanation of what you intend to do with the data to @email. Feedback is always welcome. If you provide us with your personal details, we will keep you informed about any relevant developments, such as changes to the dataset or the release of new datasets.