Creating and Integrating a Database – Work in Progress
*This blog was originally published on The Recipes Project on 14 July 2016*
By Marieke Hendriksen
As mentioned in the post introducing the ARTECHNE project at Utrecht University last month, we are in the process of creating a database containing recipes, artist handbooks, and art theoretical texts that can clarify the development of the use of the term ‘technique’, as well as related terms referring to processes of making and doing. The database is linked to Geographical Information Software (GIS), thus creating an online historical semantic map of ‘technique’. Such maps are more than merely nice illustrations; they can reveal connections that remain hidden otherwise.
We are in the fortunate position of being able to build the database on an existing one, namely the excellent Colour ConText database. However, we do not simply want to offer a copy of the Colour ConText database; we want to integrate it with other sources on artisanal practices and theories, such as recipes, books of secrets, art theoretical texts, and artist handbooks, from the period 1500-1900, in Latin, Dutch, German, English, French, Italian and Spanish. The entire database has to be searchable with advanced searches, allowing users to search the full text for occurrences of terms in particular geographical or linguistic areas and periods, or to trace the changing uses and meanings of particular terms over time. To make this possible, relational tables and glossaries will link the various forms of a term, as well as different terms with a similar meaning.
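To give a rough idea of what linking the various forms of a term could look like in practice, here is a minimal Python sketch. It is not the ARTECHNE implementation: the glossary entries, the function name `find_term_variants`, and the idea of storing the glossary as a simple dictionary are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical glossary: one concept mapped to a few variant forms.
# The entries are illustrative only, not taken from the ARTECHNE database.
GLOSSARY = {
    "technique": ["technique", "techniek", "technik", "technica"],
}

def find_term_variants(text, glossary):
    """Return {concept: [variant forms found in text, in order of appearance]}."""
    hits = defaultdict(list)
    for concept, variants in glossary.items():
        # Try longer variants first so a short form never shadows a longer one.
        ordered = sorted(variants, key=len, reverse=True)
        pattern = r"\b(" + "|".join(re.escape(v) for v in ordered) + r")\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits[concept].append(match.group(1).lower())
    return dict(hits)
```

Calling `find_term_variants` on a passage that mixes Dutch and French would group both spellings under the same concept, which is the kind of link a glossary table is meant to provide. A real system would also need to handle inflected forms and historical spelling variation, which this sketch ignores.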
For example, if we want to know how instructions for making paint changed in the Low Countries between 1600 and 1900, we want to be able to select all available sources from that period and area and search them for imperatives, nouns, measurements, and particular ingredients. We also want to be able to distinguish between printed and manuscript sources, and between artist handbooks and household recipe books, and ideally we want to be able to search annotations and marginalia too, as well as differences between various editions of the same work. That means a lot of different parameters have to be specified for each source, and everything we include must be carefully checked before it is added.
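The kind of query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`year`, `region`, `medium`, `genre`), the `Source` class, and the two placeholder records are invented for the example and do not reflect the actual database schema or its contents.

```python
from dataclasses import dataclass

@dataclass
class Source:
    title: str
    year: int
    region: str
    medium: str  # "print" or "manuscript"
    genre: str   # e.g. "artist handbook" or "household recipe book"
    text: str

def select_sources(sources, region, start, end, medium=None):
    """Keep sources from one region and period, optionally one medium."""
    return [
        s for s in sources
        if s.region == region
        and start <= s.year <= end
        and (medium is None or s.medium == medium)
    ]

def search(sources, term):
    """Return titles of sources whose full text contains the term."""
    needle = term.lower()
    return [s.title for s in sources if needle in s.text.lower()]

# Placeholder records, invented purely for this example.
SAMPLE = [
    Source("Example handbook A", 1650, "Low Countries", "print",
           "artist handbook", "Take two ounces of lead white and grind it."),
    Source("Example recipe book B", 1720, "Low Countries", "manuscript",
           "household recipe book", "Boil the linseed oil slowly."),
    Source("Example handbook C", 1580, "Italy", "print",
           "artist handbook", "Grind the lead white with oil."),
]

selected = select_sources(SAMPLE, "Low Countries", 1600, 1900)
print(search(selected, "lead white"))
```

Each extra parameter we want to filter on, such as edition or the presence of marginalia, becomes another field that has to be filled in and checked for every source, which is why adding material takes time.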
One of the problems we encountered when selecting the first sources we wanted to add was the low reliability of Optical Character Recognition (OCR) software when used on early modern printed sources – a problem I have written about on my own blog before. We do not just want to combine existing digitized sources in our database, but also to add texts that have not been digitized and made searchable thus far. However, that often involves lengthy correction processes. One of the solutions we are considering is crowdsourcing such corrections. Fortunately, we can also add many digitized sources that are available under Creative Commons licenses (e.g. from the Digital Library of the Netherlands), and various researchers have contacted us to discuss adding their own datasets to the database.
The first edition of the database is now online – have a look at http://artechne.hum.uu.nl! Please note that this is a first version, and that the database will be improved and continue to grow for the duration of the project. Over the next few months, we will focus on increasing searchability and adding new sources. If you have any questions or suggestions, please contact us at artechne[at]uu.nl.
Finally, we will also explore options for maintaining the database after the end of the ARTECHNE project, for example by depositing the datasets or even the entire infrastructure with an archival institution.
The ERC ARTECHNE project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 648718) and is a collaboration between Utrecht University and the University of Amsterdam.