Documentation and metadata

Documentation and metadata help ensure that your work is smooth and organized.

An essential part of data management is documentation, i.e. keeping track of what you are doing. You can do this, for example, by writing a research diary or by using an Excel spreadsheet in which you record the stages of data processing. Documentation is thus a process that makes the data understandable and usable.

And what is metadata? Imagine your data as a closed package, the contents of which you do not know. Metadata is like a sticker on top of a package that tells you the contents of the package. 

  • In research, metadata refers to basic descriptive information about the data, compiled in a human- and machine-readable format. It makes the data identifiable and findable if the data is published online.
  • In the thesis process, metadata is usually not published, so it can be thought of as a table of contents for the data, which you can create to keep your files and folders organized. For example, you might have two collections of photos. What basic information do you need to list about them in order to differentiate them and be able to work smoothly?

Documentation

  • Scientific knowledge production requires that the research process is documented in such detail that it would be possible to repeat the research design afterwards in the same form. In this way, the reliability of the results can be verified.
  • Take care from the beginning of your research process to make sure that you are able to describe it accurately enough in your thesis.
  • The documentation requirement also applies to data – it must be possible to describe the creation, structure and processing of data in a form that is understandable to others.
  • For documentation related to the data, you can use, for example, a formal research diary or an Excel spreadsheet that describes
    • the different parts that the data consists of, and
    • key information about the components of the data (e.g. number of interviews, date of interview, specific themes, etc.)
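
A spreadsheet of this kind can also be produced as a plain CSV file. The following is a minimal sketch, assuming a hypothetical inventory with one row per component of the data; the column names, values and helper functions are illustrative and not a prescribed format.

```python
import csv
import io

# Hypothetical inventory: one row per component of the data.
# Column names and values are illustrative assumptions.
FIELDS = ["component", "item_count", "collected_on", "themes"]

inventory = [
    {"component": "interviews", "item_count": 12,
     "collected_on": "2024-03-15", "themes": "issue X; study habits"},
    {"component": "photos", "item_count": 40,
     "collected_on": "2024-04-02", "themes": "campus spaces"},
]


def inventory_to_csv(rows):
    """Return the inventory as CSV text that opens in any spreadsheet program."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def find_by_theme(rows, keyword):
    """Return the names of components whose themes mention the keyword."""
    return [r["component"] for r in rows if keyword.lower() in r["themes"].lower()]


csv_text = inventory_to_csv(inventory)
print(csv_text)
```

Recording themes for each component means a simple keyword search, such as `find_by_theme(inventory, "issue x")`, can later point you to the right part of the data.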

In practice, you keep track of what you did, when, how, why, and with whom.

Documentation helps you find, for example, the one interview among those you have conducted that dealt with issue X. Key parts of documentation include writing down explanations of the variables you use and taking notes on each stage of data editing and analysis.

Working this way makes your work more considered, systematic and structured; in other words, you do better science. You make your own progress easier when you know what you are doing and how to find what you need in your data at any given time. You can also return to your data later without problems or, if permitted, make it available to other researchers in a form they can understand.

Different disciplines may have their own practices related to documentation. Chemistry students can look at

Metadata as part of the documentation process

Note that although you probably cannot define all the metadata relevant to your research accurately in the early stages, you should plan it as precisely as you can. This way, you can describe your data logically as you work, and your files and folders will not get mixed up.

  • Documentation includes recording and updating the metadata of the data, i.e. the basic description data.
  • In practice, you need descriptive and technical information to accompany your research data. Together these form the metadata of your data. In other words, metadata is information about information, in this case descriptive information about your research data.
  • If your data is suitable for publication or archiving after the thesis has been completed, you will need metadata for this.

Depending on the type of dataset, metadata includes, for example, the name given to the dataset, the author, and when, where and how the data was collected. How you name folders and files also matters. For example, the latest version of a file is easy to identify when you always name versions in the same consistent way.
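
Consistent naming can be illustrated with a short sketch. It assumes a hypothetical convention of the form `<topic>_<YYYY-MM-DD>_v<NN>.<ext>`; the pattern and the helper functions `build_name` and `latest_version` are examples only, and any convention works as long as you apply it consistently.

```python
import re
from datetime import date

# Hypothetical naming convention: <topic>_<YYYY-MM-DD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<topic>[a-z0-9-]+)_(?P<date>\d{4}-\d{2}-\d{2})_v(?P<version>\d{2})\.(?P<ext>\w+)$"
)


def build_name(topic, day, version, ext):
    """Build a file name that follows the convention above."""
    return f"{topic}_{day.isoformat()}_v{version:02d}.{ext}"


def latest_version(names, topic):
    """Return the highest-versioned file name for a topic, or None if none match."""
    matches = []
    for name in names:
        m = NAME_PATTERN.match(name)
        if m and m.group("topic") == topic:
            matches.append((int(m.group("version")), name))
    return max(matches)[1] if matches else None


files = [
    build_name("interviews", date(2024, 3, 15), 1, "csv"),
    build_name("interviews", date(2024, 3, 20), 2, "csv"),
    "interviews final NEW.csv",  # does not follow the convention, so it is ignored
]
print(latest_version(files, "interviews"))
```

Because the version number is zero-padded and always in the same position, the latest file can be found mechanically, and names that break the convention stand out.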

Metadata ensures that you or a later user can find everything needed and interpret the data unambiguously, regardless of when and in what context it is used. Suppose you had only the data you collected, with no explanatory information, and you took a break from writing your master's thesis: would you still remember after a month what you had been doing and what each file contained? And if you handed the data over to a research group without explaining the variables and abbreviations you used, would the group be able to understand it at all? You can think of documentation and metadata as a kind of reading and user guide, for example a "readme" file.
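
As a rough illustration of such a readme, the sketch below assembles plain text from a small metadata record and a list of variable explanations. All field names and values are hypothetical placeholders, and the layout is only one possible choice.

```python
# Hypothetical metadata record; every value below is a placeholder.
metadata = {
    "title": "Student interviews on study habits",
    "author": "A. Student",
    "collected": "2024-03-01 to 2024-03-31",
    "location": "Campus library",
    "method": "Semi-structured interviews, audio recorded and transcribed",
}

# Explanations of the variables and abbreviations used in the data files.
variables = {
    "resp_id": "Anonymised respondent identifier",
    "yr": "Year of study (1-5)",
}


def build_readme(meta, var_notes):
    """Return plain-text readme content describing the dataset."""
    lines = [meta["title"], "=" * len(meta["title"]), ""]
    for key in ("author", "collected", "location", "method"):
        lines.append(f"{key.capitalize()}: {meta[key]}")
    lines += ["", "Variables and abbreviations:"]
    for name, explanation in sorted(var_notes.items()):
        lines.append(f"  {name}: {explanation}")
    return "\n".join(lines) + "\n"


readme_text = build_readme(metadata, variables)
print(readme_text)
```

The resulting text can be saved next to the data files, so that the description travels with the data.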

What if you use ready-made data? For archived, ready-made data, the archive has already taken care of the descriptions, and you can find them in the archive's data catalogues. Note, however, that you must describe your own tabulations and other material derived from the ready-made data yourself. In your own dataset, you can store both the research data and the files containing its descriptive data in the same place.

Remember to refine your plan throughout the research process, i.e. also after the course!

Examples of metadata

  • There are many types of metadata, and different metadata may be required in different situations.
  • If some information has already been disclosed in the research plan, it does not need to be disclosed again in the data management plan.
  • Keep in mind that although you probably cannot define all the metadata relevant to your research accurately in the early stages, you should plan it as precisely as you can.

More comprehensive information on metadata

This section relates to all FAIR principles: Findable, Accessible, Interoperable, and Reusable.