
Data management plan - Opening, publishing, and archiving
As discussed in the planning of metadata, the basic descriptive metadata must always be published and made openly available in a FAIR metadata catalogue. This can take place during almost any phase of the research, as long as the data is at a stage where it can be described in sufficient detail.
The easiest way to publish the metadata is typically through the University of Jyväskylä's own services. In your DMP, anticipate and schedule when and where you will publish the metadata.
Importantly, you can publish metadata even during the research project, before the data themselves become ready. This way you will benefit the most from the published metadata. You can cite it and use it to showcase your work to the world!
Example: “Metadata entries of the data will be published immediately in the University's JYX publication repository when they are considered sufficiently complete, even if the data itself is not yet public. (For a description of the creation and curation of metadata entries, please see Section 3 of this plan.) The metadata will then be searchable in e.g. the national Etsin and Research.fi metadata catalogues.”
It is important to be able to assess which parts of the data can and should be published either openly or with restricted access. If the data sets are not published, a valid justification must be provided in the DMP. For example, the data may contain non-anonymised personal data, and removing these would cause the data to lose their significance.
For the data to be published, it is advisable to identify a suitable repository already in the planning phase, as the repository may have requirements related to the format and quality of the data, which should be considered as early as possible. The repository should adhere to the FAIR principles as closely as possible. In other words, it should provide at least a landing page and a persistent identifier for the data.
It is also good to set a timeline for the publication of the data - and remember that while embargoes and delayed publication protect the researcher’s rights to be the first to use their data, they should not be unnecessarily long.
In the planning phase, it is also good to remember that archiving is an option for ensuring openness and accessibility. In this case, it is even more important to plan the collection and handling of the data carefully and in accordance with the chosen archive’s guidelines, for example, regarding notifications, file formats, and documentation. Some archives refuse to accept data sets that do not comply with these guidelines.
It is important that the opening and archiving plans do not conflict with other intentions or constraints (e.g., personal data) and that they are considered when informing the subjects. A common mistake is to state that the data will be anonymised and opened, even though anonymisation is impossible - or to state that the data will be anonymised and opened while retaining an identifiable or pseudonymised copy (in which case the data is not anonymous by definition).
Examples: “Dataset(s) themselves, complete with full description of methods, will be published in [JYX / a certified, field-specific repository / Zenodo].”
“Sensitive parts of the data will be anonymized and published along with other data.”
“Sensitive parts of the data cannot be anonymized and thus cannot be openly published. Sensitive data will be stored in JYX, only the metadata will be public, and access to the data can be requested and granted on certain conditions [describe the procedure and terms on which access and right to use the data can be granted].”
“Sensitive parts of the data cannot be stored and will be disposed of after the project is finished.”
In the DMP, it is also good to consider the potential long-term preservation of the data. However, this is often the responsibility of the entity that archives or publishes the data. The researcher can facilitate this by
- choosing a proper archive or by
- suggesting that once published in the JYX repository or deposited in the future organisational data archive of the University of Jyväskylä, the data should further be included in the Fairdata-PAS long-term preservation environment provided by CSC, within JYU’s quota. The Research Data Services at the Open Science Centre can help with this.
Finally, the timing of the destruction of any parts of the data that are not planned for longer-term re-use should also be included in the DMP.