Before fieldwork 2

Researcher’s preparations Part 2/3

This blog is continuation to my previous blog which discussed Steps 1-3 of researcher’s preparations before fieldwork. After having completed steps 1-3, it is time to prepare Data Management Plan (step 4), setting up your own data infrastructure

for storing of research data (step 5) and asking informed consent from research participants (step 6). This blog post discusses Step 4, Data management plan in detail. Steps 5 & 6 will be discussed in a separate part 3.

Let’s first refresh our memory of earlier steps

*Original photo by Hermann Traub from Pixabay*

In order to be able to prepare a good Data Management Plan, I would recommend that a researcher completes necessary courses in Step 1, checks the requirements for Ethical review in Step 2, and prepares documents for the purpose of acquiring informed consent in Step 3.

Having gone through these steps, you’ve already done a good deal of work that you then insert in the Data Management Plan. It is a good document that forces you to plan in what format you store your data from the field and where you store it, during and after research has been completed. It also forces you to think about how to store your data so that it can be accessible to other researchers, if you decide to open the data one day. Anyhow, it is important to be clear and consistent that if any researcher should wish to check your data, they can.

Step 4. Prepare Data Management Plan

Data Management Plan (DMP) is a living document where researcher describes how data will be managed throughout the research life cycle. It is a good idea to find out what DMP template your university uses, and use the same one. If you are applying for funding in the near future, some funders have their own template that you can also use. It is a good idea to take a look at already existing DMP’s that researcher’s share with public.

I’ve used DMP Tuulia (link) to prepare my own DMP. You can take a look of these public plans (link). Usually the template’s have good instructions for each section, but it is useful to look at ready-filled plans to get an idea what needs to be written in the plan. If you have completed steps 1-3, your should have a pretty good idea on how to complete a DMP. DMP is a document that is modified according to how the research proceeds. Once you have completed first version of it, book a time with your thesis advisor and go through it together.

You can take a look at my university’s pages for guidance for data management (link). Your university should have similar guidance.

The DMP Tuulia format includes five sections:

1. General description of the data

This part provides information about what kind of data research is based on and how the consistency of data will be controlled. You can see here an example of how I have described my data at this point of my research.

It is important here to understand the significance of personal identifiers. Data that includes personal identifiers needs to be stored according to GDPR laws. As my research includes longterm observational data and all daily observations can be linked to a certain research participant, these observations are considered personal data and the processing and storage of the data needs to take place in a secured data processing environment.

I have also specified that the data may include health information – health information belongs to special categories of personal information and needs to be stored as confidential research data. Not all databases are suitable for this purpose. You need to find out from the research services of your university which data storage location is safe for this purpose.

2. Ethical and legal compliance

This part describes the legal issues related to data management and how the management of rights to use, produce and share the data is handled

As you can see here, I’ve written this in detail. I think it is good to be precise. Especially when it comes to the direct and indirect identifiers which are considered personal data, the purpose of DMP is to plan ahead so that there is no risk of storing the data in a risky way. I will discuss this in detail later blog post.

3. Documentation and metadata

This part concerns your plans on following the FAIR principles of open science, so that it is Findable, Accessible, Interoperable and Re-usable. Metadata is the data that describes the nature of the data that you collect.

When you start storing documents in formats such as docx., .pdf., or jpg., you need to store these systematically in folders so that they are easily findable. When you create the folders and store documents in the folders, pay attention to the naming of the folders. It is very useful that at the same time you prepare the descriptive and structural metadata text. files that helps the reader to understand what your documents are about and how they relate to each other.

So, for every folder and subfolder it is wise to prepare .txt file labeled README (to get the attention of the person looking at the screen) and describe to your best understanding what data the folder contains. Here one example of mine.

4. Storage and backup during the research project

This part simply states where your data will be stored, how the data will be backed up, and who is responsible for controlling access to your data, and how will secured access to it be secured. Usually, in order to ensure a safe access, a username and password required access is recommended. In my case, it is obligatory as my research may include sensitive data.

5. Opening, publishing and archiving data after the research project

This part is to help you think beforehand what parts of your research data are made openly available for other researchers and general public. This part is something that you can elaborate along the way, but it is good to give it some thought in the early phases of your research.

Metadata (descriptions of what kind of data your research processes) can be made public via your university’s publication repository. At the time of writing this blog post, I haven’t yet made public the metadata in my university’s JYX publication repository. It is possible, anyhow, to prepare metadata descriptions that are open within your university researchers and staff in case you’re not quite there yet to publish it to larger public.

You can also give some thought to whether you want to open your research data or parts of it to larger public once your research is finished. It is possible to do so by placing them in a data archive. Here (link) is an example of ethnographic field notes that were offered to Finnish Social Science Data archive.

If you wish to offer your empirical data to a repository, ask them if they can do the process of data anonymization as a paid service for you. In case of Finnish Social Science Data archive, a researcher can offer their data to the repository, but the final decision of acceptance to the archive will be made by the Data archive. They will check that it is not possible to identify the research participants from the data. In case this is not possible, they will not accept the data to the archive.

6. Data management responsibilities and resources

This part states the responsible parties for data management and any costs for data management. In my case as I am not employed by the university but a grant researcher, I am fully responsible for data management procedures. I may use the university’s secure databases to store my data, but I am responsible for the process of data management.

When it comes to the costs of data management procedures, my research costs include fees for transcription services. Other possible resources that I will need are the fees that Finnish Social Sciences Archive charges for the anonymization process of the empirical data.

I hope this blog was useful to you! In the next blogs I will describe how to set up a researchers data infrastructure, especially in the case you are a grant researcher and not employed by the university for the whole duration of your research process.

Please do leave comments below, good tips or corrections to my text should you find any! Thank you for reading!

Doctoral researcher

Saara Leppänen