FSD Bulletin is the electronic newsletter of the Finnish Social Science Data Archive. The Bulletin provides information and news related to the data archive and social science research.


Making Data Repositories Visible

Jussi Simpura, Director General, National Institute for Health and Welfare

Each year, millions of euros of public money are spent on various data repositories in Finland. These may contain statistics and register data, sample data based on medical examinations or environmental measurements, interview and survey data, or information generated by the activities of companies or organisations. At the moment, there are substantial ongoing projects aimed at developing research infrastructures, including data repositories. Data repositories are potential research material in all their forms, but a major part of them are created and used in activities other than research.


The National Institute for Health and Welfare started operating at the beginning of 2009, after the merger of the National Public Health Institute (KTL) and the National Research and Development Centre for Welfare and Health (STAKES). One of the first fruits of the merger was a thorough mapping of the central data repositories hosted by the two merging parties, which resulted in a 50-page data catalogue. The National Institute hosts a wide variety of databases and repositories produced on very different bases, although most originate from research or from statutory health care registers. Because of this variety, the National Institute is an excellent testing ground for making publicly funded data repositories in research institutes visible - first to the institute itself and its experts, and then to everyone interested in using the data. It remains to be seen how this will be reflected in the use of the data.

Especially in the field of research, data repositories are often treated as if they were the personal property of their collectors and users. In addition to the official regulation related to data protection and research ethics, the use of data is also regulated unofficially. In this internal regulation, the terms of use (i.e. who is allowed to use which data and for what purposes) are often agreed upon very strictly and formally. Therefore, it is still a long way from making data repositories visible to utilising them on a broad scale. But the journey cannot even begin if nobody knows about the data resources hosted by different institutes.

The ongoing and apparently long-continuing reform of sectoral research has, at least indirectly, started from the idea of increasing the research use of data repositories. Wider use of data is in fact necessary for genuine competition for funding between the consortia formed by various research institutes, which is one of the focal points of the reform. It is difficult to write a high-quality research plan if one is unaware of what kind of data are hidden in the bowels of various research institutes and how they could be utilised.

The restrictions on the use of data that are based on data protection and research ethics regulation must be maintained. In addition to these, the re-use of already collected data is restricted by several unofficial practices, such as the above-mentioned habit of considering data as the property of their collectors. These kinds of practices are easier to change than laws or ethical principles. The first step in this kind of change is, again, making data repositories visible.

In the long run, making data visible can operate like the EU's famous open method of coordination: there is no need for official actions, because the activities are steered towards a better state by the sheer power of comparison. A couple of decades ago, the principle of openness, glasnost, led to the same thing in the Soviet Union. Visibility is a strong force: it is difficult to stop what it has once set in motion!