Number 3 November 2000

In pursuit of qualitative data

By Arja Kuula

An enormous amount of qualitative data has been collected in Finland, especially during the last two decades. Where is it all now? Archived at universities and research units? Part of it, maybe. A large part can also be found in nondescript piles of paper in warehouses, along with antiquated computer hardware and accessories. Some researchers may have "archived" data at home or in their summer house. Unfortunately, a very considerable amount of research data has surely been destroyed by now, or has become so dispersed and muddled that it would take extra resources to organise it for appropriate storage and reuse.

What significance or value do these data, collected years ago, actually have? Defining research value is difficult but not impossible. One might try, for example, by picturing a researcher who analyses the cultural significance of the family in Finland. One can, of course, study the changes that have occurred in the significance of the family during the past couple of decades by acquainting oneself with earlier studies and various cultural products. But equally interesting - and more exciting - would be to have at one's disposal interview data from 20 years ago dealing with the same questions as the new material.

In studies of social and cultural change, comparing interviews from different time periods would surely bring new angles to the study of cultural products and so make that change easier to understand.

At times, research questions may become considerably sharper if they also incorporate a time dimension established through an analysis of earlier datasets. Sometimes even a minor investment of time and money in obtaining a parallel dataset adds considerably to the substance of the research.

In spite of this, research data are not seen as a common resource of the scientific community in our country. Co-ordinated reuse of qualitative datasets is a relatively new idea both in Finland and elsewhere in Europe. An exception to the rule is Qualidata in the Department of Sociology at the University of Essex, where a new research culture of archiving, storing, distributing and reusing data has been developed since the 1990s.

Qualidata: a pioneering unit

Qualidata was set up in 1994 to co-ordinate the archiving and reuse of qualitative data in the social sciences. It was established and financed by the ESRC (Economic and Social Research Council), an institution that also started the archiving and reuse of quantitative datasets in the 1960s. The archive for quantitative data has been one of the main partners in the development of archiving qualitative data.

Qualidata started its operation by canvassing significant qualitative social science datasets that were misplaced, unarchived and unaccounted for, some of them also in imminent danger of being destroyed. Even the first round of collecting helped to locate tens of collections to be archived and reused, among them extensive datasets by Peter Townsend, Ray Pahl and George Brown. The Qualidata collection includes the qualitative data of the study "The Affluent Worker", known also in Finland as one of the classics of work research. At the moment, the materials of the Tavistock Institute, collected since its founding in 1946, are being archived and described.

The tasks of Qualidata include finding suitable repositories, supervising the archiving process and, after that, documenting the data: information on the datasets is entered in a database that covers over a hundred dataset collections. The database can be browsed and searched on the unit's internet pages. In addition, researchers are advised on how to plan data collection with later archiving in mind. Secondary users are in turn helped to find suitable datasets. Qualidata has also prepared teaching packs for university teaching.

The change won't happen overnight

Qualidata has operated for six years, but only recently have researchers grown to think of archiving and secondary use of qualitative data as something normal, an everyday practice. Research culture - just like any other culture - does not change all of a sudden.

A major influence in perceiving datasets as a common resource of the scientific community was the requirement, introduced in 1996, that the datasets of all projects financed by the ESRC must be offered for archiving. The requirement to OFFER datasets for archiving and possible later secondary use does not mean that all datasets will actually be archived or that they would all be suitable for secondary use. What it means is that within the scientific community, datasets are more and more perceived as something public and shared, just as the studies are.

Why was a requirement to archive introduced? And what has been the reaction to the notion of handing over one's datasets to other researchers once one no longer needs them? The director of Qualidata, Louise Corti, says that the grounds for the requirement were the same as with quantitative data earlier.

Louise Corti - First of all, all types of research materials are perceived as research, as the basis of cumulative knowledge about society. Secondly, collecting qualitative material is often very time-consuming, and it is thus not sensible to collect new material if a completed dataset already exists that answers the research questions or parts of them. Thirdly, most research is publicly funded. Therefore it feels quite natural that research data, too, should be available to the scientific community when their original collector no longer needs them for his or her own purposes.

Louise Corti says that the ESRC requirement to offer the research data of all projects it had financed for archiving created an uproar among researchers. Proponents and vehement opponents were found in all disciplines and fields of learning. One argument against archiving and reusing datasets was - and is - that the material is so tied to its context and to the collector's special knowledge of the data that an outsider cannot understand and analyse the dataset properly. Understandably, anthropologists in particular stress this. Ms Corti wishes to point out that anthropological datasets, too, help us learn and understand at least the way we interpret foreign cultures and carry out anthropological research. Often anthropologists at the end of their career also wake up to a horror scenario in which all the material they have collected falls into oblivion and is neither seen nor used by future generations of researchers.

- One example is Paul Stirling, who wanted to give everybody a chance to use the field material he had collected over several decades in a Turkish village. All means of identifying individuals were removed and the material was converted into digital form. On the internet pages of the Stirling dataset, there are photographs and video clips from the research village, original field notes and research texts based on the material. The website is well visited and used in anthropological studies all over the world. The internet pages are a concrete example of what modern technology can offer in the use of qualitative datasets.

At times, there are also grounds for opposing secondary use of datasets. For instance, when interviews deal with delicate or sensitive matters, even finding people to interview may be difficult, and the process calls for carefully constructed terms of confidentiality agreed upon in advance. On the other hand, researchers invoke confidentiality and the principle "this dataset is only at my disposal and nobody else's", presuming that this very thing makes it an extraordinarily trustworthy and informative source. Corti has noticed, however, that afterwards the interviewees often readily grant written permission to archive and reuse the dataset. When agreeing to sometimes quite long interviews, people invest so much of their time and intellectual energy in the topic at hand that they wish their effort to be appreciated and used in research as much as possible.

The nature of some research materials is such that they cannot under any circumstances be handed over for secondary use. Examples are datasets on criminality and, more generally, datasets whose misuse might in some way endanger or harm the lives of the people interviewed. However, if the people studied have consented to the archiving of the dataset and its secondary use, the dataset can be archived after the completion of the study and closed for a certain period of time. The delicacy of some matters diminishes considerably over time. It is also possible to remove all identifiers from the datasets. For all secondary use, the most essential thing is to check the research plan of the prospective secondary user of the dataset. Often personal negotiations between the original collector of the dataset and the secondary user are also necessary. In any case, according to Ms Corti, researchers cause minimal damage, and extremely seldom at that, compared with the media, which at times interpret research results both wrongly and very loudly.

The best way to get acquainted with the operation and datasets of Qualidata is to visit its internet pages. If a Finnish researcher finds a dataset in the Qualidata catalogue that suits his or her research interests, it is possible to apply to ECASS (European Centre for Analysis in the Social Sciences) for a grant to finance a visit to the archive and to cover copying costs.

Are there treasures in your cupboard?

Maybe you have datasets you no longer need but which would make a substantial addition to someone else's research material? A good dataset remains a good one even after it has been extensively and thoroughly analysed in the original research context. No dataset is ever completely exhausted. With this in mind, the FSD is launching an effort to canvass, document and describe existing qualitative datasets in the Finnish social sciences. The FSD's prospect for the future is that the archive will eventually have permission and the necessary prerequisites to archive qualitative datasets in electronic form as well.

