FSD Bulletin

Issue 20 (1/2007)

ISSN 1795-5262

FSD Bulletin is the electronic newsletter of the Finnish Social Science Data Archive. The Bulletin provides information and news related to the data archive and social science research.


Finnish Social Science Data Archive
Tel: +358 40 190 1432
Fax: +358 3 343 9088
E-mail: fsd@uta.fi

Modern Statistical Software Provides New Possibilities for Longitudinal Research

Mari Kleemola

Anna-Liisa Lyyra

Researcher Anna-Liisa Lyyra joined the research team of the Jyväskylä Longitudinal Study of Personality and Social Development (JYLS) in 2003. She is the team's statistical expert and is responsible for maintaining and documenting the data. Lyyra has been working with longitudinal data since the 1980s, and has been in a good position to witness the rapid development of technology, software and research methods.
- Twenty years ago it took a long time to obtain a large polychoric correlation matrix from a computer, but nowadays a PC will do it in a split second.

Many statistical ideas have been around for a long time but have only become practicable now that powerful computing capacity is available.
- Estimation methods and techniques have become much more sophisticated. There is no longer any need to always assume the standard normal distribution. In the past we could only dream of statistical software packages such as Mplus, which are suited to longitudinal multivariate analysis, Lyyra enthuses.

Anna-Liisa Lyyra is also pleased with the improved graphics of modern statistical software systems. She thinks they greatly increase the possibilities of in-depth analyses of data, in addition to making it easier to handle files or analyse text data.
- There is plenty to learn. New systems offer more capabilities and features than I have time to use.

Lyyra describes the data collected for the JYLS as rich and varied. The data have been collected in various stages, using a variety of methods, with very low non-response. The team has been able to complete missing observations using these sources.
- New imputation procedures function well when there is a variety of background information. Using imputation, we have been able to avoid the loss of data through missing observations that is typical of longitudinal studies. The number of respondents in the JYLS is sufficient for statistical analysis but not so large that the quality of the data cannot be maintained.

JYLS data well taken care of

Due to lack of resources, part of the collected data has not been entered. However, all material has been preserved. This has enabled the research team to enter, for example, responses to open-ended questions at a later date, using the preserved questionnaires. Non-response has not been a problem, with less than 10% of respondents refusing to participate.
- After each data collection round we have compared the central background variables with Statistics Finland data. The samples have been representative of the age cohort in Finland.
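A comparison of this kind can be illustrated with a chi-square goodness-of-fit check. The sketch below is only an assumption about how such a check might look, not the JYLS team's actual procedure. The variable, its categories and all the figures are invented for illustration and are not JYLS or Statistics Finland data.

```python
# Hypothetical figures only: neither JYLS nor Statistics Finland data.
# Counts of respondents per category of one background variable.
sample_counts = {"basic": 90, "secondary": 130, "tertiary": 60}
# Assumed shares of the same categories in the whole age cohort.
population_shares = {"basic": 0.30, "secondary": 0.48, "tertiary": 0.22}


def chi_square_gof(counts, shares):
    """Return the chi-square goodness-of-fit statistic for observed counts."""
    n = sum(counts.values())
    statistic = 0.0
    for category, observed in counts.items():
        expected = n * shares[category]
        statistic += (observed - expected) ** 2 / expected
    return statistic


statistic = chi_square_gof(sample_counts, population_shares)

# Critical value of the chi-square distribution with 2 degrees of freedom
# (three categories minus one) at the 5% significance level.
CRITICAL_5PCT_DF2 = 5.991
no_significant_deviation = statistic < CRITICAL_5PCT_DF2
```

A statistic below the critical value means the sample distribution does not differ detectably from the assumed cohort distribution on that variable. It does not prove representativeness on other variables.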

Storing, maintaining and archiving the JYLS data has been a challenge. The primary data include over 9000 numerical variables, interview data in text form, life history calendars and laboratory test results. Collection stages, research instruments, and non-response analyses have been documented both in Finnish and English. FSD is transferring the documentation into DDI-XML format.

Confidentiality a primary issue

Anonymity of the research subjects has been well safeguarded. The JYLS data do not contain names or addresses. At the very beginning, each participant was given a three-digit identification number which can be used to merge data.
- To protect confidentiality we store personal information in a safe, separate from other data. In addition, we have tried to remove all identifiers, such as the names of family members, workplaces and schools, from the data. Only researchers working on the project have access to the files, Lyyra says.
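Merging data by identification number, as described above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual file layout: the records, the variable names and the helper `merge_waves` are all hypothetical. The only key is the three-digit number, so no names or addresses are needed to combine collection stages.

```python
# Hypothetical example records; the IDs, variable names and values are invented.
# Each collection stage is keyed by the three-digit identification number only.
wave_a = {
    "017": {"school_success": 4},
    "042": {"school_success": 2},
}
wave_b = {
    "017": {"employment_status": "employed"},
    "105": {"employment_status": "student"},
}


def merge_waves(*waves):
    """Combine records from several collection stages by participant ID."""
    merged = {}
    for wave in waves:
        for participant_id, values in wave.items():
            # Create the participant's record on first sight, then add the
            # variables from this stage.
            merged.setdefault(participant_id, {}).update(values)
    return merged


merged = merge_waves(wave_a, wave_b)
```

Participants missing from one stage simply lack that stage's variables in the merged record, which leaves the gaps visible for later non-response analysis or imputation.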