Descriptive metadata in DDI format

DDI started as a collaboration of data repositories

One of the most important prerequisites for data archiving and reuse is sufficient documentation of data, i.e. metadata. An international documentation format for research data was agreed upon as early as the 1970s, but because documentation needs and practices differed, many local “dialects” of data documentation emerged.

In 1995, however, ICPSR (Inter-university Consortium for Political and Social Research) founded a committee to develop a standard for the documentation of data in the social sciences. Its members had broad expertise in social-scientific research and documentation. The committee’s proposal for the new standard was named the Data Documentation Initiative (DDI).

The DDI standard is maintained and developed by the international DDI Alliance, hosted by the University of Michigan. Members of the DDI Alliance include data repositories and universities from all around the world as well as organisations developing computer programs for statistical analysis and data collection. Each member has a representative on the Alliance’s Scientific Board, which convenes once per year. DDI is open for use by anyone, but only members of the Alliance can contribute to the development of the standard. Earlier documentation standards, such as MARC, ISO 690-2, SDMX and Dublin Core, were used as references in the development of DDI.

XML format used for DDI

Data that are described using the DDI standard are documented in XML format. XML (Extensible Markup Language) is a file format meant for saving and presenting structured information on the internet. XML offers a hardware- and software-independent method for managing information. For an XML document, the allowed elements (sections of the document, e.g. titles and lists) and their relations, order and repeatability are defined in a Document Type Definition (DTD) or schema.

DDI-Codebook used at FSD

All data archived at FSD have been described using the DDI-Codebook format, which is suitable for the long-term preservation of the metadata of individual survey datasets. The elements of DDI-Codebook (approx. 300 in total) can be divided into five sections:

Document Description: includes e.g. bibliographical information of the metadata (i.e. the “codebook”)
Study Description: includes e.g. dataset creators, keywords, abstract, sampling procedures, data collection, units of observation, target population, terms of access
Data Files Description: includes e.g. data structure and format, number of variables, size of files, software
Variable Description: includes e.g. variable and value labels and question texts
Other Study-Related Material: includes information related to the data that is not described elsewhere
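
As a rough illustration of how these five sections appear in an XML file, the sketch below parses a heavily abbreviated codebook and lists its top-level sections. The element names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, otherMat) and the namespace come from the DDI-Codebook 2.5 schema. The sample content, the SECTION_NAMES mapping and the function name list_sections are made up for this example, and the sample has not been validated against the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated codebook; a real one holds far more detail
# and this sample is not guaranteed to be schema-valid.
SAMPLE = """\
<codeBook xmlns="ddi:codebook:2_5" version="2.5">
  <docDscr><citation><titlStmt><titl>Example codebook</titl></titlStmt></citation></docDscr>
  <stdyDscr><citation><titlStmt><titl>Example Survey 2020</titl></titlStmt></citation></stdyDscr>
  <fileDscr><fileTxt><fileName>example.csv</fileName></fileTxt></fileDscr>
  <dataDscr><var name="q1"><labl>Age of respondent</labl></var></dataDscr>
  <otherMat><labl>Questionnaire</labl></otherMat>
</codeBook>
"""

# Mapping from DDI-Codebook element names to the section names used in the text.
SECTION_NAMES = {
    "docDscr": "Document Description",
    "stdyDscr": "Study Description",
    "fileDscr": "Data Files Description",
    "dataDscr": "Variable Description",
    "otherMat": "Other Study-Related Material",
}


def list_sections(xml_text):
    """Return the readable names of the top-level sections, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for child in root:
        # ElementTree tags look like "{namespace}name"; keep only the local name.
        local_name = child.tag.split("}", 1)[-1]
        found.append(SECTION_NAMES.get(local_name, local_name))
    return found


if __name__ == "__main__":
    for section in list_sections(SAMPLE):
        print(section)
```

Elements that are not in the mapping are reported under their raw element name, so the function also works as a quick overview of an unfamiliar codebook file.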

FSD uses about a third of the elements of DDI-Codebook. All information elements are available in the DDI-Codebook 2.5 specification on the DDI Alliance website, along with guidelines for use.

DDI descriptions produced by FSD are also available in XML format.

DDI enables efficient discovery

Structured XML data descriptions aligned with the DDI standard significantly improve the findability of data. In addition, the XML files can easily be processed into different types of web and print publications.
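
The following minimal sketch shows one way such processing could look: it reads the study title and the variable descriptions from a DDI-Codebook file and renders them as a plain-text variable list. The element paths (stdyDscr/citation/titlStmt/titl, dataDscr/var, labl, qstn/qstnLit) follow the DDI-Codebook 2.5 schema. The sample data and the function name render_variable_list are invented for illustration, and this is not a description of FSD's own publishing tools.

```python
import xml.etree.ElementTree as ET

# Namespace used by DDI-Codebook 2.5 documents.
NS = {"ddi": "ddi:codebook:2_5"}

# Hypothetical, abbreviated sample; not validated against the schema.
SAMPLE = """\
<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr><citation><titlStmt><titl>Example Survey 2020</titl></titlStmt></citation></stdyDscr>
  <dataDscr>
    <var name="q1">
      <labl>Age of respondent</labl>
      <qstn><qstnLit>How old are you?</qstnLit></qstn>
    </var>
    <var name="q2"><labl>Gender</labl></var>
  </dataDscr>
</codeBook>
"""


def render_variable_list(xml_text):
    """Render the study title and one line per variable as plain text."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "ddi:stdyDscr/ddi:citation/ddi:titlStmt/ddi:titl",
        default="(untitled)",
        namespaces=NS,
    )
    lines = [title]
    for var in root.findall("ddi:dataDscr/ddi:var", NS):
        label = var.findtext("ddi:labl", default="", namespaces=NS)
        question = var.findtext("ddi:qstn/ddi:qstnLit", default="", namespaces=NS)
        line = f"{var.get('name')}: {label}"
        if question:
            # Append the question text only when the variable has one.
            line += f" ({question})"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_variable_list(SAMPLE))
```

The same parsed values could be written into an HTML template or a print layout instead; only the final formatting step would change.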

DDI-Lifecycle for managing the data lifecycle

DDI-Lifecycle offers solutions for the documentation and long-term preservation of comparative, panel and series data. In comparison with DDI-Codebook, DDI-Lifecycle has a much more extensive structure.