An Update on Archonnex - University of Michigan

SHARING DATA TO ADVANCE SCIENCE

Enhancing ICPSR metadata with DDI-Lifecycle
Sanda Ionescu
Metadata Specialist, ICPSR

Testing DDI-Lifecycle at ICPSR

ICPSR metadata: fully aligned with the DDI standard
- Study-level records
  - Compatible with both DDI-C (2.5) and DDI-L (3.1)
  - Included in the main search (ICPSR home page)
  - Available for download in both formats
- Variable descriptions in DDI 2.5
  - Drive the cross-study variable search, as well as the online comparison view and interactive codebooks
  - Also included in our PDF documentation
- DDI-L not yet used at ICPSR for data-level documentation

DDI-Lifecycle offers enhancements that increase the functionality of data-level documentation, especially for longitudinal or time-series studies
- New features supporting discoverability, comparability, and data harmonization across study waves and series of studies
  - Study grouping
  - Variable cascades
  - Linking to concept schemes
  - Reuse of information

January 2018: small-scale pilot project to mark up part of a time-series study using DDI 3.2
Project goals:
- Assess feasibility in terms of required resources
  - Staff time and skills
  - Software availability and performance
- Define and refine workflow
- Evaluate benefits of value-added documentation in terms of enhanced services to users

Selected data: National Crime Victimization Survey (NCVS) series
- Time-series study
- Survey based
- Popular, high-usage study
  - 15,000 to 25,000 downloads per individual wave in the last three years
- Data are publicly available
- Already well documented in DDI 2.5
  - Detailed and complete variable descriptions
- Highly regular data format across waves
  - Allowed for shortcuts in generating new markup

National Crime Victimization Survey
- Administered by the Bureau of Justice Statistics
- Started in 1973 as the National Crime Surveys (NCS)
- Information on the frequency, characteristics, and consequences of non-fatal personal crimes and household property crimes, both reported and not reported to the police
- Complex data collection (64 studies to date)
  - Main survey, released annually
  - Supplements: School Crime, Identity Theft, Police-Public Contact, Work Place Risk, etc.

National Crime Victimization Survey annual release
- Five record types
  - Address
  - Household
  - Person
  - Incident (year of occurrence)
  - Collection Year Incident (year of reporting)
- Pilot project to create DDI 3.2 documentation for the incident-level (year of occurrence) files released in six consecutive waves, 2011-2016
  - Survey-based data
  - Approximately 950 variables per year

Starting point: variable descriptions in DDI 2.5
- Native markup, created specifically for codebook generation
- Rich content
  - Variable groups
  - Question text
  - Variable descriptive text
  - Variable notes
  - (No frequencies or summary statistics)
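The content types listed above (variable groups, question text, descriptive text, notes) can be read out of a DDI 2.5 file with a few lines of Python. The following is a minimal sketch, not the workflow used in the pilot: it assumes the standard DDI-Codebook 2.5 namespace and element names (`var`, `varGrp`, `labl`, `qstnLit`, `txt`, `notes`), and the sample document, variable names, and helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of DDI-Codebook 2.5 documents.
NS = {"ddi": "ddi:codebook:2_5"}

# Hypothetical sample; real NCVS markup is far larger and richer.
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <varGrp ID="VG1" var="V1 V2">
      <labl>Hypothetical group</labl>
    </varGrp>
    <var ID="V1" name="EXAMPLE1">
      <labl>Hypothetical variable label</labl>
      <qstn><qstnLit>Hypothetical question text?</qstnLit></qstn>
      <txt>Hypothetical descriptive text.</txt>
      <notes>Hypothetical note.</notes>
    </var>
    <var ID="V2" name="EXAMPLE2">
      <labl>Second hypothetical variable</labl>
    </var>
  </dataDscr>
</codeBook>"""


def clean(node):
    """Return the whitespace-normalized text of an element, or None."""
    if node is None:
        return None
    text = " ".join("".join(node.itertext()).split())
    return text or None


def read_variables(xml_text):
    """Collect the descriptive content of every <var> in a DDI 2.5 document."""
    root = ET.fromstring(xml_text)

    # Map each variable ID to the labels of the groups that list it.
    groups = {}
    for grp in root.iterfind(".//ddi:varGrp", NS):
        label = clean(grp.find("ddi:labl", NS))
        for var_id in grp.get("var", "").split():
            groups.setdefault(var_id, []).append(label)

    records = []
    for var in root.iterfind(".//ddi:var", NS):
        records.append({
            "id": var.get("ID"),
            "name": var.get("name"),
            "label": clean(var.find("ddi:labl", NS)),
            "question": clean(var.find("ddi:qstn/ddi:qstnLit", NS)),
            "text": clean(var.find("ddi:txt", NS)),
            "notes": [clean(n) for n in var.findall("ddi:notes", NS)],
            "groups": groups.get(var.get("ID"), []),
        })
    return records


records = read_variables(SAMPLE)
for rec in records:
    print(rec["name"], "|", rec["label"], "|", rec["groups"])
```

Keeping each field separate in the extracted records makes it easier to check that nothing is lost when the content is later moved into a different structure.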

Objective: preserve all content while transitioning to DDI 3.2

Software used:
- TextPad
  - Clean up / edit original DDI 2.5
- Nesstar Publisher
  - Review and enhance the edited DDI 2.5
- Colectica Designer
  - Re-organize content according to the DDI-L structure
  - Enhance data description with DDI 3.2 specific features
  - Generate DDI 3.2 documentation for the project

Steps:
- Original DDI edited in TextPad to achieve overall consistency and accuracy
- Resulting file imported into Nesstar Publisher
  - Error check and validation
  - Browse and review metadata in user-friendly interface
  - Import data file to generate frequencies and summary statistics
  - Export well-formatted and valid DDI that uploads seamlessly into Colectica Designer

Using Colectica Designer
- Imported DDI-C for individual waves
  - Most variable-level fields successfully imported, except textual descriptions and notes
- Created a Series entry to group all individual waves
- At Study Group / Series level, created
  - Concept scheme
  - Conceptual Variable scheme
  - Represented Variable scheme
- Descriptive text and notes were extracted from original DDI-C and added to the Represented Variable descriptions

Define relationships:
[Diagram: Series, Concepts, Conceptual Variables, and Represented Variables linked to the variables of each wave: Wave 1 Variables (2011 Study), Wave 2 Variables (2012 Study), Wave 3 Variables (2013 Study), Wave 4 Variables (2014 Study), ...]
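The relationships above can be pictured as a chain in which each wave-level variable points to a represented variable, which points to a conceptual variable, which points to a concept. The following is a simplified in-memory sketch of that chain, not the DDI 3.2 schema or Colectica's data model; all class names, field names, and sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str


@dataclass(frozen=True)
class ConceptualVariable:
    name: str
    concept: Concept


@dataclass(frozen=True)
class RepresentedVariable:
    name: str
    conceptual: ConceptualVariable
    description: str = ""  # descriptive text and notes carried over from DDI-C


@dataclass(frozen=True)
class WaveVariable:
    study: str  # e.g. "2011 Study"
    name: str
    represented: RepresentedVariable


def variables_by_concept(variables):
    """Group wave-level variables under the concept at the top of their chain."""
    index = defaultdict(list)
    for var in variables:
        concept = var.represented.conceptual.concept
        index[concept.name].append((var.study, var.name))
    return dict(index)


# Hypothetical content: one concept shared by the same variable in two waves.
concept = Concept("Hypothetical concept")
conceptual = ConceptualVariable("Hypothetical conceptual variable", concept)
represented = RepresentedVariable("Hypothetical represented variable", conceptual,
                                  description="Text reused by every wave.")
wave_vars = [
    WaveVariable("2011 Study", "VAR_A", represented),
    WaveVariable("2012 Study", "VAR_A", represented),
]

by_concept = variables_by_concept(wave_vars)
print(by_concept)
```

Because both wave variables reference the same represented variable object, the description is stored once and reused, which is the kind of reuse of information the series-level schemes are meant to support.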

Explore state-of-the-art tools for data discovery, comparison, and harmonization using DDI-L enhanced metadata
- Variable concordance and comparison views
- Complex, multi-level interactive searches
  - Variables within study or across studies
  - Variables by concept
  - Variables by conceptual or represented variable
  - Variables by variable group
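A variable concordance view of the kind mentioned above can be thought of as a table with one row per represented variable and one column per study, showing which wave-level variable (if any) realizes it. The following is a minimal sketch under that assumption; it is not how any particular tool builds its views, and the function names and sample rows are hypothetical.

```python
def concordance(rows):
    """Build a concordance table from (study, variable, represented variable) rows.

    Returns the sorted list of studies and a dict mapping each represented
    variable to a list of wave-level variable names, aligned with the studies
    (None where a study has no matching variable).
    """
    studies = sorted({study for study, _, _ in rows})
    cells = {}
    for study, var_name, represented in rows:
        cells.setdefault(represented, {})[study] = var_name
    table = {
        represented: [by_study.get(study) for study in studies]
        for represented, by_study in cells.items()
    }
    return studies, table


def format_concordance(studies, table):
    """Render the concordance as tab-separated text, with '-' for gaps."""
    lines = ["\t".join(["Represented variable"] + studies)]
    for represented, names in table.items():
        lines.append("\t".join([represented] + [name or "-" for name in names]))
    return "\n".join(lines)


# Hypothetical rows: the second represented variable is missing from one study.
rows = [
    ("2011 Study", "VAR_A", "RepVar 1"),
    ("2012 Study", "VAR_A", "RepVar 1"),
    ("2012 Study", "VAR_B", "RepVar 2"),
]

studies, table = concordance(rows)
print(format_concordance(studies, table))
```

Gaps in a row show at a glance where a variable was added or dropped between waves, which is the starting point for comparison and harmonization work.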

Questions

Thank you!
[email protected]

