ICMI Grand Challenge Workshop on Multimodal Learning Analytics
Math Data Corpus Download


The First International Conference on Multimodal Learning Analytics represented an initial intellectual gathering of multidisciplinary scientists interested in this new topic. The Second International Workshop on Multimodal Learning Analytics brings together an international collection of researchers to further advance research on multimodal learning analytics with a data-driven grand challenge event.

In support of this event, the Math Data Corpus and related coding resources are made publicly available for community use by Incaa Designs. In addition, the ChronoViz multimodal data analysis tool is supported by UCSD for workshop grand challenge participants who wish to use it.

The Math Data Corpus contains high-fidelity, time-synchronized multimodal recordings of collaborating groups of students as they work together to solve mathematics problems of varying difficulty. Data were collected on students' natural multimodal communication and activity patterns, including their speech, digital pen input, facial expressions, gestures, and physical movements.
The dataset includes 12 sessions, from six three-student groups who each met twice. In total, approximately 29 student-hours of multimodal data recorded during these collaborative problem-solving sessions are available. This data resource includes initial coding of problem segmentation, problem-solving correctness, and representational content of students’ writing. A full description of this dataset is provided as part of this document, including detailed appendices. In addition, this paper on "Problem solving, Domain Expertise and Learning: Ground-truth Performance Results for Math Data Corpus" can be used to better understand the data. The dataset is available to participants in the Second International Workshop on Multimodal Learning Analytics data-driven grand challenge after they sign a collaborator agreement (see below).

Steps for MMLA workshop participants:

  1. Read and sign the collaborator agreement document. Email it to the organizer at oviatt@incaadesigns.org. Once it has been submitted and your agreement has been approved by a workshop organizer (allow 48 hrs.), you will receive an email with password information to access the dataset and the accompanying data description.
  2. Log in with your MMLA workshop participant password at http://mla.ucsd.edu/data
  3. Download and read the document describing the data
  4. Download data for your MMLA grand challenge analyses. The data is available at http://mla.ucsd.edu/data and comprises 12 multimodal data sessions. Each session contains four data types:
    1. Videos
    2. Audio
    3. Digital Pen Data
    4. Coding of Written Representation
    A detailed description of these data is available here.
  5. Use any tool of your choice to analyze the data
  6. ChronoViz: Additionally, participants can use the ChronoViz tool to visualize, navigate, and analyze the data. A zipped archive of a ChronoViz session (G#D#.annotation.zip) integrating all of the multimodal data and annotations will be available soon. A custom version of ChronoViz supporting this dataset is available here. Instructions on how to use ChronoViz are available on the ChronoViz web page here.
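
As a rough aid for checking a download against the session layout described above (six groups, each meeting twice, for 12 sessions), the expected ChronoViz archive names could be enumerated as in the sketch below. It assumes that in the G#D#.annotation.zip pattern, G# is a group number from 1 to 6 and D# is a meeting number from 1 to 2. That reading is not confirmed here, and the function name is hypothetical, so check the data description document for the actual naming.

```python
from itertools import product

# Counts taken from the dataset description: six groups, two meetings each.
GROUPS = range(1, 7)
MEETINGS = range(1, 3)


def expected_archives():
    """List archive names under the ASSUMED reading of G#D#.annotation.zip.

    Assumption: G# = group number (1-6), D# = meeting number (1-2).
    """
    return [f"G{g}D{d}.annotation.zip" for g, d in product(GROUPS, MEETINGS)]


if __name__ == "__main__":
    for name in expected_archives():
        print(name)
```

Comparing this list with the files actually downloaded is one simple way to spot a missing session, provided the naming assumption holds.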