Analyzing Student-Tutor Interaction Logs to Improve Educational Outcomes

 

Workshop W2 at ITS2004

 

When and where:  This full-day workshop takes place on August 30 in Maceió, Alagoas, Brazil. 

 

Schedule:

 

8:00 - 8:15:  Load talks onto computer

8:15 - 8:30:  Introduction (Beck)

8:30 - 9:00:  Some Useful Design Tactics for Mining ITS Data (Mostow)

9:00 - 9:30:  Bootstrapping Novice Data:  Semi-Automated Tutor Authoring Using Student Log Files (McLaren)

9:30 - 10:00: Lessons on Using ITS Data to Answer Educational Research Questions (Heiner)

10:00 - 10:30: Break/install talks onto computer

10:30 - 11:00: Discuss first 3 papers on tools and techniques to simplify the process of educational log analysis

11:00 - 11:45:  Invited Speaker:  Albert Corbett (talk on knowledge tracing)

11:45 - 2:00:  Lunch

2:00 - 2:30:  Distinguishing Qualitatively Different Kinds of Learning Using Log Files and Learning Curves (Koedinger)

2:30 - 3:00:  Inferring unobservable learning variables from students' help seeking behavior (Arroyo)

3:00 - 3:20:  Discuss 2 papers on Diagnosing students

3:20 - 3:50:  Pixed: An ITS that guides students with the help of learners' interaction log (Heraud)

3:50 - 4:20:  Break

4:20 - 4:50:  Using Association Rules to Guide a Search for Best Fitting Transfer Models of Student Learning (Freyberger)

4:50 - 5:10:  Discuss papers on better understanding the domain being taught

5:10 - 5:30:  Wrap up and next steps

5:30 - : Dinner

 

Workshop objectives:  The goal of this workshop is to better understand how and what we can learn from data recorded when students interact with educational software.  Several researchers have been working in these areas, largely independently of one another.  The time is ripe to exchange information about what we’ve learned.

 

There are five major objectives for this workshop:

  1. Learn about existing techniques and tools for storing and analyzing data.  Although there are many efforts in the ITS community to record and analyze tutorial logs, there is little agreement on good approaches for storing and analyzing such data.  Our goal is to create a list of “best practices” that others in the community can use, and to create a list of existing software that is helpful for analyzing such data.
  2. Discover new possibilities in what we can learn from log files.  Currently, researchers are frequently faced with a large quantity of data but are uncertain about what they can learn from it.  Looking at the data in the proper way can uncover a variety of information ranging from student motivation to the efficacy of tutorial actions.  An exchange of information about what we can model and which problems remain open would help anyone with large piles of data (an increasingly large segment of the ITS community).
  3. Share analysis techniques.  As data become more numerous, the analysis techniques change, and a straightforward pre- to post-test approach is not likely to be applicable.  Instead, issues such as variable numbers of trials per student, data that are not strictly independent, multiple possible causal factors, and the possibility of subtle sample biases are rather common and, unfortunately, make it easy to produce bogus results.  Presenting our work in this workshop provides an opportunity to exchange analysis techniques, and to vet our approaches among a community of active researchers in this area.
  4. Create sharable resources.  Currently the only way to test a theory about how students interact with educational software, or a theory about how to model such data, is to construct the software, gather a large number of students, and collect their interaction data.  Building data sets that are shared amongst the community will increase the number of people who can perform such research by reducing the start-up costs.  Increasing the number of researchers who can work on large dataset problems is important, since the people who best understand how to analyze such data are not necessarily the people with the skills and resources to build the systems to collect it.  Furthermore, meta-analyses of logs from multiple systems will allow researchers to uncover phenomena that occur more broadly than in one particular tutor.  In addition to shared data sets, creating and sharing tools to analyze gathered data are essential to rapid progress.
  5. Create higher-level, visual representations.  There are multiple possible consumers for data collected by educational software, including teachers, administrators, and researchers.  What are good abstractions of low-level information for each of these groups?  How should the information be presented?  For example, automatically grouping students by common instructional needs would be helpful for teachers, while administrators may be more concerned with learning gains at the school.

 

Target audience:  The workshop is applicable both to researchers who have managed to collect a large amount of data from a tutor and aren’t sure what to do with it, and those who are debating adding fine-grained logging to their tutor and want to learn about the “why” and “how” of the process. 

 

Scope of topics:  We are interested in papers or proposals for demonstrations that address any of the five objectives listed above.  The area of analyzing large datasets generated by educational software is fairly broad, and researchers have a variety of backgrounds.  Examples of welcome topics (not an exhaustive list) are machine learning, simulation, visualization, representations and mechanisms for storing data, novel things to learn, and innovative analysis techniques.  Demonstrations of software that is helpful in addressing the objectives listed above are also welcome.

 

Submissions:  There is no required format for submissions, but papers should be single spaced, use at least an 11-point font, and not be longer than 5,000 words (shorter submissions are fine).  We are also interested in poster submissions and statements of interest in participating in the workshop.  Posters can be submitted as papers (please label the submission as a poster), or as a poster with accompanying technical information.  Statements of interest are helpful both to get a head count of who is interested in the workshop and to better tailor papers and discussions to the interests of attendees.  Submissions should be emailed to joseph.beck@cmu.edu in either RTF or PDF format.  Papers accepted to appear in the workshop proceedings will need to follow formatting guidelines.

 

Workshop format:  The workshop will consist of presentations of research results and demonstrations of software tools.  Presentations will be grouped according to which objective is addressed, and there will be a summary discussion after each group of papers is presented. 

 

Important dates:   

  • Submissions due:  May 24, 2004
  • Acceptance notification:  June 10, 2004
  • Camera ready version due:  July 7, 2004
  • Workshop:  August 30, 2004

 

 

Organizing Committee: 

Joseph Beck (Chair) – Carnegie Mellon University

Ryan Baker – Carnegie Mellon University

Albert Corbett – Carnegie Mellon University

Judy Kay – University of Sydney

Diane Litman – University of Pittsburgh

Tanja Mitrovic – University of Canterbury

Steve Ritter – Carnegie Learning