Detection of Comprehension & Emotion from Real-time Video Capture of Facial Expressions

What if we could automatically detect how well someone is comprehending information presented on a computer screen, based on the emotions they display?

We set out to assess the effects of emotion on comprehension, using educational systems already in place at Stanford’s Virtual Human Interaction Laboratory. The four-part strategy was to: 1) track a learner’s face during a comprehension task; 2) capture facial expressions and categorize the learner’s emotions; 3) investigate how emotions affect comprehension; and 4) measure the effects of changing the learning system in real time, based on learner emotions.

We used machine learning algorithms to develop automated, real-time models that analyzed subjects’ emotional states based on facial expressions and physiological measurements. We built two types of models: models specific to each subject, and general models. Our approach predicted emotion type (amusement versus sadness) and intensity level, and its predictions were compared directly against trained coders’ assessments.
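The difference between person-specific and general models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data is synthetic, the two features and the subject names are hypothetical, and a nearest-centroid classifier stands in for whatever algorithms the project actually used, which this summary does not name. A person-specific model is trained and tested on one subject's own samples, while a general model is trained on the other subjects and tested on the held-out one.

```python
import random
from statistics import mean

def make_subject(rng, offset, n=40):
    """Synthetic (features, label) samples for one hypothetical subject."""
    samples = []
    for i in range(n):
        label = i % 2  # 0 = sadness, 1 = amusement (alternating)
        # Two made-up features; each subject has a personal baseline offset.
        base = [1.0, 0.5] if label == 1 else [-1.0, -0.5]
        x = [b + o + rng.gauss(0, 0.5) for b, o in zip(base, offset)]
        samples.append((x, label))
    return samples

def fit_centroids(samples):
    """Nearest-centroid 'training': average feature vector per label."""
    centroids = {}
    for label in {lab for _, lab in samples}:
        rows = [x for x, lab in samples if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)

def accuracy(centroids, samples):
    hits = [1.0 if predict(centroids, x) == lab else 0.0 for x, lab in samples]
    return mean(hits)

def split(samples):
    """First half for training, second half for testing."""
    half = len(samples) // 2
    return samples[:half], samples[half:]

def person_specific_accuracy(subjects):
    """Train and test within each subject, then average."""
    scores = []
    for samples in subjects.values():
        train, test = split(samples)
        scores.append(accuracy(fit_centroids(train), test))
    return mean(scores)

def general_accuracy(subjects):
    """Train on all other subjects, test on the held-out subject's test half."""
    scores = []
    for held_out, samples in subjects.items():
        train = [s for name, other in subjects.items()
                 if name != held_out for s in other]
        _, test = split(samples)
        scores.append(accuracy(fit_centroids(train), test))
    return mean(scores)

rng = random.Random(0)  # fixed seed so the synthetic data is repeatable
offsets = {"s1": [0.0, 0.0], "s2": [1.5, -1.0], "s3": [-1.5, 1.0]}
subjects = {name: make_subject(rng, off) for name, off in offsets.items()}

print("person-specific accuracy:", person_specific_accuracy(subjects))
print("general accuracy:", general_accuracy(subjects))
```

Both evaluations score the same held-out half of each subject's data, so the two numbers are comparable; only the training set differs. The per-subject offsets are the sketch's stand-in for individual differences in expression, the kind of variation a person-specific model can absorb and a general model cannot.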

The models fit the data well overall, with better performance for emotion categories than for emotion intensity. Amusement ratings were more accurate than sadness ratings. The full model, which used physiological measures as well as facial tracking, performed better than facial tracking alone, and our person-specific models performed better than the general models.

