About face: The shift away from facial coding technology


Our facial expressions are often the most obvious, observable, and readily identifiable signals of our emotions. But interpreting facial expressions isn’t as simple as smile = happy and frown = sad. 

The challenge of measuring human emotions

Emotions are discrete responses to salient, motivationally important external or internal events. They comprise subjective, self-reported experiences, neurological and physiological changes in the body, and behavioral reactions.  

The quest to measure our emotions is not a new one. The idea that humans are motivated by both passion and reason has captivated philosophers, artists, and scientists for centuries. More recently, the importance of measuring emotional responses in marketing has led to increasing interest in tools that can assess the emotional impact of advertising.  

But measuring emotions remains challenging, even with the advent of advanced AI: our emotions are largely non-conscious, and measurement requires tools that go beyond the surveys traditionally used to assess advertising communication.  This need coincided with a growing appreciation of the non-conscious, or “System 1,” impact of marketing and the introduction of neuroscience and psychology-based techniques into market research.  

Facial coding technology limitations draw scrutiny

Over the years, facial coding (the often AI-based evaluation of facial expressions in response to stimuli) became a popular tool for measuring emotional responses to advertising. The promise of this technique was that it could capture non-conscious emotional responses better than asking people directly, while being easier to implement than some brain measures of emotional response.

However, a growing body of academic evidence shows that human facial expressions typically do not accurately reveal our emotions [1, 2, 3]. A recent review of over 1,000 scientific studies concluded that the way people communicate emotions such as anger, fear, happiness, and disgust varies substantially across cultures, situations, and even across individuals experiencing an identical situation.

Someone might scowl, frown, or even laugh when they’re expressing anger. And a scowl can express anger, confusion, or concentration. This variation makes it difficult to accurately decipher an internal emotion from a facial movement, and these findings have implications for any facial coding software (sometimes called “Emotion AI” systems) that claims to determine someone’s emotions by tracking their facial movements alone.

At NielsenIQ BASES, we have advised our clients about the limitations of facial coding, employing it in our ad testing exclusively as a complementary, “add-on” diagnostic: a descriptive tool used in tandem with other System 1 measurements, like EEG (a diagnostic used to evaluate electrical activity in the brain). But following our recent completion of a multi-year, multi-country evaluation of facial coding, we have concluded that the methodology shows little correlation with what people feel, and even less correlation with what they do.

BASES analysis reveals weaknesses of facial coding systems

In what we believe is one of the largest and most comprehensive analyses ever conducted on these systems, we analyzed facial expression data from over 2,000 video ad tests across 15 countries. We found that it’s difficult for these systems to measure facial expressions with accuracy, let alone use that information to infer an emotional state. Our own research shows that in video ad testing, respondents simply don’t make many facial expressions, and any expression they do make varies by individual.  

The evaluation of these systems revealed that they often disagree with each other and suffer from poor test–retest reliability. Using a very high threshold for what constitutes a facial expression, we found that most video ads elicit a measurable and reliable facial expression for only a small fraction of their duration, for example 1 or 2 seconds of a 30-second ad. In other words, consumers’ faces are largely expressionless during most seconds of a video ad.
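To put the "1 or 2 seconds of a 30-second ad" figure in proportion, here is a minimal Python sketch. The per-second scores, the 0.8 threshold, and the function name expressive_fraction are illustrative assumptions, not BASES data or any vendor's scoring method.

```python
def expressive_fraction(scores, threshold):
    """Share of seconds whose expression score meets or exceeds the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Hypothetical per-second expression scores for a 30-second ad
# (made up for illustration; not measured data).
scores = [0.05] * 30
scores[12] = 0.92
scores[13] = 0.88

fraction = expressive_fraction(scores, threshold=0.8)
print(f"{fraction:.1%} of the ad shows a measurable expression")
```

With two qualifying seconds out of 30, the share is about 6.7%; with one, about 3.3%.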

Within academia, there is a direct brain measurement of emotional motivation that has been extensively validated by the top universities in the world. We parallel-tested facial emotion software against this well-established, direct brain measurement of emotional motivation. We found virtually no correlation (r = 0.07) between external facial expressions and internal emotional motivation measured directly from the brain.

Finally, we found that facial coding has little predictive power when it comes to business outcomes. In an in-market validation study including 100+ ads across six countries, we found weak or no correlations between facial expressions and ad-driven sales (r = 0.18 for positive facial expressions; r = -0.02 for negative facial expressions; r = 0.04 for expressions of surprise). This weak relationship between facial coding data and business outcomes has led us to question the usefulness of this technology for emotion detection in general, and for video ad testing in particular. 
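As a rough guide to how weak these correlations are, squaring r gives the share of variance in one measure that a linear relationship with the other would account for. The minimal Python sketch below uses only the coefficients reported above; the helper name variance_explained is an assumption made for illustration.

```python
# Correlations between facial coding and ad-driven sales, as reported in the article.
reported_r = {
    "positive expressions": 0.18,
    "negative expressions": -0.02,
    "expressions of surprise": 0.04,
}


def variance_explained(r):
    """Return r squared: the share of variance a linear relationship accounts for."""
    return r * r


shares = {label: variance_explained(r) for label, r in reported_r.items()}

for label, share in shares.items():
    print(f"{label}: r = {reported_r[label]:+.2f}, r^2 = {share:.2%}")
```

On this reading, even the strongest of the three correlations (r = 0.18) corresponds to roughly 3% of the variance in ad-driven sales.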

The future of facial coding in business

NielsenIQ isn’t the first company to reach this conclusion about facial coding.  Earlier this year, Microsoft announced that it had retired its facial analysis capabilities, noting the inability of the technology to generalize the link between facial expression and emotional state across use cases, regions, and demographics. Google has its facial coding technology under review for similar reasons. 

Most recently, the U.K. Information Commissioner’s Office warned companies to steer clear of immature “emotional analysis” technologies like facial coding because of the “pseudoscientific” nature of the tools. It cautioned that companies should not make meaningful decisions based on technology that is not backed by science.

What do these findings mean for the future of facial coding? For NielsenIQ BASES, it means we have joined the growing number of companies that have retired this technology. We will no longer employ it within our ad testing toolkit, or in any of our solutions. 

Instead, we will continue to focus on measurement tools that are backed by decades of scientific consensus, in particular EEG. Through our extensive R&D, we have found that EEG is the most reliable, valid, and scalable way to measure emotion for business applications. Crucially, we have found a strong correlation between our EEG measures and ad-driven sales, across multiple client-led studies.  

Businesses are increasingly recognizing the impact that measuring emotion can have on their bottom lines. To those tasked with selecting a research vendor, we join other organizations in offering a word of caution: many of the available methods that profess to measure emotion, facial coding among them, do not deliver. Ensure the tools you employ are not only adequate for the job but also capable of delivering meaningful data that will lead to better business decisions.
