Abstract
Why do some advertisements get shared more than others? Using scalable automated facial coding algorithms, the authors quantified the facial expressions of thousands of individuals in response to hundreds of video advertisements. Results suggest that not all emotions increase sharing and that the relationship between emotion and transmission is more complex than mere valence alone. Facial actions linked to positive emotions (e.g., smiles) were associated with increased sharing. Although some actions associated with negative emotion (e.g., lip depressor associated with sadness) were linked to decreased sharing, others associated with disgust (e.g., nose wrinkles) were linked to increased sharing.
Management Slant
Not all emotions increase sharing, and the relationship between emotion and transmission is more complex than mere valence alone.
Arousal seems to drive the relationship between emotions and sharing so that high-arousal emotions increase sharing and low-arousal emotions decrease it.
Facial expressions provide a valuable tool to predict and understand consumer behavior and can be measured in a scalable manner using an Internet-based framework.
INTRODUCTION
The value of earned media is undeniable. Rather than having to pay per impression, word of mouth and social shares are free, so more consumers sharing an advertisement or message means greater impact at lower costs. Consequently, across industries, and both online and offline, companies are working to harness the power of social sharing.
The challenge, however, is getting consumers to share. Some content goes viral, getting millions of shares, whereas other content barely gets shared at all.
Why are some advertisements shared more than others? And, by understanding the answer to this question, how can companies design advertisements that more likely will be shared?
The authors examined this question in the context of online video advertisements; specifically, the link between consumer facial expressions and sharing. The face plays a key role in emotional expression (Ekman and Rosenberg, 1997). People who are happy, for example, have markedly different facial expressions from those who are sad. Consequently, facial analysis provides an unobtrusive way of measuring emotional reactions that can be performed quickly, easily, and cheaply at scale. The authors designed a custom facial action detection algorithm and demonstrated that it can reliably predict emotional expression. Using this algorithm, the authors quantified the facial expressions of thousands of individuals in response to hundreds of video advertisements and tested which emotional expressions are most linked to sharing.
This research makes four key contributions. First, it helps managers boost sharing. By understanding why people share, content creators can design content that more likely will be shared.
Second, this research deepens understanding around drivers of sharing. The little work that has been done on facial expressions (Teixeira, Picard, and el Kaliouby, 2014; Teixeira, Wedel, and Pieters, 2012) has examined only one or two emotions at a time and has examined positive emotions almost exclusively. Thus, researchers know relatively little about whether negative emotions might increase the sharing of advertisements or inhibit it. By simultaneously investigating multiple emotions, both positive and negative, researchers can identify more precisely the effect of each on what people share.
Third, most previous approaches to facial coding have required participants to come into a laboratory and sit in front of a computer loaded with special hardware and software. This is somewhat restrictive, unnatural, and costly, and it limits the breadth of data that can be collected. By contrast, the current authors utilized an approach that requires no specialized hardware or software, just a webcam.
Finally, this approach is highly scalable. Studies of human behavior and emotion such as this one almost always are limited because of the prohibitive cost of manual video coding by expert human coders. Automated coding, however, enables collection of a large amount of data from participants across a range of countries in a quick, cost-effective manner. Given that expressions of emotions in response to mundane media content can be sparse and that large interpersonal variability exists in nonverbal behaviors, it is important to sample a broad number of people. Performing this type of analysis at scale is nontrivial and the current work shows that these automated techniques are valid methods for scientific research.
LITERATURE REVIEW
Social Sharing
Researchers have become increasingly interested in what drives people to talk and share (Berger, 2014). Some work suggests that impression management shapes what gets passed along (Packard and Wooten, 2013). People care about self-presentation, or how they look to others, so the better something makes them look, the more likely they are to share it. Consistent with this notion, people more likely will share their consumption experiences if they go well, rather than badly (Deangelis, Bonezzi, Peluso, Rucker, and Costabile, 2012), and news articles (Berger and Milkman, 2012) or brands (Berger and Schwartz, 2011) that are more interesting.
Other work suggests that emotion may increase sharing. When people experience emotions, they often turn to others to regulate those emotions and help them make sense of what they are feeling (Rimé, 2009). Talking with others can help people understand what they feel and why (Rimé, Mesquita, Philippot, and Boca, 1991) and can help them relive positive experiences and receive social support around negative ones. Consistent with this perspective, people report greater willingness to share urban legends that they feel evoke more emotion (Heath, Bell, and Sternberg, 2001), and news articles that contain more emotional words more likely will make the “most e-mailed” list (Berger and Milkman, 2012).
Properly measuring emotion, however, is challenging. First, although one can ask someone how they feel, self-reports are often inaccurate. People do not always have the best insight into their own emotional states. Someone may report feeling negative, for example, without knowing whether they feel angry or anxious. Because they rely on cognitive elaborations of experienced emotions, self-reports also have trouble picking up quick moment-to-moment emotional shifts over time. The mere act of introspection can alter emotional experience (Kassam and Mendes, 2013). Reporting or having to report how one is feeling can affect the physiological state and thus bias responses.
Second, although automated methods exist to extract emotion from text, this is less feasible with video. Recent advances in natural language processing and sentiment scoring have allowed researchers and practitioners to estimate the valence and emotionality of text (Berger, Humphreys, Ludwig, Moe, et al., 2020). The Linguistic Inquiry and Word Count application (Pennebaker, Mehl, and Niederhoffer, 2003), for example, estimates positivity and negativity by counting the number of positive and negative words in each document.
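The dictionary-counting idea behind such tools can be illustrated with a minimal Python sketch. The word lists below are tiny hypothetical stand-ins, not the actual Linguistic Inquiry and Word Count dictionaries, and the function name is an assumption made for illustration only.

```python
import re

# Hypothetical miniature word lists; real sentiment dictionaries are far larger.
POSITIVE_WORDS = {"good", "great", "happy", "love", "funny"}
NEGATIVE_WORDS = {"bad", "sad", "angry", "hate", "disgusting"}


def valence_counts(text):
    """Return the share of positive and negative words in a document."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {"positive": 0.0, "negative": 0.0}
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return {"positive": positive / total, "negative": negative / total}
```

A score of this kind describes the text itself, which is why it does not transfer readily to video content or to individual viewers' reactions.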
Although advances in computer vision have begun to allow automated recognition of images, inferring likely emotional reactions to advertisements is far more challenging. Systems may be able to recognize that an advertisement contains a dog, for instance, but that alone is not enough information to determine what emotions viewers will feel when watching that advertisement. Stimulus analysis assumes that all individuals will respond in the same way to an advertisement and does not account for the individual differences in how an advertisement will be appraised.
Facial Responses
To address these measurement challenges, the authors used facial responses. The face plays a key role in emotional expression (Ekman and Rosenberg, 1997) and can be an outward manifestation or signal of these often otherwise internal states. Decades of quantitative research have revealed reliable patterns in the ways in which emotions are expressed on the face. The Facial Action Coding System (FACS), for example, has found consistent facial behaviors associated with anger, fear, joy, surprise, pain, and deceit (Ekman and Rosenberg, 1997). Of the signals that communicate affective information—physiology, nonverbal behaviors, and brain activity—the face is one of the more easily interpreted.
As a result, facial analysis provides an unobtrusive method of passively measuring in-the-moment behavior with a set of basic emotional states linked to prototypic expressions (Ekman, Friesen, O’Sullivan, Chan, et al., 1987). When experiencing joy, for example, people tend to make facial expressions that differ from those of people experiencing disgust. These different expressions involve different movements of individual facial muscles, which can be coded as specific action units (AUs). Consistent with the value of this approach, marketing scientists recently have started to use facial coding as a measurement tool (Xiao and Ding, 2014). Researchers have examined the relationship between smiling and purchase intent (Teixeira et al., 2014), for example, and the link between positive emotion and engagement (Teixeira et al., 2012).
Examining facial expressions avoids some common challenges in studying expressions of emotion. Rather than relying on self-report, recording facial responses to advertisements allows an unobtrusive measurement of expressed emotion response that is not disturbed by cognitive elaboration or reflection. It is difficult for an individual to introspect on their own expressions of emotion and report these in real time. Rather than simply relying on aggregate measures, such as one advertisement being more positive than another, coding facial responses allows researchers to examine variation in individual responses—that is, whether one person reacts more positively than another to the same advertisement—and whether such variation is linked to sharing. Finally, the advent of automated systems that can accurately detect facial muscle movements (De la Torre and Cohn, 2011) allows for increased scalability, repeatability, and observation in naturalistic environments. Rather than examining a small set of laboratory stimuli, researchers can investigate how thousands of people naturally react to hundreds of advertisements and code these reactions in a consistent, reliable manner. Rather than studying a single emotion in isolation, as in most previous work, this approach allows researchers to examine how multiple emotions, experienced at various points in watching an advertisement, might impact sharing.
THE CURRENT RESEARCH
The authors measured a range of facial actions representing both positive and negative emotions to examine whether facial actions are linked to sharing and which actions are most linked to sharing.
The simplest way that emotion might relate to sharing is that any emotion increases transmission. As noted previously, people often share emotional experiences with others. Psychological research on the social sharing of emotion argues that 90 percent of emotional experiences are passed on (Rimé, 2009) to others. Consistent with the notion that emotionality increases sharing, movies and news articles that evoke more emotion more likely will be shared (Berger and Milkman, 2012; Luminet, Bouts, Delie, Manstead, and Rimé, 2000). If all emotions increase sharing, then videos that evoke any facial expressions should be shared more.
A second possibility is that emotional valence drives sharing. Emotions such as happiness are perceived as more positive, whereas others such as anger are perceived as more negative. Given that impression management shapes what people share, one could argue that people might avoid sharing negative things to avoid communicating negative identities (Berger, 2014; Tesser and Rosen, 1975). People prefer interacting with positive rather than negative others (Kamins, Folkes, and Perner, 1997), so consumers may share positive things to avoid seeming like a negative person or a “Debbie Downer.” Similarly, most people would prefer to put others in a good mood rather than a bad one. Research finds that positive news is more likely to be shared than negative news (Berger and Milkman, 2012; Tesser and Rosen, 1975). Overall, these perspectives suggest that content that evokes facial expressions linked to positive, rather than negative, emotions may be shared more.
The authors hypothesized that a third, more complex, possibility was more likely: specifically, that different emotions have different effects on sharing. In addition to valence, specific emotions differ on various other dimensions such as arousal or certainty (Lerner and Keltner, 2000; Smith and Ellsworth, 1985). This approach suggests that even though both disgust and sadness are negative emotions, they may have different effects on whether something is shared.
To test these possibilities, the authors used a new data collection framework to examine the naturalistic facial responses—i.e., viewers in their home environment—of thousands of people watching hundreds of videos. Computer algorithms (Senechal, McDuff, and el Kaliouby, 2015) built specifically for this task automatically coded facial responses. These algorithms leverage significant advances in computer science, specifically in the field of machine learning, which have enabled the accurate measurement of subtle facial expression in situ. Given the very challenging nature of coding AUs, computer algorithms are still not capable of accurately coding all actions. The authors therefore selected five AUs that typically are associated with expression of different emotions and for which the algorithms had a high degree of accuracy, precision, and recall. The authors examined whether facial expressions reliably predict sharing and, if so, which of five facial actions are most predictive. The current study is believed to be the largest of its kind using an online framework and the first to investigate the link between emotional responses to video advertisements and sharing.
The authors examined both positive and negative emotions. One may wonder whether advertisements actually would aim to elicit negative emotions; after all, if the goal is to increase consumer evaluations or encourage purchases, why would a brand ever want to associate itself with negativity?
A few points are worth noting. First, some advertisements seem to evoke negative emotions purposely. Advertisements for nonprofits or related to corporate social responsibility, for example, often are classified as heartfelt tearjerkers. Similarly, paper towel advertisements may show a disgusting mess to demonstrate how well the paper towels can clean it up. Although positive emotions certainly seem more prevalent, negative emotions do exist in advertisements; thus the authors examined their link to sharing.
Second, without intending to, some advertisements may induce negative emotion. One person might find an advertisement funny, whereas another finds it disgusting or possibly offensive. Someone might find an advertisement clever, whereas another finds it confusing. Thus, examining individual heterogeneity in facial responses provides valuable insight.
Third, cultures vary in “display rules,” or when and where it is acceptable to express certain emotions (Matsumoto, 1990; Tsai and Chentsova-Dutton, 2003). In some cultures, expressing strong emotion is supported, but in others, it is seen as inappropriate and discouraged (Matsumoto, Yoo, Fontaine, Anguas-Wong, et al., 2008). Consequently, one might expect that responses to advertisements also would vary across cultures. An exploratory analysis examines this possibility.
METHOD
Materials
The authors recorded emotional reactions to 230 video advertisements from a variety of product categories (e.g., instant foods, confectionery, pet care, and beauty products). The videos were between 20 and 120 seconds in length (mean duration = 27.3 seconds; SD = 8.65 seconds) and from five different countries: Germany (70 advertisements), the United States (60 advertisements), France (40 advertisements), the United Kingdom (40 advertisements), and China (20 advertisements). Participants viewed advertisements from their own country in their native language. The advertisements were all recent (aired in the past 10 years) and from major brands.
Participants
Participants (N = 2,106; mean age = 33.6 years; 51 percent male) were recruited from an online market research panel. Most participants completed the survey from home. The participants were compensated with the equivalent of $8.10 in their local currency. This sample size is an order of magnitude larger than most studies of nonverbal responses to advertisements. Given the variability in nonverbal behavior between individuals, it is important to consider large populations.
Procedure
Figure 1 summarizes the study’s online data collection framework. Participants were contacted via e-mail. They were told that they were taking part in a study to evaluate video advertisements. Participants simply clicked on a link and opted in through a browser-based survey. They were asked for consent to use their webcam to record while they took part in the study. Participants only needed an Internet connection and a webcam to take part. There was no requirement for specialized hardware or to download or install software. Consequently, their experiences while taking the survey were similar to watching online content during everyday life.
Each participant was shown a set of 10 of the 230 advertisements; video order within that set was randomized. After watching a given advertisement, they indicated their willingness to share (“If you watched this advertisement on a website such as YouTube, how likely would you be to share it with someone else?”) on a scale ranging from 1 (very unlikely) to 5 (very likely). Participants completed the study from home, and the average time to finish the survey, watch all 10 advertisements, and complete the questions was 26.3 minutes.
Automated Facial Coding
Automatically detecting spontaneous facial actions in everyday environments is challenging. Designing automated facial coding algorithms is difficult, and it is not possible to detect every single facial AU with the requisite performance in everyday settings because of the subtlety and variability in how they appear in videos. The authors therefore designed custom facial action detection algorithms focusing on five of the most commonly occurring, informative, and reliably detected actions: smiles, outer eyebrow raises, brow furrows, lip corner depressors, and nose wrinkles (See Figure 2 for examples). The selection of these actions in particular was based on the data collected; these actions were found to occur most frequently and capture a range of different expression responses to the content. Although anxiety is a common negative emotion more generally, it is not a common reaction to advertisements.
Smiles are defined as contractions of the zygomaticus major muscle, which pulls the lip corners toward the ears (AU 12; for details on smiles and other facial expressions, see Ekman, Friesen, and Hager, 2002). AU 12 often is associated with positive affect. Outer brow raises are defined as contractions of the frontalis (pars lateralis) muscle, which pulls the eyebrows upward (AU 2). AU 2 is often associated with surprise. Brow furrows are defined as contractions of the corrugator supercilii muscle, which pulls the eyebrows down and together to form vertical wrinkles on the inner brow (AU 4). Researchers commonly interpret AU 4 (brow knitting) as a signal of mental effort (Oster, 2017); confusion, worry, and concentration are specific affective states associated with it (Rozin and Cohen, 2003). Lip corner depressors are defined as contractions of the depressor anguli oris muscle, which pulls the lip corners down (AU 15). AU 15 often is associated with sadness. Nose wrinkles are defined as contractions of the levator labii superioris alaeque nasi muscle, which pulls the eyebrows down and the nose corners upward (AU 9). AU 9 often is associated with disgust (Kassam, 2010).
The automated software is designed to detect these facial actions. The software has two principal constituent components, both of which are created using supervised learning (McDuff, 2014). Supervised learning is a machine learning approach that involves training a mathematical model via a set of labeled examples. The first component involves face tracking: It identifies landmarks on the face (specific locations around the eyes, mouth, and nose) using a computer vision technique known as supervised descent. Once the face is located within each video frame, a region of interest is identified including the mouth, nose, eyes, and eyebrows. The second component analyzes how the texture of the face region of interest changes to identify which actions, such as a smile, are present; this component is implemented using machine learning classifiers known as support vector machines. The output is a probability score for each action. When no expression is present, these probabilities are all 0, and when one or more actions fire, the corresponding action probabilities will rise. The software computes these probabilities for every frame of video (14 times per second). As people may exhibit an expression for a short time and then return to a neutral expression, it will capture these dynamics with a resolution of approximately 70 milliseconds, thus avoiding problems with dial-based methods of moment-to-moment measurement where participants may forget to turn the dial for several seconds.
The algorithms also leverage an estimate of each subject’s neutral face. A moving time window is used to baseline estimates on the basis of the temporal dynamics of normal facial responses.
Automatic detectors were trained using example videos manually coded by expert human observers. Twenty coders were given training from the FACS manual (Ekman et al., 2002) and were certified by passing the FACS test. They were shown similar participant videos collected from a previous study and asked to code the presence of each of the five facial actions as defined by the FACS in each frame of the video. Agreement between the automated detectors and the expert human coders regarding the presence of a given action was high (frame-level κs = 0.74–0.92) and similar to agreement between humans (frame-level κs = 0.55–0.84).
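Frame-level agreement of this kind can be computed with a short routine. The sketch below assumes Cohen's κ for two raters and binary (present/absent) labels per frame; the text reports κ without naming the variant, so this is an illustration rather than the authors' exact procedure.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary (0/1) frame-level label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of frames on which both raters give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal rate of coding the action present.
    p_a = sum(labels_a) / n
    p_b = sum(labels_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Kappa is undefined when both raters use one identical label throughout;
        # this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)
```

Because facial actions are rare in most frames, raw percentage agreement would be inflated by the many frames in which nothing happens; κ corrects for that chance agreement.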
Using these expert-coded data, the detectors then were trained using supervised learning. The manually coded videos were partitioned into training (80,000 labeled video frames), validation (10,000 labeled video frames), and testing (900,000 labeled video frames) sets. The training set was a random sample of 4,000 unique individuals, and the validation set was a random sample of 2,500 unique individuals. The training and validation videos were collected during previous market research studies. Using the training set, the authors trained a two-class support vector machine with a Nyström-approximated radial basis function kernel (Senechal et al., 2015). Using the validation data, the authors optimized the number of samples used in the Nyström approximation (Ns = {200, 500, 1,000, 2,000}), the support vector machine cost parameter (C = {0.01, 0.1, …, 100}), and the radial basis function spread parameter (γ = {0.01, 0.1, …, 100}). More detail about implementation and validation of this approach is reported by Senechal et al. (2015). The most commonly observed expressions were smiles (5.95 percent), followed by lip depressors (3.38 percent), eyebrow furrows (3.31 percent), eyebrow raises (1.80 percent), and nose wrinkles (0.45 percent). Although these numbers might seem low, they are similar to normal levels of responses to everyday television content and similar to the base rates observed in other studies of facial responses to television advertisements (McDuff, Girard, and el Kaliouby, 2016a; Teixeira et al., 2014).
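The hyperparameter optimization described here amounts to an exhaustive search over candidate settings scored on the validation set. The following Python sketch shows that search pattern only; `validation_score` is a hypothetical callable standing in for "train a detector with these settings and evaluate it on the validation frames," and the function name is likewise an assumption.

```python
from itertools import product


def grid_search(n_samples_grid, cost_grid, gamma_grid, validation_score):
    """Return the best (Ns, C, gamma) triple and its validation score.

    validation_score is a hypothetical callable that trains a detector with
    the given settings and returns its score on the validation set.
    """
    best_params, best_score = None, float("-inf")
    # Try every combination of the three candidate lists.
    for n_samples, cost, gamma in product(n_samples_grid, cost_grid, gamma_grid):
        score = validation_score(n_samples, cost, gamma)
        if score > best_score:
            best_params, best_score = (n_samples, cost, gamma), score
    return best_params, best_score
```

With the values listed in the text, `n_samples_grid` would be `[200, 500, 1000, 2000]`; the cost and spread grids would span 0.01 to 100 as the authors specify.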
Model
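The authors used a linear mixed-effects model to examine the relationship between facial actions and sharing. This model captures the effect of facial actions on sharing, accounting for the fact that some advertisements more likely might be shared or that people from certain countries might have a higher propensity to share:

\[
\text{Sharing} = \beta_0 + \beta_1\,\text{Smile} + \beta_2\,\text{EyebrowRaise} + \beta_3\,\text{EyebrowFurrow} + \beta_4\,\text{LipDepressor} + \beta_5\,\text{NoseWrinkle} + Z_1 + Z_2 + Z_3 + E \tag{1}
\]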
Here, β0 is an intercept, and β1, β2, β3, β4, and β5 are the parameters that estimate the marginal linear effects of smiles, eyebrow raises, eyebrow furrows, lip depressors, and nose wrinkles, respectively, on sharing. The score for each of the actions was calculated by first thresholding the classifier output, at the threshold determined in the classifier validation process, and then calculating the fraction of frames in which the output was above the threshold (this can be described as the action “base rate”). Z1, Z2, and Z3 are parameters describing the variance in sharing that can be explained by the differences among subjects, advertisements, and countries, respectively. E is an error term. Modeling subject, advertisement, and country as random effects means that the authors were not interested in the specific effect of any one subject, advertisement, or country but rather wanted to account for the overall variability they exert on sharing. This allowed the authors to control for content-related factors unrelated to facial expressions that might impact sharing. Including interactions between facial actions did not improve model fit, so these were left out for the sake of simplicity.
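The action score described above (the base rate) is simple to compute from the per-frame classifier output. A minimal Python sketch follows; the function name is an assumption, and the threshold passed in would be the one determined during classifier validation, which the text does not report.

```python
def action_base_rate(frame_probabilities, threshold):
    """Fraction of frames whose classifier output exceeds the threshold."""
    if not frame_probabilities:
        raise ValueError("at least one frame is required")
    # Count frames in which the action is considered present.
    above = sum(p > threshold for p in frame_probabilities)
    return above / len(frame_probabilities)
```

For example, with a hypothetical threshold of 0.5, the four-frame sequence `[0.1, 0.9, 0.8, 0.2]` has a base rate of 0.5, because two of the four frames exceed the threshold.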
RESULTS
Results indicate that smiles (AU 12) were positively and most strongly associated with sharing (β = 1.45, SE = 0.03, p < 0.001; See Figure 3). A 30 percent increase in smiling is associated with a 10 percent increase in willingness to share.
Some negative emotions seemed to decrease sharing: Lip depressor (AU 15, associated with sadness) and brow furrow (AU 4, associated with confusion) were both negatively associated with sharing (β = –0.17, SE = 0.05, p < 0.01; and β = –0.18, SE = 0.05, p < 0.01, respectively).
Other negative emotions, however, seemed to increase sharing: Nose wrinkles (AU 9, often associated with disgust) were positively associated with sharing (β = 0.22, SE = 0.11, p < 0.05).
Effects Over Time
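One may wonder whether experiencing certain emotions early versus late would have different impacts. To test this, the authors also built a time-varying model. Advertisements vary in length and structure; as a result, modeling time as a continuous variable, even with a large number of observations, is challenging. Most advertisements, however, do have some high-level structure. The simplest model to describe this would be a three-act composition: setup (beginning), confrontation (middle), and resolution (end). The authors therefore segmented each advertisement into thirds (beginning, middle, and end) and measured the presence of facial action variables in each segment. The authors estimated the following model:

\[
\begin{aligned}
\text{Sharing} = \beta_0 &+ \beta_{11}\,\text{Smile} + \beta_{12}\,\text{Smile} \times \text{Time} \\
&+ \beta_{21}\,\text{EyebrowRaise} + \beta_{22}\,\text{EyebrowRaise} \times \text{Time} \\
&+ \beta_{31}\,\text{EyebrowFurrow} + \beta_{32}\,\text{EyebrowFurrow} \times \text{Time} \\
&+ \beta_{41}\,\text{LipDepressor} + \beta_{42}\,\text{LipDepressor} \times \text{Time} \\
&+ \beta_{51}\,\text{NoseWrinkle} + \beta_{52}\,\text{NoseWrinkle} \times \text{Time} \\
&+ Z_1 + Z_2 + Z_3 + E
\end{aligned}
\tag{2}
\]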
Here, β0 is an intercept, and β11, β21, β31, β41, and β51 are the parameters that estimate the marginal linear effects of smiles, eyebrow raises, eyebrow furrows, lip depressors, and nose wrinkles on sharing. β12, β22, β32, β42, and β52 are the parameters that estimate the time-dependent effect of each action on sharing. As in the previous model, the authors treated subject, advertisement, and country as random effects. Time is treated as a discrete variable in which the advertisements were divided into evenly sized temporal bins (n = 3) to capture the effect of expressions during the beginning, middle, and end of the advertisements. As before, adding interactions between the facial actions did not increase model fit.
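The segmentation into three evenly sized temporal bins can be sketched as follows. This is a minimal illustration in Python, assuming the per-frame classifier outputs for one viewing are available as a list and that the action score in each bin is the fraction of frames above a threshold; the function name and the handling of frame counts not divisible by three are assumptions.

```python
def base_rate_by_third(frame_probabilities, threshold):
    """Base rate of an action in the beginning, middle, and end of a video."""
    n = len(frame_probabilities)
    if n < 3:
        raise ValueError("need at least three frames")
    # Integer boundaries that split the frames into three near-equal bins.
    bounds = [0, n // 3, (2 * n) // 3, n]
    rates = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        segment = frame_probabilities[start:stop]
        rates.append(sum(p > threshold for p in segment) / len(segment))
    return rates
```

Binning by thirds of each video, rather than by absolute time, lets advertisements of different lengths be compared on the same beginning, middle, and end structure.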
Results indicate that the relationship between smiling (AU 12) and sharing increased over time (β = 0.05, SE = 0.02, p < 0.01). Although smiles at the beginning of the video were positively linked to sharing, consistent with the notion that the end of emotional experiences has a strong impact (Kahneman, Fredrickson, Schreiber, and Redelmeier, 1993), smiles at the end had an even more positive effect. Although directionally similar, time effects for other actions did not reach significance.
Role of Culture
Given that the Internet-based framework that the authors utilized allowed them to collect data across multiple countries, they also explored cultural variation. Building on previous work analyzing cultural difference in facial expressivity, the authors performed an exploratory analysis testing whether the relationship between facial expressions and sharing varied cross-culturally. They focused on smiles, as most previous work studying cultural differences has analyzed smiling (Girard and McDuff, 2017; McDuff et al., 2016a) and other emotional expressions are too infrequent for cross-cultural comparisons to be meaningful.
Smiles were always positively associated with sharing, but the exact magnitude varied cross-culturally. The results mirror cross-cultural differences in individualism and collectivism (Hofstede, 2001). The relationship between smiling and sharing was largest in the United States (β1 = 1.54) and the United Kingdom (β1 = 1.54), the two countries with the largest individualism indices (91 and 89, respectively), and smallest in China (β1 = 0.58), the country with the smallest individualism index (20). France (β1 = 1.50) and Germany (β1 = 1.29), which have intermediate individualism indices (71 and 67, respectively), also showed intermediate relationships between smiling and sharing.
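As a rough illustration of this pattern, the five reported coefficients can be set against the five individualism indices. The Python sketch below computes a plain Pearson correlation from the values quoted above; it is a descriptive check on five data points, not an analysis the authors report, and the variable and function names are assumptions.

```python
from math import sqrt

# Values as reported in the text: (individualism index, smile coefficient).
COUNTRIES = {
    "United States": (91, 1.54),
    "United Kingdom": (89, 1.54),
    "France": (71, 1.50),
    "Germany": (67, 1.29),
    "China": (20, 0.58),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


individualism = [pair[0] for pair in COUNTRIES.values()]
smile_beta = [pair[1] for pair in COUNTRIES.values()]
r = pearson_r(individualism, smile_beta)
print(f"Pearson r = {r:.2f}")
```

With only five countries, and with China accounting for most of the spread, a coefficient of this kind should be read as suggestive at best.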
The frequency of smiling (smile base rate) also varied between countries, but this does not explain the differences in magnitude of the effects described earlier. The smile base rate was largest in the United States (5.6 percent of video frames featured a smile), followed by Germany (4.5 percent), France (4.4 percent), the United Kingdom (4.3 percent), and China (1.5 percent).
Individuals in different countries saw different videos, and smiles may mean different things in different cultures, making it difficult to infer too much from these differences. That said, these disparities are suggestive and highlight that examining cross-cultural differences in emotion and sharing is a valuable direction for future work.
DISCUSSION
A large-scale investigation—i.e., thousands of participants and hundreds of pieces of content—suggests that facial responses are linked to sharing. Although smiles (AU 12) and nose wrinkles (AU 9) were associated with increased sharing, brow furrows (AU 4) and lip depressors (AU 15) were associated with decreased sharing.
These results suggest that not all emotions increase sharing. The fact that some facial actions associated with negative emotions increased sharing (i.e., AU 9, linked to disgust), whereas other facial actions associated with negative emotions decreased sharing (i.e., AU 4 and AU 15, linked to confusion and sadness) suggests that sharing is driven by more than mere valence alone.
Instead, results are more consistent with specific emotions and, potentially, arousal (Berger and Milkman, 2012). There is ambiguity around arousal associated with discrete emotion states, but emotions characterized by high arousal (e.g., joy) seem to be associated with increased sharing, whereas emotions characterized by low arousal (e.g., sadness) seem to decrease sharing. This suggests that advertisements that evoke high-arousal emotions (e.g., anger, anxiety, disgust, and surprise) may be shared more. Demonstrating these effects in an advertising context, and across a range of specific negative emotions, is part of the contribution of this research.
Ancillary results are also consistent with work on recency or end effects in emotional experiences (Kahneman et al., 1993), suggesting that latter parts of videos and other content may have more impact on sharing. Future work might examine this in text content, analyzing whether articles that evoke high-arousal emotions toward the end of the piece more likely will be shared.
Alternative Explanations
One could argue that, rather than reflecting sharing, participants’ sharing responses simply indicated their reactions to the videos. Three things cast doubt on this possibility. First, such an account has difficulty explaining why some negative emotions, such as disgust, seem to be positively linked to sharing, whereas others, such as sadness, seem to be negatively linked to sharing.
Second, results for liking are different from those for sharing (see Table 1). Participants also rated how much they liked each video (item: “How much did you like the advertisement you just watched?”) on a scale ranging from 1 (not at all) to 5 (very much). Running the same analyses predicting liking rather than sharing showed different effects (see Table 1). Facial actions commonly associated with positive emotions increased liking, and facial actions commonly associated with negative emotions decreased liking. Although nose wrinkles (often associated with disgust) increased sharing, for example, they decreased liking (β = -0.37, SE = 0.12, p < .05).
Third, the study’s main results persist even when controlling for liking. The fact that facial expressions have different relationships with liking and sharing casts doubt on the possibility that the sharing measure is merely picking up liking of the videos.
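To make the idea of "controlling for liking" concrete, the following minimal sketch fits an ordinary least squares model in which sharing is predicted from a facial action score and a liking rating together, so that the facial action coefficient reflects its association with sharing at a fixed level of liking. The model form, the variable names, and the toy numbers are assumptions for illustration only; they are synthetic, and the source does not specify the authors' exact estimation procedure.

```python
from typing import List, Sequence


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least squares coefficients via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination and partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Synthetic toy data (hypothetical, not from the study).
nose_wrinkle = [0.0, 1.0, 2.0, 3.0, 4.0]   # facial action score per viewer
liking = [3.0, 1.0, 4.0, 1.0, 5.0]         # 1-5 liking rating
sharing = [1 + 2 * a + 0.5 * l for a, l in zip(nose_wrinkle, liking)]

# Design matrix: intercept, facial action, and liking as a control.
X = [[1.0, a, l] for a, l in zip(nose_wrinkle, liking)]
intercept, b_action, b_liking = ols(X, sharing)
```

If the facial action coefficient stays clearly nonzero once liking is in the model, the sharing measure is capturing something beyond liking, which is the logic of the third point above.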
Implications
These findings offer an important methodological contribution for people interested in studying facial expressions. A simple online tool allows researchers to collect facial responses from a range of individuals in their natural environment. The fact that automatic coding is as reliable as individuals’ manual coding means that data can be collected and analyzed at scale. This opens up a range of avenues for further research.
Trained models produced with the approach used in this research have been shown to be robust to changes in ethnicity (McDuff et al., 2016a), gender (McDuff, Kodra, el Kaliouby, and LaFrance, 2017), and age (McDuff, 2017), making them widely applicable. The trained classifiers the authors used are the same as those used in the AFFDEX software development kit (McDuff et al., 2016b), which is available publicly. This allows other researchers to use the same system for future experiments.
The findings also have a number of practical implications. First, the results provide suggestions for boosting shares. Most advertisements already try to make people smile, but the current findings suggest that certain negative emotions, such as disgust, may boost transmission as well. Consequently, content creators need not avoid all negative emotions; in fact, some negative emotions may help content propagation (Berger and Milkman, 2012).
Second, the tools used here can be useful in advertisement design and copy testing more broadly. Rather than relying on evaluations of the advertisement as a whole, marketers can use moment-to-moment analysis of facial expressions to determine which particular components may be working as desired and which should be replaced. These tools allow companies to determine whether one character can be dropped, for example, or a certain scene shifted, without having to replace the whole piece of content. Although the authors applied these measures to sharing, they can be applied just as easily to evaluation and other outcomes.
Third, even once content has been created, these methods may be useful in determining resource allocation. Should more resources be put behind seeding and showing advertisement A or advertisement B? By estimating the likelihood of sharing, facial responses can help determine the likely impact of an advertisement. They also can help determine which advertisements might be better suited for television and which are better suited for social media on the basis of the relative expectation of sharing versus other downstream outcomes.
Limitations and Future Research
As with many methods, there are also limitations. Although specific facial movements often are tied to emotions, the link is not always simple. Both envy and disgust, for example, include nose wrinkles. Therefore, when applying these tools, researchers and practitioners should be careful to understand what different facial movements indicate.
Another question is how these results translate into sales. As is often the case in research on word of mouth, the outcome measure in this study was sharing intentions, and one could wonder whether intentions translate into actual sharing. Moreover, sharing an advertisement does not always lead to sales. That said, word-of-mouth intentions are a reasonably good proxy of actual sharing (Berger and Schwartz, 2011), and more word of mouth tends to increase sales (Godes and Mayzlin, 2009). Future work might test these relationships further.
Although the authors focused on video advertisements, future work may apply these methods to advertising that relies on still images. By combining eye-tracking measures with facial responses, one may gain insight into which features of an advertisement are generating which reactions. Similar approaches may even be applied to text advertisements.
In conclusion, facial expressions provide a valuable tool to predict and understand consumer behavior. The authors hope that this work will encourage more researchers and practitioners to utilize emerging tools in this area.
ABOUT THE AUTHORS
Daniel McDuff is a principal researcher at Microsoft, where he is working on scalable artificial intelligence tools for understanding human behavior, health, and well-being. He completed his PhD at the MIT Media Lab, where his research helped spawn a new field of imaging-based physiological measurement. Previously, McDuff was director of research at Media Lab spinout Affectiva, where he led analysis of the largest facial expression dataset in the world. His work has received awards from Popular Science, SXSW, The Webby Awards, and ESOMAR and has been reported in The Times, The New York Times, The Wall Street Journal, BBC News, Scientific American and Forbes.
Jonah Berger is a marketing professor at the Wharton School of the University of Pennsylvania and an expert on word of mouth, natural-language processing and how products, ideas, and behaviors catch on. He has published over 50 articles in academic journals such as the Journal of Consumer Research, Journal of Marketing Research, Journal of Marketing, and Proceedings of the National Academy of Sciences. Popular accounts of his work often appear in places such as The New York Times, The Wall Street Journal, and Harvard Business Review. Berger is the author of books including Simon and Schuster’s Contagious (2013), Invisible Influence (2016), and The Catalyst: How to Change Anyone’s Mind (2020).
ACKNOWLEDGMENT
This work benefited from helpful conversations with Kristen Lindquist, an expert in affective neuroscience at the University of North Carolina, Chapel Hill, and Jordan Etkin, a behavioral specialist at Duke University’s Fuqua School of Business.
- Received April 14, 2020.
- Received (in revised form) June 10, 2020.
- Accepted July 31, 2020.
- Copyright © 2020 ARF. All rights reserved.
REFERENCES