ABSTRACT
Attracting attention is a common goal for advertisers, but there is limited knowledge about how best to measure attention. Measuring attention to advertising is a complex task because there are different types of attention, tapped by different measures, that likely are more or less sensitive to varied attention-getting creative devices. This study examines how scalable biometric measures—eye tracking, skin conductance, and heart rate—respond to 10 creative devices executed across more than 100 television advertisements with known in-market sales-effectiveness results. The study documents the relationship of different attention measures with level of attention and type of creative device.
MANAGEMENT SLANT
Advertisers have multiple measures of attention at their disposal but lack evidence for which measure is most appropriate for specific conditions, including creative executions.
Comparing attention measures demonstrates that the measures respond differently to different levels of attention and attention-getting creative devices.
Multiple measures of attention are necessary to diagnose when attention-getting devices successfully capture attention or adversely reduce attention.
Sales-ineffective advertisements systematically attract lower levels of attention than sales-effective advertisements, as measured by heart rate.
INTRODUCTION
Consumers are adept at screening out advertising, which means that advertisers cannot underestimate the challenge of getting noticed. Research has linked low attention to television advertising (measured by heart rate) with low in-market sales response (Bellman, Nenycz-Thiel, Kennedy, Larguinat, et al., 2017), which signals that biometric measures of attention can help advertisers weed out the most ineffective television advertisements.
Factors other than low attention obviously can contribute to poor advertising performance (e.g., poor branding). Given the evolving media landscape, however, attention to advertising is becoming scarcer, and the cost of being noticed consequently is increasing (Teixeira, 2015). Not naive to these macroenvironmental changes, many marketers set attention-related goals for their advertising, and many advertisers copy test to check whether their executions grab attention before going to market.
Attention to advertising—the ability to focus on advertising and also suppress attention to other things in the environment—can occur through a variety of attention processes. Attention can vary in terms of being volitional or nonvolitional, sustained or divided, and high or low. Sustained high attention intuitively seems desirable and is the easiest to measure. At high levels of attention, viewers are very involved, outwardly express their emotions, and think about advertisements (Petty, Cacioppo, and Schumann, 1983). High attention can be measured in different ways, such as by collecting self-report measures (Laczniak, Muehling, and Grossbart, 1989) or by observing psychophysiological responses (MacInnis, Moorman, and Jaworski, 1991).
Most advertising, however, particularly television advertising, is watched with low levels of attention (Krugman, Cameron, and White, 1995). Even at low attention, viewers continue to respond automatically to some video content, which allows them to follow a program's plot (Lang, 2000) and develop simple like–dislike responses to the advertisements (Petty et al., 1983). Measuring low-level attention responses is more difficult, however, and so dictates different measures than those for high attention.
Low- or passive-attention responses, which occur mainly in the brain, can be observed with images of brain activity from functional magnetic resonance imaging or electroencephalogram (EEG) recordings. Such studies are prohibitively expensive for most advertising researchers and so rarely have been applied to advertising, especially linked to sales response (Venkatraman, Dimoka, Pavlou, Vo, et al., 2015). The few studies that have been conducted had prohibitively small participant sample sizes (Button, Ioannidis, Mokrysz, Nosek, et al., 2013).
Advertisers need scalable and reliable measures of attention to support their decision making. This study investigates the diagnostic potential of a range of scalable biometric measures of attention, including viewers' eye movements, sweating, and heartbeat. These measures are widely available, relatively unobtrusive to collect (e.g., by webcam), and inexpensive, so they can be collected with large sample sizes.
This article aims to show that the attention goals set by advertisers—and associated measurement—are complicated by the fact that there are different types of attention. The authors argue that different measures are needed to detect when any type of attention is being paid; for this reason, no one measure of attention is enough. One further complication is that different attention measures likely respond differently to the various creative devices that advertisers use to attract attention to their advertisements (e.g., humor, animated characters).
It is necessary to link advertising measures—including biometrics—to content in order to determine the specific characteristics people are responding to. Only then can researchers learn how to improve advertising performance through execution, which is something that marketers actively control. This research hence seeks to confirm whether different attention measures are more or less diagnostic of the effectiveness of varied attention-getting creative devices.
In this study, the authors reanalyze data from previous work (Bellman et al., 2017) to answer the overarching research question: How do different biometric measures of attention respond to the presence of a range of key attention-getting creative devices, such as visual branding or product demonstrations? Attention data were collected on a large sample of television advertisements that mostly were for familiar, high-penetration consumer-goods brands that are sold in supermarkets. Advertising for such low-risk products well may attract low levels of attention.
The specific biometric measures collected for this study were as follows:
Eye tracking. The number and duration of fixations indicate focused attention on external stimuli, whereas blink rate and duration indicate attentional avoidance.
Skin conductance. An increase in sweating indicates arousal as well as orienting attention responses to external stimuli.
Heart rate. A decrease in heart rate—or an increase in the time between heartbeats—indicates orienting attention responses to external stimuli.
ATTENTION AND ADVERTISING SUCCESS
Measuring Attention
Attention is recognized as an antecedent or gatekeeper to other mental processes (Rossiter and Percy, 2017). Advertising therefore must attract sufficient attention to open the gate and influence viewers' memory structures to raise the mental availability of a brand. Measuring attention to advertising, however, is not so simple. Past research has identified a number of attention processes leading to different components, varieties, or types of attention (Cohen, Sparling-Cohen, and O'Donnell, 1993; Posner and Boies, 1971).
Two recurring types of attention across psychology and advertising literatures are top down (endogenous or volitional attention) and bottom up (exogenous or automatic attention; Shaw and Bagozzi, 2018). These mechanisms distinguish the degree of control a person has when directing his or her attention, stemming primarily from deliberative internal goals (e.g., seeking brand information) or from features of a stimulus (e.g., being startled by a sudden noise), respectively. Attention also can be switching or divided, which is of great interest to advertisers as multiscreening becomes more prevalent (Brasel and Gips, 2017; Segijn, Voorveld, Vandeberg, and Smit, 2017). Multiscreening—as but one form of distraction in the media environment—can interfere with viewers' capacity to notice and process advertising (Lang and Chrzan, 2015).
In addition to types of attention, there are levels of attention. Attention can be high or low (Heath, 2007) and partial or full (Teixeira, 2015). Television viewing often occurs at low attention because other activities effectively compete for attention (Krugman et al., 1995). One recent study identified that fewer than 5 percent of television looks last for more than 30 seconds (Brasel and Gips, 2017).
The natural inclination in advertising is to assume (inappropriately) that high attention—which is not the norm—equates to improved advertising effectiveness (Heath and Hyder, 2005). Some studies have found a positive correlation between (visual) attention and advertising effects (Maughan, Gutnikov, and Stevens, 2007). Others, however, have shown that advertisements do work at low attention (Heath, Nairn, and Bottomley, 2009).
One influential article referred to four levels of attention (ordered from low to high):
preattention (or inattention),
focal attention,
comprehension, and
elaboration (Greenwald and Leavitt, 1984).
The highest level, elaboration, is measured by thought listings and thought confidence (Petty, Briñol, and Tormala, 2002), but such high attention is very unlikely for most advertising. The second highest level of attention, comprehension, can be measured by message recognition (Lang, Gao, Potter, Lee, et al., 2015). The difference between low attention (focal attention) and preattention (inattention) is marked by the presence of orienting responses, such as skin-conductance responses (Benedek and Kaernbach, 2010) or heart-rate decelerations (Lang, 1994).
Because attention occurs at different levels, and different measures are needed to assess each level, a number of attention measures, both self-reported and psychophysiological, could be useful. Psychophysiological measures also should be particularly useful for identifying what in particular about an advertisement is attracting different levels of attention.
Attention to visual advertising (e.g., print, television, online) often is measured with eye tracking, which commonly is quantified as the amount of time that one's eyes fixate on advertising (Venkatraman et al., 2015). Longer fixations are considered a good indicator of focused or high attention. Blinking is an alternative measure, whereby people likely will keep their eyes open more when they want to watch something (Campagne, Pebayle, and Muzet, 2005). Shorter fixation times and higher blink rates (or longer blink duration) both can signal low attention, because viewers are bored, tired, or actively avoiding the content. Relying solely on eye tracking can be problematic when people are daydreaming. In that case, they are looking (i.e., eyes open, long fixations), but they are not attending or responding to the specific stimulus.
Other research highlights the importance of psychophysiological arousal and the amount of cognitive resources allocated to encoding, storing, and retrieving information (Lang, 2006). The most common measure of psychophysiological arousal is skin conductance, a biometric measure of sweating that is associated with sympathetic nervous-system activation (Bailey, 2017). High attention is characterized by high arousal (i.e., increased sweating) and so is observable as a high skin-conductance level (Potter and Bolls, 2012). Low attention can be measured by a reduction in skin-conductance level, such as when viewers disengage their attention during advertising breaks (Bellman, Treleaven-Hassard, Robinson, Rask, et al., 2012; Potter, 2009).
Even at low levels of attention measured by skin-conductance level, however, certain stimuli, such as snakes and spiders, still may attract automatic orienting or reflexive responses (Öhman, Flykt, and Esteves, 2001). These rapid-phasic skin-conductance responses show up as peaks or waves on top of the concurrent longer term tonic skin-conductance level (Benedek and Kaernbach, 2010).
Another psychophysiological measure of automatic attention responses is heart rate (Lacey, 1967; Lang, 1994). Heart-rate deceleration, or increasing time between heartbeats, indicates quieting down to increase attention and facilitate encoding (Wise, 2017). Interbeat interval provides a useful measure of reactive phasic heart-rate response at both high and low levels of psychophysiological arousal. Over prolonged periods, a lack of tonic heart-rate decelerations can be indicative of low attention and even inattention.
Attention-Getting Creative Devices
By comparing multiple attention measures, the researchers aimed mainly to observe how these measures track attention to attention-getting creative devices. There is a rich body of research that covers literally hundreds of creative devices that advertisers can use. One study reviewed empirical evidence for almost 200 creative strategies and tactics (Armstrong, 2010). Many studies have related creative devices to either intermediate or behavioral advertising outcomes, such as recall, brand linkage, likability, and sales, but few have related creative devices to processing measures of attention.
The authors compiled a list of key advertising creative devices likely to attract attention from a comprehensive literature review. They identified 10 devices for investigation (See Table 1), because they had sufficient numbers of observations across the dataset of advertisements for those devices (discussed further in the Methodology section).
The authors also present a selection of studies that have found relationships between the inclusion of these creative devices in advertising and process or outcome measures (See Table 1). Humor, for example, has been studied across a range of contexts and is noted for its ability to attract (most often self-reported) attention to advertising. The authors therefore anticipated that inclusion of these creative devices would manifest in some form of attention.
Practically all creative devices included in this study descend from a codebook developed to analyze the content of television advertisements (Stewart and Furse, 1986). This seminal work confirmed the reliability of the codebook for almost 160 creative devices and associated these with multiple effectiveness measures. Many studies since have referred to the codebook to investigate how advertising creatively works (e.g., Armstrong, 2010; Bellman, Schweda, and Varan, 2012; Phillips and Stanton, 2004; Stanton and Burke, 1998; Stewart and Koslow, 1989). A recent study applied the codebook to a large sample of present-day television advertisements, finding that the codes continue to describe advertising executions adequately across time (Hartnett, Kennedy, Sharp, and Greenacre, 2016).
The authors expected that the different measures of attention might respond with greater sensitivity to different attention-getting creative devices. Visual presentation of the brand name, for example, might trigger recognition of the advertiser's persuasive intent (Campbell, 1995), which could trigger disengagement, measured by skin-conductance level (Bellman, Treleaven-Hassard, et al., 2012; Potter, 2009). Distinctive assets, which are sometimes integral to the storyline—such as with characters (e.g., Kellogg's Tony the Tiger)—might receive comparatively greater visual attention (Hartnett, Romaniuk, and Kennedy, 2016), measured by eye tracking. Because the prior literature was not clear enough to allow the authors to hypothesize which or how many attention measures would detect which attention-getting creative devices, they investigated the following exploratory research question:
RQ: Which attention measures are most sensitive to different attention-getting creative devices?
The authors did not expect that consumers would respond to all occurrences of attention-getting creative devices. If two devices are present simultaneously in different parts of the screen (e.g., visual branding and product demonstration), viewers may look only at one device. If two attention-getting devices follow each other in rapid succession, skin-conductance response may be too slow for viewers to respond to the second device (Benedek and Kaernbach, 2010), whereas heart rate, determined by interbeat interval, can change from one interval to the next (Lang, 1994).
Amount and Complexity of Advertising Messages
A major contention of the low-attention-processing theory (Heath, 2007) and the limited capacity model of motivated media message processing (Lang, 2000) is that processing of video content is mainly bottom up and stimulus driven. Video advertising is stimulus rich. Many new stimuli can be introduced across scene changes, with great diversity in audio and visual tactics.
Prior research has found that when more resources are consumed by information introduced in television content, fewer resources are available for a secondary task, so, for example, button-pressing response time increases (Lang, Kurita, Gao, and Rubenking, 2013). Other work found that cognitive resources similarly are absorbed by video and audio complexity (Lang et al., 2015; Lee and Lang, 2015). The authors expected that when more information was introduced in an advertisement or when the audiovisual complexity of that information increased across scenes, this would demand more of a viewer's available processing resources. To accurately assess the relationships between the attention-getting creative devices and attention measures, the authors needed to control for the amount of information introduced and audiovisual complexity.
METHODOLOGY
Data Collection
The data were collected in a lab with 1,040 consumers, for 118 advertisements. Of these advertisements, 109 had single-source sales-index data collected in-market as an overall measure of effectiveness (see Bellman et al., 2017, for a description of the dataset and single-source sales index). The sample of consumers was drawn from MediaScience's audience panel according to demographic criteria designed to reflect category users for the relevant categories—51 percent women, ages 18 to 83 years, across a broad range of occupations.
Each participant watched eight of the 118 advertisements, which were shown in randomized order among other filler advertisements presented as advertising breaks throughout one of three half-hour television programs. Participants sat in individual NeuroQube® stations (at MediaScience, Austin, Texas) and watched the content on a large computer screen. A NeuroQube is a portable computer desk with a large television-like computer screen, a small smartphone screen, and a medium-sized tablet screen, also used for answering questionnaires. The computer in the NeuroQube runs content on the screens and stores time-locked data from multiple biometric and facial-tracking sensors.
Dependent Variables: Attention and Traditional Measures
Eye Tracking. The authors used unobtrusive infrared technology to track locations where participants looked at the screen (number of fixations) and for how long (sum of fixation duration, measured in seconds). The authors also used high-definition camera recordings of facial expression to detect when a participant's eyes were open or closed to measure blink rate and duration.
Biometrics. Electrodes were attached to the first three (index to ring) fingers on the participant's nondominant hand. Two of these electrodes measured skin conductance, indicated by an increase in the conductivity of electricity between the two electrodes as a result of rapid increases in sweating (Potter and Bolls, 2012). The authors processed skin-conductance data using Matlab to partition out the orienting, fast-moving phasic component from the slow-moving tonic component (Benedek and Kaernbach, 2010).
The authors used peaks in the phasic component to count the presence of skin-conductance responses (coded as 1) in each second, averaged across viewers of the advertisement. To control for individual differences, they calculated overall tonic skin-conductance level of arousal as a percentage change from a moving baseline, which was the average measured across the last five seconds before the advertisement began (Potter and Bolls, 2012).
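The baseline correction described above can be illustrated with a minimal Python sketch. It assumes one skin-conductance sample per second. The function name, the variable names, and the microsiemens values are hypothetical and are not taken from the study's Matlab processing.

```python
from statistics import mean


def percent_change_from_baseline(pre_ad, during_ad, baseline_seconds=5):
    """Express each during-ad sample as a percentage change from the mean
    of the last `baseline_seconds` samples recorded before the ad began."""
    if len(pre_ad) < baseline_seconds:
        raise ValueError("not enough pre-advertisement samples for the baseline")
    baseline = mean(pre_ad[-baseline_seconds:])
    if baseline == 0:
        raise ValueError("baseline is zero; percentage change is undefined")
    return [100.0 * (x - baseline) / baseline for x in during_ad]


# Hypothetical tonic skin-conductance levels (microsiemens), one per second.
pre_ad_scl = [4.2, 4.1, 4.0, 4.0, 3.9, 4.0, 4.1]  # only the last five are used
ad_scl = [4.0, 4.2, 3.8]
changes = percent_change_from_baseline(pre_ad_scl, ad_scl)
print([round(c, 1) for c in changes])
```

Because the baseline is recomputed from the seconds immediately preceding each advertisement, it moves with the viewer's current arousal level rather than being fixed for the whole session.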
The third electrode used photoplethysmography to detect pulse and therefore heart rate. A decrease in heart rate corresponds to a longer interbeat interval, that is, a longer time between heartbeats. Longer interbeat intervals (in milliseconds) were used as the phasic heart-rate measure of attention. To control for individual differences, the authors calculated interbeat intervals as the percentage change from a baseline, which was the average measured over a two-minute period while participants watched a relaxing Zen video before seeing the program.
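The inverse relationship between heart rate and interbeat interval can be shown with a short Python sketch. The conversion (60,000 milliseconds per minute divided by beats per minute) is standard arithmetic. The function names and the example heart rates are hypothetical, and a single baseline value stands in for the two-minute baseline average described above.

```python
def interbeat_interval_ms(heart_rate_bpm):
    """Convert heart rate (beats per minute) to mean interbeat interval (ms)."""
    if heart_rate_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return 60_000.0 / heart_rate_bpm


def ibi_percent_change(heart_rate_bpm, baseline_bpm):
    """Percentage change in interbeat interval relative to a baseline."""
    baseline_ibi = interbeat_interval_ms(baseline_bpm)
    return 100.0 * (interbeat_interval_ms(heart_rate_bpm) - baseline_ibi) / baseline_ibi


# Hypothetical viewer: 75 bpm at baseline, slowing to 60 bpm during an ad.
baseline_ibi = interbeat_interval_ms(75)  # 800 ms between beats
ad_ibi = interbeat_interval_ms(60)        # 1000 ms between beats
change = ibi_percent_change(60, 75)       # heart rate fell, so the interval rose
print(baseline_ibi, ad_ibi, change)
```

A positive percentage change therefore indicates heart-rate deceleration, the direction the text associates with orienting attention.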
Traditional Measures. It is recommended that researchers collect traditional measures alongside psychophysiological measures to validate and triangulate the data (Varan, Lang, Barwise, Weber, et al., 2015). Brand recall is a traditional measure of attention and advertisement processing, when it results in the ability to retrieve memories of advertised brands. To measure long-term memory, the authors collected two brand-recall measures (unaided and aided recall) with a questionnaire administered 15 minutes after exposure to the last test advertisement (Eysenck, 1976).
For unaided brand recall, participants were asked the open-ended question, “Please list the brands you remember. Separate each brand with a comma. If you do not clearly remember any advertisements, please feel free to guess.” Correct recall (misspellings allowed) of each brand was coded as 1; other responses were coded as 0.
For aided brand recall, participants were shown three still images from each advertisement with the branding removed and were asked to recall the brand advertised. Again, correct recall of each brand was coded as 1, and other responses were coded as 0. Advertisement liking also was measured. Participants were shown three still images from the advertisement, this time with branding, and were asked to rate how much they liked the advertisement on a validated, single-item 6-point scale (Bergkvist and Rossiter, 2007).
Independent and Control Variables
The attention-getting creative devices represented the independent variables. Definitions for all creative devices, as well as for information introduced and audiovisual complexity, are described (See the Appendix). Some of the codes were replicated directly from originating works or adapted to reflect evolved terminology, such as “distinctive assets,” which represent a range of indirect branding devices (i.e., exclude the brand name but are associated strongly with the brand; Hartnett, Romaniuk, and Kennedy, 2016).
There were more codes initially, but the final list of 10 creative devices was determined empirically. All were present in at least 10 percent of the second-by-second data (lowest = “product in use”: 12 percent of 2,874 observations). All attention-getting creative devices were coded on a second-by-second basis (e.g., humor was present in Seconds 1, 2, 3), such that they could be overlaid with the continuous biometric measures: skin conductance and heart rate.
All creative devices were triple coded for reliability, with the final code decided by majority vote (i.e., at least two of the three coders agreed the device was present or absent). All codes were reliable above the 0.7 agreement cutoff proposed by previous researchers (Rust and Cooil, 1994). The authors further converted the coding of the creative devices into a whole-advertisement measure, reporting the average number of creative devices used per second for each advertisement.
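The majority-vote rule and the whole-advertisement summary can be sketched as follows. This is an illustration only: the function names, the two devices, and the five-second coding data are hypothetical, and each code is assumed to be 1 (present) or 0 (absent) for each second.

```python
def majority_vote(coder_a, coder_b, coder_c):
    """Final second-by-second code: 1 if at least two of three coders coded 1."""
    return [1 if a + b + c >= 2 else 0 for a, b, c in zip(coder_a, coder_b, coder_c)]


def average_devices_per_second(final_codes_by_device):
    """Average number of creative devices present per second of the ad."""
    n_seconds = len(next(iter(final_codes_by_device.values())))
    per_second_counts = [
        sum(codes[t] for codes in final_codes_by_device.values())
        for t in range(n_seconds)
    ]
    return sum(per_second_counts) / n_seconds


# Hypothetical coding of "humor" by three coders across a five-second clip.
humor = majority_vote([1, 1, 1, 0, 0], [1, 1, 0, 0, 0], [0, 1, 1, 0, 1])
# Hypothetical final codes for a second device.
final_codes = {"humor": humor, "visual branding": [0, 0, 1, 1, 1]}
avg_devices = average_devices_per_second(final_codes)
print(humor, avg_devices)
```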
Information introduced, audiovisual complexity, sales effectiveness, prior exposure, and category usage were entered as control variables throughout the analyses. Information introduced and audiovisual complexity were coded for each scene change (as detailed by Lang et al., 2013; Lee, 2009) and then converted into second-by-second measures to align with the continuous biometric measures and creative devices. Sales effectiveness was measured as an index relative to category norms; on the basis of this information, the advertisements were grouped as either ineffective (i.e., Level 1 = below-average sales effectiveness) or effective (i.e., Level 2 = average sales effectiveness or better).
Prior exposure to the advertisements and relevant category usage were self-report measures collected with traditional measures in the questionnaire. Both were single-item measures. Prior advertisement exposure was measured on a 6-point scale (0 = never to 5 = five times or more; as per Crosby and Stephens, 1987), and category usage was measured on a 7-point scale (0 = never to 6 = several times a week).
Analysis
To assess how the advertisement content, process (biometric), and outcome variables interrelated, the authors ran Spearman nonparametric rank correlations between the whole-advertisement measures (average or average per second) and traditional measures, as well as prior exposure and category usage. Whole-advertisement measures included
number of creative devices,
information introduced,
audiovisual complexity, and
attention (eye tracking and biometrics).
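For readers unfamiliar with rank correlations, the following sketch shows one way to compute Spearman's coefficient using only the Python standard library. Both variables are converted to ranks (tied values share the average rank), and a Pearson correlation is then computed on the ranks. The data and variable names are invented for illustration, and the sketch does not compute p values.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical whole-advertisement measures for five advertisements.
fixations = [12, 15, 9, 20, 17]
liking = [3.1, 3.4, 2.8, 4.5, 3.9]
rho = spearman(fixations, liking)
print(rho)
```

Because only the ordering of the values matters, the coefficient is unaffected by outliers or by nonlinear but monotonic relationships, which suits measures on very different scales.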
Prior research using time-series modeling suggests that biometrics data have lagged responses, with skin conductance having a two- to seven-second delay in response and heart rate having a five-second delay (Wang, Lang, and Busemeyer, 2011). The authors calculated correlations between the number of creative devices present per second and skin conductance or heart rate at zero, one, two, three, five, or seven seconds later. Delayed correlations were not stronger than the concurrent one, so the concurrent biometric responses are reported here.
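The lag comparison can be sketched in Python as follows. The sketch pairs the number of devices at second t with the biometric value at second t plus the lag, then correlates the pairs. The text does not state which correlation coefficient was used for this step, so the Pearson coefficient here is an assumption, and the series and names are invented. The biometric series is constructed to trail the device series by exactly two seconds, so the sketch should recover a lag of 2.

```python
from math import sqrt

LAGS = (0, 1, 2, 3, 5, 7)  # lags in seconds, as listed in the text


def pearson(x, y):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(devices, biometric, lag):
    """Correlate device counts at second t with the biometric at second t + lag."""
    n = len(devices) - lag
    return pearson(devices[:n], biometric[lag:lag + n])


# Hypothetical per-second device counts and a response delayed by two seconds.
devices = [0, 1, 2, 1, 0, 1, 3, 2, 1, 0, 2, 1]
biometric = [0, 0] + devices[:-2]

correlations = {lag: lagged_correlation(devices, biometric, lag) for lag in LAGS}
best_lag = max(correlations, key=correlations.get)
print(best_lag)
```

In the study, no delayed correlation was stronger than the concurrent one, which in this sketch would correspond to the lag of 0 having the largest coefficient.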
The authors used mixed regression analyses to identify which specific attention-getting creative devices were responsible for changes in the second-by-second biometric measures of attention. Principal-components factor analysis revealed that the creative devices could not be reduced meaningfully to a smaller number of factors, because they were relatively uncorrelated with each other. Creative devices therefore were empirically, theoretically, and practically distinct (e.g., product shown need not imply packaging shown, and vice versa). Regression analysis revealed that the largest variance-inflation factor was 2.99 (for showing the packaging), which was less than the critical level of 10, indicating no multicollinearity problems (DeMaris, 2004).
The Durbin–Watson statistic was 0.96 (less than 2), however, indicating autocorrelation. To control for this autocorrelation, the authors estimated mixed regression models specifying a lag-1 autoregressive structure for the residual matrix. Additionally, these regressions controlled for differences in trend slope among the 118 advertisements using random coefficient models.
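As a brief illustration, the Durbin-Watson statistic divides the sum of squared differences between successive residuals by the sum of squared residuals. Values near 2 suggest no lag-1 autocorrelation, and values below 2 suggest positive autocorrelation. Under the common approximation DW ≈ 2(1 − r), the reported value of 0.96 would correspond to a lag-1 residual correlation of roughly 0.52. The residual series below are invented, and the function is a sketch rather than the authors' procedure.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that drift smoothly (positive autocorrelation).
smooth = [1.0, 0.8, 0.6, 0.3, -0.1, -0.4, -0.6, -0.5, -0.2, 0.2]
# Hypothetical residuals that flip sign every second (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_smooth = durbin_watson(smooth)
dw_alternating = durbin_watson(alternating)

# Approximate lag-1 autocorrelation implied by the reported statistic of 0.96.
implied_r1 = 1 - 0.96 / 2
print(round(dw_smooth, 2), round(dw_alternating, 2), round(implied_r1, 2))
```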
The authors log-transformed the two skin-conductance measures to normalize their distributions and Winsorized them to control for outliers three standard deviations from the mean (Kirk, 2013). The authors also Winsorized heart rate, which was normally distributed. They ran a regression analysis for whole-advertisement traditional measures, as well. They did not use mixed regression models for these three time-invariant dependent variables, which were distributed normally.
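The two preprocessing steps can be sketched in Python under stated assumptions. The text does not specify the exact log transform or Winsorizing rule, so this sketch assumes log(1 + x), which tolerates zero values, and assumes that values beyond three standard deviations from the mean are pulled back to that boundary. The function name and the data are hypothetical.

```python
from math import log1p
from statistics import mean, stdev


def winsorize_sd(values, n_sd=3.0):
    """Cap values lying more than n_sd sample standard deviations from the mean."""
    center, spread = mean(values), stdev(values)
    lower, upper = center - n_sd * spread, center + n_sd * spread
    return [min(max(v, lower), upper) for v in values]


# Hypothetical skin-conductance values with one extreme observation.
raw = [0.9, 1.0, 1.1] * 6 + [1.0, 200.0]

transformed = [log1p(v) for v in raw]  # assumed transform: log(1 + x)
capped = winsorize_sd(transformed)     # the extreme value is pulled inward

print(round(transformed[-1], 3), round(capped[-1], 3))
```

Only the extreme observation changes. The remaining values already lie within three standard deviations of the mean and pass through untouched.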
RESULTS
Correlations Results
Eye Tracking. The eye-tracking measures' correlations reveal two processes at work when viewers watch television advertisements (See Table 2). First, the more viewers looked at the advertisement, the more likely they were to remember the advertisement and like the advertisement. The number of fixations was correlated positively
with brand memory (unaided recall, p = .005; aided recall, p = .029), consistent with prior research (Venkatraman et al., 2015), and
advertisement liking (p < .001), also consistent with prior research (Maughan et al., 2007).
Similarly, the less viewers looked at an advertisement, measured by more blinking, the less likely they were to remember and like the advertisement, although only the correlation between blink rate and aided brand recall was significant statistically (p = .029). These positive relationships between attention and effectiveness occurred only when the advertising was unfamiliar, however.
Second, if viewers recognized the content as advertising, they withdrew visual attention. Number of fixations was correlated negatively with prior exposure (p = .032), which suggests that existing knowledge of the advertisement meant participants did not need to commit an extended amount of focused attention. Prior exposure was correlated positively with brand memory, however (unaided brand recall, p = .020; aided brand recall, p < .001).
Category usage was uncorrelated with the number of fixations but positively correlated with brand memory (unaided brand recall, p = .021; aided brand recall, p < .001), in line with prior research (Vaughan, Beal, and Romaniuk, 2016). There were also negative correlations between fixation measures and the number of creative devices (number of fixations, p < .001; fixation duration, p = .038), which suggests that the more creative devices advertisers included throughout the advertisement (probably in the hope of attracting attention), the easier it was to recognize the content as advertising. The undesirability of paying attention to advertising was indicated further by the significant negative correlations between the number of creative devices and the three traditional measures (unaided brand recall, p = .003; aided brand recall, p = .027; advertisement liking, p = .007).
These findings collectively indicate that prior exposure improved viewers' ability to recall very recent exposures but also helped them to identify these advertisements as advertising quickly and withdraw their attention from them. These two processes of attention provide some evidence for the external validity of the eye-tracking measures.
Biometrics. None of the whole-advertisement biometric measures had a significant positive correlation with brand memory. The two skin-conductance measures, which were correlated highly with each other (p < .001), had significant negative correlations with advertisement liking (level, p = .036; response, p = .016), however. As with the eye-tracking measures, these results are consistent with viewers not liking content they recognized as advertising.
Skin-conductance response had the strongest correlation with the number of creative devices (p = .001). More skin-conductance responses were observed when more creative devices were present, as these devices elicited orienting responses that call for cognitive resources to process the content. This allowed viewers to recognize the content as advertising, at which point they stopped looking at the advertisement.
There were several significant negative correlations between the biometric and eye-tracking measures. All three biometrics had significant negative correlations with number of fixations (heart rate, p = .012; skin-conductance level, p = .008; skin-conductance response, p < .001). Heart rate, an indicator of greater attention, also had a significant negative correlation with fixation duration, an indicator of less attention (p = .048).
Regression Results
Regression-analysis coefficients, which represent the effect of each independent variable when all other variables are zero (in this case, not present; Irwin and McClelland, 2001), were estimated and are reported (See Table 3). In these results, the intercept represents the mean for the sample of advertisements, with adjustment for the mixed regression model's estimate of the heterogeneity between advertisements and the autocorrelation between repeated measures. Two regression models were estimated for each dependent variable. For the three continuous biometric dependent variables, Model 1 included only the control variables:
time (passing in seconds),
information introduced,
audiovisual complexity, and
whether the advertisement was ineffective in terms of in-market sales response.
Consistent with the correlations among whole-advertisement measures reported above, the regression results revealed that longer attention (longer advertisements in seconds) had a positive effect on all traditional measures, in line with previous research (Patzer, 1991). Information introduced also had positive effects on all traditional measures. Audiovisual complexity had a significant positive effect on unaided recall (p = .04).
The regression results also indicated, however, that viewers withdrew attention once they recognized that the content was advertising. There were significant trends indicating a decrease in attention over time for all three biometric measures. The increase in average (tonic) heart rate would have indicated an increase in attention if (tonic) skin-conductance level were stable or increasing. Because skin conductance was decreasing, however, the increase in heart rate indicates a decrease in attention over the duration of an advertisement (Berntson, Cacioppo, and Quigley, 1993). Sales-ineffective advertisements were associated with reduced attention, measured by heart rate (p < .001), from the start of the advertisement (which was first identified by Bellman et al., 2017), as well as lower unaided recall and advertisement liking (both p < .001).
Model 2 (and Model 1 for the traditional measures) added the effects of the presence of the 10 creative devices on a second-by-second basis. Likelihood ratio tests (equivalent to change in R² in ordinary least squares regression) and information criteria suggested that Model 2 was an improvement for all three of the biometric attention measures. In the correlations analysis using whole-advertisement measures (see Table 1), only skin-conductance response was correlated with the number of creative devices. In the second-by-second regression results, however, each biometric measure responded differently to different creative devices.
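As a rough illustration of how a likelihood ratio comparison between two nested models works, the following Python sketch computes the test statistic and its p value. It is not the authors' analysis code. The log-likelihood values are placeholders rather than values from the study, and the 10 degrees of freedom assume one added coefficient per creative device.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (even df only).

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(loglik_restricted: float, loglik_full: float,
                          extra_params: int) -> tuple[float, float]:
    """Return (statistic, p value) for a nested-model comparison."""
    statistic = 2.0 * (loglik_full - loglik_restricted)
    return statistic, chi2_sf_even_df(statistic, extra_params)


# Placeholder log-likelihoods, NOT values reported in the study.
stat, p_value = likelihood_ratio_test(-5230.0, -5210.0, extra_params=10)
print(f"LR statistic = {stat:.1f}, p = {p_value:.2g}")
```

A small p value here would indicate that the fuller model fits better than the restricted one. The closed-form tail probability is used only to keep the sketch free of external dependencies; it applies to even degrees of freedom, which happens to suit the assumed 10 added terms.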
If one assumes that no other creative devices were present during any particular second, voiceovers and slogans reduced attention measured by heart rate (p = .033 and p = .096, respectively), whereas pack shots increased attention (p = .040). Showing the packaging did produce mixed results, however, in that it also reduced attention measured by skin-conductance level (p = .024). Depicting animals increased attention measured by skin-conductance response (p = .019).
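The interpretation of dummy-coded coefficients used above can be sketched in a few lines of Python. The intercept and coefficients below are hypothetical placeholders in arbitrary units, chosen only to match the direction of the reported effects; the study's estimates appear in its Table 3. The sketch also assumes the control variables are held at zero.

```python
# Hypothetical values for illustration only (arbitrary attention units).
INTERCEPT = 70.0
DEVICE_COEFFICIENTS = {
    "voiceover": -0.4,        # reported direction: reduced attention
    "slogan": -0.3,           # reported direction: reduced attention
    "packaging_shown": 0.5,   # reported direction: increased attention
}


def predicted_attention(devices_present: list[str]) -> float:
    """Predicted value for one second, given the devices on screen.

    Each coefficient is the shift from the intercept when that device is
    present and every other device is absent (coded 0).
    """
    return INTERCEPT + sum(DEVICE_COEFFICIENTS[d] for d in devices_present)


print(predicted_attention([]))                                # no devices
print(predicted_attention(["voiceover"]))                     # one device
print(predicted_attention(["voiceover", "packaging_shown"]))  # two devices
```

With no interaction terms in the sketch, the effects of devices that appear in the same second simply add.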
Model 2 for the traditional measures included the effects of the biometric measures of attention. These models tested whether the direct effects of creative devices on traditional measures were mediated by biometric measures of attention. Consistency of effects of creative devices on both biometric and traditional measures provides evidence for the validity of biometric measures of attention. For example, voiceover had a significant negative effect on attention measured by heart rate and significant negative effects on all traditional measures (all ps < .001).
Showing the packaging increased attention measured by heart rate but most likely triggered withdrawal of attention measured by skin-conductance level, which would explain its negative effects on traditional measures (unaided and aided brand recall, p < .001; advertisement liking, p = .005). The presence of an animal increased attention, measured by skin-conductance response. It also had positive effects on unaided recall and advertisement liking but had a negative effect on aided recall (all ps < .001).
Further evidence of the validity of the second-by-second biometric measures is their significant direct effects on traditional measures, although each biometric measure affected a different traditional measure. Heart rate was related positively to advertisement liking (p < .001). Skin-conductance level was related positively to unaided brand recall (p = .006), whereas skin-conductance response was related negatively to aided recall (p = .007). Skin-conductance level also was correlated negatively with aided recall, although only marginally (p = .064).
There was evidence of partial mediation by the biometric attention measures. Sales-ineffective advertisements had a significant direct effect on heart rate. The effect of ineffective advertisements on advertisement liking reduced in significance (from p < .001 to p = .009) when heart rate was added to the model. For the other independent variables with significant effects on potential mediators, however, their direct effects on traditional measures did not change substantially when the effects of these potential mediators were controlled for.
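The logic of partial mediation can be illustrated with a minimal Python sketch that compares a predictor's coefficient before and after a candidate mediator enters the model. This is a simplification under stated assumptions: it uses ordinary least squares rather than the mixed models estimated in the study, it compares coefficient sizes rather than significance levels, and the data are synthetic values constructed for the example.

```python
def mean(values):
    return sum(values) / len(values)


def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))


def slope_simple(x, y):
    """OLS slope of y on x alone (the total effect of x)."""
    return cross(x, y) / cross(x, x)


def slopes_two_predictors(x, m, y):
    """OLS slopes of y on x and m together, via the normal equations."""
    sxx, smm, sxm = cross(x, x), cross(m, m), cross(x, m)
    sxy, smy = cross(x, y), cross(m, y)
    det = sxx * smm - sxm ** 2
    b_x = (sxy * smm - smy * sxm) / det  # direct effect of x
    b_m = (smy * sxx - sxy * sxm) / det  # effect of the mediator
    return b_x, b_m


# Synthetic data: x is a 0/1 predictor, m a mediator, y an outcome.
x = [0, 0, 0, 0, 1, 1, 1, 1]
noise = [1, -1, 2, -2, 1, -1, 2, -2]
m = [3 * xi + e for xi, e in zip(x, noise)]
y = [2 * mi + xi for mi, xi in zip(m, x)]

total_effect = slope_simple(x, y)
direct_effect, mediator_effect = slopes_two_predictors(x, m, y)
print(total_effect, direct_effect, mediator_effect)
```

In this constructed example the predictor's coefficient shrinks once the mediator is controlled for but does not vanish, which is the pattern usually described as partial mediation.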
Attention-Getting Creative Devices And Ineffective Advertisements
Finally, the authors explored whether ineffective advertisements differed from other advertisements by using different creative devices or by using the same devices with different effects on attention. They carried out tests using the most significant creative devices for each of the three biometric measures of attention. For each measure, the authors estimated attention controlling for all the other variables in Model 2, except the creative device being tested.
Voiceovers. The correlation between voiceover and heart rate was significantly negative for both ineffective advertisements (r = −.15) and other advertisements (r = −.17, both ps < .001). That said, ineffective advertisements probably overused voiceovers, consistent with findings from previous research (Hartnett, Kennedy, et al., 2016). During the average second, significantly more ineffective advertisements used voiceovers compared with effective advertisements (52 percent versus 31 percent; p < .001).
Packaging. Showing the packaging generally reduced attention measured by skin-conductance level, but the correlation was larger and more significant for ineffective advertisements (r = −.33, p < .001) than for other advertisements (r = −.05, p = .036). This finding suggests that the depiction of pack shots potentially triggered awareness of the advertising and its intent (Campbell, 1995), leading to disengagement and lowered skin-conductance level (Bellman, Treleaven-Hassard, et al., 2012). Ineffective advertisements also probably overused pack shots. During the average second, significantly more ineffective advertisements included pack shots compared with effective advertisements (30 percent versus 23 percent, p = .002).
Animals. The correlation between the presence of an animal and attention measured by skin-conductance response was significantly positive for ineffective advertisements (r = .34, p < .001) but negative and only marginally significant for other advertisements (r = −.04, p = .069). This is an interesting and conflicting result, especially given that both types of advertisements were just as likely to show an animal in any second (20 percent versus 22 percent respectively, p = .439).
DISCUSSION
The aim of this study was to demonstrate empirically the complexities of attracting and measuring attention to advertising. The authors' specific objective was to explore how different measures of attention respond to different creative devices. To be practically useful, measures of attention should identify when a creative device has been executed well because it attracts attention and, conversely, when a creative device fails to gain attention (or, more alarmingly, reduces attention). If different measures respond to different creative devices—which the authors found—then advertisers need to be selective with which attention measures they use for particular creative executions.
If attention is the gatekeeper for advertising processing, this suggests that attention is the gatekeeper for higher order advertising outcomes, too. Prior research has identified different types and levels of attention and has suggested that different measures are needed at different levels. One of the main contributions of this study is to demonstrate that across the three levels of attention that generally apply to television viewing—preattention (inattention), focal attention, and comprehension—biometric measures detect the lowest level of attention, which is focal attention (orienting responses) to advertising stimuli.
Previous researchers (Greenwald and Leavitt, 1984) proposed that orienting responses mark the boundary between preattention and focal attention to elements of advertising. By using a combination of measures, this study shows—for the first time—that it is possible to mark the transition between these two lowest attention levels. Eye tracking is a good measure of focal and higher (visual) attention, because the number of fixations had high correlations with traditional outcome measures of attention, such as brand recall and advertisement liking.
The decline in arousal after an advertisement begins—measured by skin-conductance level—suggests that viewers disengaged from television content once they recognized the content as advertising. Later in the advertisement, when viewers had a low level of arousal and attention, heart rate and skin conductance still were able to respond to the appearance of certain attention-getting creative devices (e.g., voiceovers, pack shots, animals). In the results, for example, pack shots, which typically are shown at the end of an advertisement when arousal is low, still were able to attract attention responses measured by heart rate. This momentary attention to the pack shot only served to reduce arousal further measured by skin-conductance level, however. Most likely for this reason, showing the packaging had negative effects on brand recall and advertisement liking. This ability to detect low-attention effects makes biometrics particularly useful for assessing advertising's creative effects. All three of these measures are highly scalable biometric measures.
Another contribution of this study is to explore the connection between attention-getting creative devices and direct process measures of attention, with controls for the effects of the amount of information and audiovisual complexity (Lang et al., 2013, 2015; Lee and Lang, 2015). Past research predominantly has related creative devices only to advertising outcomes, and researchers often have suspected that these creative devices sometimes work by attracting attention to advertising. It is notable that biometrics did not mediate the effects of attention-getting creative devices on outcome measures, which indicates that the creative effects potentially occurred after exposure in the viewer's memory, during consolidation and retrieval processes.
Of the creative devices that were found to have significant attentional effects, the relationships were largely directionally consistent with past research. Voiceovers were found to reduce attention, measured by heart rate. Prior studies also have found voiceovers to be associated with poorer outcomes. On-camera spokespeople, conversely, have demonstrated positive effects on advertising memory and sales (Hartnett, Kennedy, et al., 2016; Stewart and Furse, 1986; Stewart and Koslow, 1989).
Animals were found to increase attention, measured by skin-conductance response. Animals in advertising previously have been found to hold visual attention, measured with eye tracking (Brasel and Gips, 2017), and have demonstrated positive effects on advertising memory and evaluations (e.g., Lancendorfer et al., 2008; Stewart and Furse, 1986; Yelkur et al., 2013). Findings related to pack shots have been more sparse and inconsistent—the latter is also evident from this study—which suggests that there are complex conditions for presenting packaging throughout advertisements.
Practical Implications
This study is another step in the process of building a robust measurement toolbox for advertisers. A toolbox approach helps advertisers to choose the right tool for their strategic objectives. Just as previous research found that no single copy-testing measure can identify advertisements that are successful (Bellman et al., 2017), this study suggests that no single measure of attention will identify attention responses to all of the creative devices used in advertisements.
If advertisers are testing for attention, they should not use a single measure. If they do, people might be paying (or withdrawing) attention to a particular creative device that is not detected by the specific measure in use. The results support three scalable measures of attention for advertising testing: skin-conductance level and response, and heart rate.
Testing with multiple measures is ideal, but it may not always be possible or practical for advertisers to collect all biometric measures of attention in addition to traditional measures. If one biometric measure has to be prioritized before others, these results suggest heart rate as the best option because it was most strongly associated with in-market performance. Heart rate further provided clear diagnostic evidence of the attention-getting creative devices that failed to get attention, which in part can explain reduced sales performance.
Advertisers commonly use voiceovers and pack shots, which means the heart-rate measure should be relevant to a broad range of executions. Heart rate also can be measured cheaply by webcams for video advertising, similar to eye tracking. Skin conductance, conversely, still requires comparatively expensive lab studies. Heart rate appears as a safety-net measure to identify problem areas of executions to be rectified or to eliminate the poorest performing advertisements, which means it may not provide a lead indicator for creative excellence.
Advertisers that adopt biometric measures to support advertising decisions should take care with respect to how they respond to testing results. Biometric research applied to advertising is still a new frontier, and further empirical testing is needed to establish the reliability and validity of these measures (Kennedy and Northover, 2016; Varan et al., 2015). Making the decision to abandon or discontinue a campaign on the basis of poor results from a single biometric measure or a single exposure is potentially premature.
The authors further explored the potential of these measures to identify characteristics of ineffective advertisements that set them apart from effective ones. Voiceovers generally have a negative effect on attention, measured by heart rate, possibly because they encourage viewers to direct attention internally (on thinking about the audio) rather than directing attention externally to watching the screen (Lacey, 1967). Ineffective advertisements may be ineffective because they overuse voiceover, however, rather than because they poorly execute voiceovers (e.g., selection of voice actor, pronunciation, pace). If one assumes that these results replicate, the implication is that voiceovers should be used sparingly, perhaps just enough to make sure the message is conveyed when viewers are not looking at the screen.
Similarly, showing the packaging is useful for recognition at point of purchase, but overuse may trigger cognitive avoidance and a reduction in attention, measured by skin-conductance level. If the device is used more sparingly, viewers still can notice the pack at low levels of attention, measured by heart rate. Finally, showing an animal attracts attention, measured by skin-conductance response, and ineffective advertisements could increase attention, and potentially sales, by showing animals at all or more often.
Limitations
The study does have some limitations. Direct causality between attention and sales response to advertising was not established here. Further investigation is needed into how attention interacts with other process variables—emotional response and cognitive processing, for example—linked to the behavior of individuals.
Participants in this study also were exposed to the advertisements just once. In contrast, for many campaigns consumers may receive many exposures over time, which provides some reinforcement. Because real television advertisements were studied, the authors captured prior exposure with a questionnaire to determine whether participants had seen the advertisements before. This is a rather weak measure of reinforcement, but if confirmed, it would suggest that the study itself represents a second exposure (at least). Prior exposure was related positively to eye tracking and brand-memory measures but not related to other biometric measures. Future research would benefit from assessing multiple exposures to advertisements in the collection of biometric data, to more deeply explore these relationships or lack thereof.
Any research looking at creative devices has considerable variability to contend with, even in this sample of more than 100 advertisements. To avoid the noise and correlations present in samples of real advertisements, the authors hope to conduct controlled experiments with advertisements specifically designed to test further for the attentional effects associated with specific creative devices, such as pack shots and voiceovers. The authors nonetheless encourage further replications of these results to test whether and how they generalize across other large samples of advertisements.
That said, future studies using different samples of advertisements, different measures of attention, or other measures of effectiveness almost certainly will produce somewhat different results than this study with respect to effective creative devices, as evidenced by prior advertising replication studies (Hartnett, Kennedy, et al., 2016; Stewart and Koslow, 1989). Substantive future contributions would be to explore the interactive effects of the creative devices on attention and outcome measures or temporal effects whereby the timing or order of creative devices could be important. For example, evidence across studies indicates that including the brand earlier in the advertisement is better for brand recall (Romaniuk, 2009).
Conclusions
This study highlights that different types of attention need their own measures and that different creative devices also may benefit from specific measures. The contributions of a toolbox approach were explored, and the promise of some preliminary measures, primarily drawing on biometrics and eye tracking, was demonstrated. More research clearly is needed, but studies such as this one are vital to ensuring that advertising measures are evidence-based across conditions and are pegged to key strategic objectives and to in-market success. Such research can help develop further and validate an attention toolbox. This, in turn, will help advertisers to produce more sales-effective advertising that viewers give attention to and will help researchers to advance theoretical understanding of the different types of attention to advertising.
ABOUT THE AUTHORS
Steven Bellman is a research professor at the Ehrenberg-Bass Institute for Marketing Science, University of South Australia. His research on media and advertising responses is funded by the Beyond:30 project, whose sponsors include television networks and advertisers worldwide. His research has appeared in the Journal of Marketing and Journal of the Academy of Marketing Science, among others, and he is on the editorial boards of the Journal of Advertising, Journal of Advertising Research, and Journal of Interactive Marketing.
Magda Nenycz-Thiel is a research professor at the Ehrenberg-Bass Institute. Her areas of expertise are category and industry growth, e-commerce, and neuromarketing. Her work has been published widely in Journal of Advertising Research, European Journal of Marketing, and Journal of Business Research, and she is an author of book chapters on physical availability management and e-commerce.
Rachel Kennedy is a professor, director, and one of the founders of the Ehrenberg-Bass Institute. Her research is focused on advertising and media knowledge to help grow brands. She has a strong track record of successful industry engagements. She serves on a number of editorial boards, including that of the Journal of Advertising Research, and has published widely, including in the Journal of Business Research, Journal of Retailing and Consumer Services, and the Journal of Advertising.
Nicole Hartnett is a senior marketing scientist at the Ehrenberg-Bass Institute. Her areas of research revolve around advertising creativity, effectiveness, and management. Her research has been published in the Journal of Advertising and European Journal of Marketing, among others, and she is the coauthor of a book chapter on advertising.
Duane Varan is the chief executive officer of MediaScience in Austin, Texas. He also oversees Beyond:30, a collaborative industry project exploring the changing media landscape. His work can be found in the Journal of Advertising Research, Journal of Communication, and Journal of Economic Psychology.
Appendix: Definitions for Creative Devices, Information Introduced, and Audiovisual Complexity
INFORMATION INTRODUCED (Lang et al., 2013)
Seven dimensions were summed for each scene change. Scene changes tended to be fewer in number than seconds (i.e., a scene change every two or more seconds).
Object change: Focal objects in the scenes preceding and following a camera change are different (coded as 1, otherwise 0).
New object: Focal objects in the scenes preceding and following a camera change are different and the object following has not been seen previously (coded as 1, otherwise 0).
Unrelated: Information following a camera change does not follow logically the story, context, or expectations established prior (coded as 1, otherwise 0).
Closer: Focal object following a camera change appears closer than the focal object preceding the camera change (coded as 1, otherwise 0).
Emotion: Emotion changes from positive to negative or from calm to arousing, or vice versa, following a camera change (coded as 1, otherwise 0).
Perspective change: Information following a camera change is seen from a new angle or perspective (e.g., directly in front to looking from above; coded as 1, otherwise 0).
Form change: Features of the information following a camera change are different (includes color to black and white, moving to still pictures, live action to animation, adding superimposed video graphics; coded as 1, otherwise 0).
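The summed coding described above can be sketched as follows. This is an illustrative Python sketch, not the coders' actual tool; the class and field names are hypothetical. The consistency check reflects a reading of the definitions, in which a new object is necessarily also an object change.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SceneChange:
    """Seven 0/1 dimensions coded at each scene (camera) change."""
    object_change: bool = False
    new_object: bool = False
    unrelated: bool = False
    closer: bool = False
    emotion: bool = False
    perspective_change: bool = False
    form_change: bool = False

    def __post_init__(self):
        # Assumption drawn from the definitions: "new object" requires
        # that the focal objects before and after the change differ.
        if self.new_object and not self.object_change:
            raise ValueError("new_object implies object_change")

    def information_introduced(self) -> int:
        """Sum of the seven dimensions (range 0 to 7)."""
        return sum(int(getattr(self, f.name)) for f in fields(self))


example = SceneChange(object_change=True, new_object=True, closer=True)
print(example.information_introduced())
```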
AUDIOVISUAL COMPLEXITY (Lee, 2009)
Nineteen items were summed for each scene.
Audiovisual Redundancy
1. Related audio and video: Degree of correspondence between the audio information (e.g., spoken words and sounds) and video information (e.g., images and text; if not present = 0, if semantically related [exact match] = 1, if thematically related = 2, if no relationship/conflicting = 3).
2. Talking head: A person is on screen and speaking to the camera, or two people are conversing and only the listener is on screen (coded as 1, otherwise 0).
3. Single channel: Audio or video channel is completely missing (no audio = 1, no video = 2; otherwise leave uncoded).
Audio Structural Complexity
4. Onset of sound: How many onsets are heard? Onsets include human voice, background (sound or music), natural or computer-generated sound effects (if not present = 0, one present = 1, two present = 2, etc.).
5. Human voice: How many human voices are heard? Can include a speaker, announcer, or character (if none = 0, one present = 1, two present = 2, etc.).
6. Background sound or music: Are background music or sounds heard? This includes people talking in the background when the language is unrecognizable (coded as 1, otherwise 0).
7. Natural sound effect: Is natural sound heard? Natural sounds include those that can be recorded from nature, such as birds singing or fire crackling (coded as 1, otherwise 0).
8. Computer-generated sound effect: Is computer-generated sound heard? These sounds are impossible to record in nature (coded as 1, otherwise 0).
9. Other: Is there a sound that you cannot code for any of the audio complexity variables above? If any, count and name them (if present = 1, then specify, otherwise 0).
Video Structural Complexity
10. Number of focal object changes: How many times do focal object changes happen? Focal objects could include a person or a new part of a room (if not present = 0, one present = 1, two present = 2, etc.).
11. Colored movie (coded as 1, otherwise 0)
12. Colored still picture (coded as 1, otherwise 0)
13. Text (coded as 1, otherwise 0)
14. Animation or computer graphics (coded as 1, otherwise 0)
15. Black-and-white movie (coded as 1, otherwise 0)
16. Black-and-white picture (coded as 1, otherwise 0)
17. Other: Is there a visual that you cannot code for any of the visual complexity variables above? If any, count and name them (if present = 1, then specify; otherwise 0).
18. Number of objects: Assess how many objects are presented. If background has no information or is meaningless then don't include (if none = 0, if 1~5 [a few] = 1, if 6~15 [some] = 2, if 15~ [a lot] = 3).
19. Object movement: Are objects moving? Note that if more than one object is moving, the highest category should be selected (if no movement = 0, object moving their part/not moving through space = 1, object moving away or across = 2, object moving toward = 3).
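The ordinal coding rules for the last two items can be sketched in Python as follows. The function and category names are hypothetical. Because the stated ranges overlap at 15 objects, the sketch assumes that exactly 15 objects falls in the "some" category; that boundary choice is an assumption, not something the definitions settle.

```python
def code_number_of_objects(count: int) -> int:
    """Item 18: bin a count of meaningful objects into codes 0 to 3."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if count == 0:
        return 0
    if count <= 5:
        return 1  # a few
    if count <= 15:
        return 2  # some (assumes 15 belongs here)
    return 3      # a lot


# Item 19 categories, ordered from lowest to highest code.
MOVEMENT_CODES = {
    "none": 0,
    "in_place": 1,        # moving a part, not moving through space
    "away_or_across": 2,
    "toward": 3,
}


def code_object_movement(movements: list[str]) -> int:
    """Item 19: with several moving objects, keep the highest category."""
    return max((MOVEMENT_CODES[m] for m in movements), default=0)


print(code_number_of_objects(7))
print(code_object_movement(["in_place", "toward", "away_or_across"]))
```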
Attention-Getting Creative Devices
Humor: Attempt is made to induce humor (e.g., there is a joke, pun, witticism, or slapstick; Stewart and Furse, 1986).
Animation: All or some of the visual presentation is animated (computer generated), could include animated scenes or characters (Stewart and Furse, 1986).
Product demonstration: A demonstration of the product in use (e.g., pouring cereal into a bowl, a person feeding a pet; Stewart and Furse, 1986).
Voiceover: Audio message delivered by a voiceover announcer (i.e., person not on camera; Stewart and Furse, 1986).
Product shown: Showing the actual product out of its packaging (e.g., pieces of chewing gum; Stewart and Furse, 1986).
Packaging shown: Showing the product wrapped in its packaging (Stewart and Furse, 1986).
Animals: Includes an animal, either real or animated (Stewart and Furse, 1986).
Slogan: Identifiable slogan is presented visually (i.e., in text) or verbally (i.e., spoken words), usually (but not always) at the close of an advertisement (Hartnett, Romaniuk, and Kennedy, 2016).
Visual branding: Showing the brand name, either as a stand alone, on packaging, or potentially as part of a logo (Romaniuk, 2009; Stewart and Furse, 1986).
Distinctive assets: Nonbrand-name brand elements, such as logos, characters, or slogans that are connected to the brand in consumers' memory (Hartnett, Romaniuk, and Kennedy, 2016).
- Received October 25, 2017.
- Received (in revised form) June 4, 2018.
- Accepted July 7, 2018.
- Copyright © 2019 ARF. All rights reserved.