ABSTRACT
The growing number of consumer data sources, combined with the complexity of integrations across multiple consumer touch points, poses a challenge for end users trying to assess data quality. The principal issue is gaining a better understanding of the underlying quality of data to inform how best to deploy them for advertising and marketing decisions. The primary purpose of this study (conducted on behalf of the Coalition for Innovative Media Measurement [CIMM]) was to help inform the general media community about the quality, recency, consistency, and representativeness of third-party data. The secondary purpose was to provide feedback on the industry's appetite for master data and reporting standardization.
MANAGEMENT SLANT
To improve data quality, the marketing and advertising industry could establish a “Principles of Data Quality” disclosure;
create a roster of items to ask of data suppliers;
request transparency—data firm disclosures—including consumer authentication procedures, multiple data source descriptions, model target techniques and testing results, recency of data, integration techniques, data labeling, and top-line descriptions for all of the above;
create best practice standards.
INTRODUCTION
Data Integration Overview
In recent years the marketing and advertising industry has accelerated the integration of large data sets to provide richer insights for linking advertising and promotional efforts to consumer response. These integrations have reached critical mass in digital media placement and are gaining increased traction in the television-advertising arena. The sheer volume and variety of data likely will increase in the future as the industry fast-tracks Big Data capabilities.
Although “fast” is the operative word, there is concern among many research/analytic experts about the completeness, quality, and transparency of these data for consumer segmentation and marketing/media investment decision making. Furthermore, with the ever-expanding number of data sources, there's a need to examine the notion of data harmonization, or standardization of nomenclature, to make the data integration process simpler for end users.
Data integrations can come in many shapes and sizes but can be broadly categorized into consumer segmentation and targeting, customer relationship management (CRM), media optimization, and measurement of marketing and advertising sales impact. It appears that marketers now are attempting to apply the precision of direct marketing CRM segmentation and targeting at larger scale by connecting consumer data across myriad sources, for more pinpointed messaging during the customer acquisition phase of advertising and marketing.
The Targeting Accuracy Conundrum
Users of third-party data recently have been very vocal about the quality of consumer target segments created by third-party data suppliers, contending that target accuracy can vary widely from vendor to vendor. In some cases, the additional cost of buying enhanced targets may not be commensurate with the level of improved targeting accuracy. According to Paul Rostowski, president of online ad-buying company Varick Media Management, “If you're doubling your costs by adding in data, the performance has to be at least twice as good. Often the cost of the data won't justify itself.”1
The concept of target accuracy refers to how closely a consumer segment definition reflects a marketer's actual media target. Improving accuracy means reducing waste, which results in elevating marketing and advertising return on investment (ROI). One of the big challenges marketers and third-party data firms face is developing a means to scale precise, verified purchase behavior profiles of existing customers to the set of consumers of competing brands and services.
To that end, third-party firms create target models from CRM data to identify consumers who resemble existing customers. In addition to describing target consumers with greater accuracy, marketers often want to identify customers and prospects who are “in-market” to purchase.
Hypothetically, the ideal marketing and advertising targeting plan calls for communicating only with consumers who meet two criteria:
They are in-target.
They are in-market to buy.
The ability to execute an advertising plan against these conditions is challenging from two perspectives:
accurately defining the target and addressing the issue of timing the communication when consumers are ready to purchase;
the addressable level of the media required to execute against the two target criteria, especially the in-market component.
The marketing and advertising industry has looked to electronically captured digital activity as a robust, immediate source of consumer intention indicating in-market buyer readiness. The ability of digital advertising technology to pinpoint and advertise to in-target, in-market consumers therefore is presumed to be easier than in, for example, television, where advanced addressable capability is just beginning to take root.
The preponderance of industry concern about targeting accuracy largely has been centered in the digital media space, where third-party data enrichment is highly prevalent among publishers and ad tech firms. Kraft Foods Group, citing its digital experience, quantified the level of accuracy across multiple third-party data suppliers through in-house studies it has conducted. Julie Fleischer, senior director, data + content + media at Kraft, said, “With a few exceptions, intransigent publishers are shooting for profit, they are creating opaque systems that defy tracking and measurement, privileging themselves and their operations over their customers.”2
Figure 1: Lift in Accuracy vs. Untargeted Control Campaign
After vetting third-party data across four suppliers on accuracy of fundamental demographics and some behavioral characteristics, Kraft concluded the levels of target accuracy to be grossly underwhelming. For example, the targeting of owners of Keurig coffee brewers resulted in a target hit rate ranging between 14 percent and 20 percent across the data providers examined.
In an analysis independent of Kraft's investigation, media agency MediaSmith conducted a study that assessed targeting accuracy for pro-bono campaigns in the United Kingdom and United States during a three-week period in November 2014. The analysis compared targeting accuracy for vendor-created segments versus an untargeted control campaign. Findings showed an enormous chasm between the most- and least-accurate vendors: In five of 14 cases the targeting accuracy was at least double that of the untargeted control group, while improving only 5 percent to 15 percent in four instances (See Figure 1).
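As a worked illustration of the lift comparison described above—using invented hit rates, not MediaSmith's actual data—lift can be expressed as the vendor segment's in-target rate divided by the untargeted control's in-target rate:

```python
# Hypothetical illustration of the lift calculation described above;
# the hit rates are invented, not MediaSmith's actual results.

def lift_vs_control(vendor_hits: int, vendor_impressions: int,
                    control_hits: int, control_impressions: int) -> float:
    """Return targeting lift: vendor in-target rate divided by control in-target rate."""
    vendor_rate = vendor_hits / vendor_impressions
    control_rate = control_hits / control_impressions
    return vendor_rate / control_rate

# Example: a vendor segment reaches the true target 30% of the time,
# while an untargeted control reaches it 12% of the time.
print(lift_vs_control(300, 1000, 120, 1000))  # 2.5 -> "at least double" the control
```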
The Data Supplier Ecosystem and Process
Third-party data suppliers in the advertising and marketing space can be described as a checkerboard of companies that have many similar traits but different, nuanced personalities. At the base of nearly every offering is a description and/or behavioral profile of the consumer but seen through the lens of that particular company's expertise.
For example, some firms specialize in tracking and reporting consumer packaged goods (CPG) activity, others excel in department and retail store transactions, while a few might offer digital behavior as a means for identifying in-market consumers. The older firms' ancestry tracks back to direct marketing with data often sourced from the credit card industry as well as direct mail; they became the CRM stalwarts of marketers. The digital companies grew their business from processing and matching consumers' online activity; their business spawned from web publishers and they've branched out into the offline world. There are also firms that process television viewing data, integrating it with descriptors from other companies within the ecosystem. Net, net: End users can pick from a wide diversity of suppliers, depending on relevance and fit to their business needs.
Figure 2: The Ecosystem of Data Enrichment—Simplified View
Many marketers put data enrichment into action by matching their customer files with consumer transactions and media exposure from external third-party sources. The author of this study created a simplified view of this process, which uses the power of an advertiser's CRM database to identify key customers and their noncustomer look-alikes to be targeted in the media universe (See Figure 2). Many third-party firms work closely with advertisers in building and maintaining their customer databases and often have permission to access customer files for authenticating consumer information across multiple sources. This process benefits the marketer and other customers who engage specific third-party firms; it also is used by media companies as a service to advertiser clients and as a means of increasing the value of their advertising inventory.
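A minimal sketch of the matching step in this process, under assumed field names and a hashed e-mail match key: records in the third-party file that match the advertiser's CRM file are treated as known customers, and the remainder become candidates for look-alike scoring.

```python
# Simplified sketch of CRM-to-third-party matching, assuming both files
# already carry a hashed e-mail; field names and values are illustrative only.

crm_file = [
    {"hashed_email": "a1b2", "customer_value": "high"},
    {"hashed_email": "c3d4", "customer_value": "medium"},
]
third_party_file = [
    {"hashed_email": "a1b2", "age": 47, "hhi": 130_000},
    {"hashed_email": "z9y8", "age": 33, "hhi": 95_000},
]

crm_keys = {row["hashed_email"] for row in crm_file}
known_customers = [r for r in third_party_file if r["hashed_email"] in crm_keys]
lookalike_candidates = [r for r in third_party_file if r["hashed_email"] not in crm_keys]

print(len(known_customers), len(lookalike_candidates))  # 1 known customer, 1 candidate
```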
Defining Data Quality
The notion of data quality can be viewed in three tiers:
Data preparation hygiene—cleansing processes to address missing fields, spelling conventions, typos, and value and logic checks
Underlying quality—source credibility, recency, consumer classifications, collection method, representativeness
Integration process—quality of techniques/methods used to combine disparate datasets.
Data preparation hygiene is a critical prerequisite for promoting overall data quality, since a lack of scrubbing can indirectly degrade the underlying data quality as well as introduce error during the integration process.
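The hygiene tier can be illustrated with a short sketch; the record layout, fields, and thresholds below are hypothetical, chosen only to show the missing-field, spelling-convention, and value/logic checks mentioned above.

```python
# Illustrative data-preparation checks (missing fields, spelling conventions,
# value and logic checks); the record layout is hypothetical.

def clean_record(record: dict) -> tuple[dict, list[str]]:
    issues = []
    cleaned = dict(record)

    # Spelling/format convention: normalize state abbreviations to upper case.
    if cleaned.get("state"):
        cleaned["state"] = cleaned["state"].strip().upper()
    else:
        issues.append("missing state")

    # Value check: age must fall in a plausible range.
    age = cleaned.get("age")
    if age is None:
        issues.append("missing age")
    elif not 18 <= age <= 110:
        issues.append(f"implausible age: {age}")

    # Logic check: a 'homeowner' flag requires a home market value.
    if cleaned.get("homeowner") and not cleaned.get("home_value"):
        issues.append("homeowner flagged but no home value")

    return cleaned, issues

print(clean_record({"state": " ct", "age": 212, "homeowner": True}))
```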
While this study focuses primarily on underlying data quality and integration process procedures that have received prominent industry attention, the importance of rigorous data preparation measures cannot be overstated. According to Gartner, poor data quality is the primary reason 40 percent of all business initiatives fail to achieve their targeted benefits.3
CIMM Study Details
Two modes of data gathering were deployed:
a literature search
interview-format discussions with subject matter experts (SMEs) on the topic of third-party data quality.
Literature Search
Search criteria: Third-party Big Data and marketing, advertising, media placement, data quality
Sources: Google queries and WARC literature search.
As expected, the search yielded virtually no articles from marketing research trade journals that specifically addressed the issues of consumer authentication and target modeling. There were, however, marketing and advertising industry-related articles that voiced concerns and provided quantitative evidence of inaccuracies in identifying target audiences for some third-party supplier modeled solutions.
SME Discussions/Recruitment
In order to provide a rich account of end user data-quality concerns and corresponding feedback from data suppliers, CIMM conducted 22 interviews as follows (See Figure 3):
10 end users—a combination of senior research personnel and media placement experts at agencies, advertisers, media organizations, MVPDs, and data intermediaries.
12 data supplier companies—senior personnel in third-party data supplier firms broken out by
general purpose (core consumer data sourced from credit transactions and census data; multicategory data offerings; direct marketing legacy; new entrants focused on digital behavior),
television audience (third-party processors of digital set-top box data; these firms match television data with consumer descriptors from general purpose firms),
specialty firms (focus on one or two categories of data; e.g., Hispanic consumers, movers, medical), and
data intermediaries (firms specializing in providing connective infrastructure for multiple data sources, e.g., identifying consumers both online and offline).
END USERS' FEEDBACK
This section recaps key data-quality perceptions gathered during personal interviews, including anonymous quotes from end-user companies.
Data Quality Generally Good (But Needs to Improve)
End users were mostly positive about the overall quality of data provided by third-party firms, particularly the larger ones who have been perfecting their standards and processes for decades. There was recognition that some firms tended to excel in delivering data for specific industry sectors like automotive versus insurance or CPG, and no one firm scored high grades across the entire gamut of marketplace data. Most important to note, end users felt that third-party data is making their marketing and advertising campaigns more effective; however, they would like to see some improvement in data quality. Among their comments:
“Will never be absolutely satisfied with data quality but feel our company has industry edge in performance through data integrations.” (Advertiser)
“Significant lift in effectiveness driven by 1st party data but third-party data generates better lift than standard targeting.” (Multichannel video programming distributor [MVPD])
“Big firms that originate their own third-party assets, they tend to do it very well, use same approach on clients' 1st party data.” (Data intermediary)
“All do a decent job but occasionally make back-end mistakes (calculations, wrong data) when, for example, matching viewer to transaction, lack of quality control.” (Media agency)
Consumer Segments: Top Data Quality Issue
First and foremost, end users expressed concern about accuracy of modeled consumer segments created by data partners for use in look-alike and/or behavioral targeting. They cited the practice of creating target-audience profiles from transactional data sourced from a small base of consumers and extrapolated to the larger population.
End users also sought to understand how effective modeled targets were in driving sales; they called into question sales-predictive behavioral indicators, such as “visited an automobile site” or “read travel content” as markers for automotive or travel intenders, for example. The feeling was that these consumers may be targeted too late, after they've purchased the car or made their travel plans.
Not surprisingly, data recency was cited as another data quality issue that ties back to the ability to identify in-market consumers, as well as capturing timely changes in consumer marketplace dynamics. Finally, it was acknowledged that smaller, low-incidence targets tend to have the highest rate of error when modeling due to the paucity of actual consumer behavior data available to model from. Comments:
“We have used firm A for modeling television show websites visitors and found little agreement in profile with syndicated sources.” (Television network)
“Small modeled targets can be land mines with large error.” (MVPD)
“Recency of information is key, especially for behavioral, in-market aspects.” (TV network)
“Want to identify in-market prospects, propensity modeling; some advertisers want to break out actual target vs. modeled.” (Media agency)
Transparency and Decision Making
End users recognize the fact that there is no perfect data source; there will always be some amount of error inherent in any dataset depending on the quality/reliability of source, data preparation procedures, survey response accuracy, and so forth. But disclosure of the extent of the error would go a long way to inform how the data should be used in making marketing and advertising investment decisions.
To that end, the call for greater data-quality transparency resounded repeatedly during end-user conversations as they sought to understand, for example, the techniques used for modeling target consumers, the recency of data, and the quality and reliability of underlying data sources. Insight into key-driver model variables as well as modeled ROI results versus actual consumer ROI was of keen interest (See Figure 4).
“Data not perfect but want to know how imperfect it is.” (Media agency)
“Need to understand sources of error and how they're mitigated.” (Advertiser)
“Integrations introduce error, for example, how wrong are they getting personal data?” (Media agency)
To Be, or Not To Be, Transparent
Inconsistent transparency was a sore spot for many end users when it came to answers received from third-party firms regarding data quality. Some data companies were found to be vague or nonresponsive about the techniques and methods used to create propensity or look-alike models, a key area of concern.
Another transparency issue that surfaced was the lack of information about original data sources, especially culled from digital media behavior metrics. In general, larger third-party companies were considered to be more revealing of their approaches to consumer target creation than their intermediate-sized counterparts. Some end users trusted the third-party solutions and tolerated limited transparency, whereas others enforced strict disclosure policies. As one advertiser put it, “If they're not forthcoming in answering our questions, we don't work with them.”
“Insufficient transparency, the more we dig the more inconsistencies and gaps are uncovered.” (Advertiser)
“Vague about modeling techniques and the sources their models are built on; can be defensive.” (MVPD)
“Big companies = lots of transparency. Smaller can be elusive. Companies within AdTech space are not very transparent.” (Data intermediary)
“Firm A won't reveal retail partners. Firm B and Firm C won't break out known vs. modeled results.” (Media agency)
Compound Data Integrations Can Mean Compound Error
When the mortgage crisis hit the United States during the last decade, it was discovered that consumer debt was sold and re-sold so often that it became nearly impossible to trace records back to the point of origination. Similar situations sometimes exist with data integrations, particularly within digital advertising technology targeting tools, where integration may be layered on integration to the point where original source identification is eclipsed.
Furthermore, there may be no quality assurance procedures in place to ensure that these thickly layered solutions deliver superior results; the implication is that data error may be compounded with the introduction of each additional data source. And the lack of transparency within and across these layers makes it nearly impossible for end users to assess the sources as well as the data error. It would be valuable to know, for example, which data elements are key drivers of developing a media target, how recent the data are, where they are sourced from, and how accurate they are at capturing the true advertiser target.
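A hypothetical arithmetic illustration of the compounding concern (the per-layer accuracy figures are assumed for the example, not measurements from the study): if each integration layer preserves roughly 90 percent match accuracy, three layers preserve only about 73 percent end to end.

```python
# Hypothetical illustration of error compounding across integration layers;
# the per-layer accuracies are assumed for the example, not measured values.

from math import prod

layer_accuracies = [0.90, 0.90, 0.90]          # match accuracy at each integration step
end_to_end_accuracy = prod(layer_accuracies)   # 0.729 -> roughly 73% of records still correct

print(f"End-to-end accuracy after {len(layer_accuracies)} layers: {end_to_end_accuracy:.1%}")
```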
“The farther the distance from the original data source, the greater the chance of error; there's no opportunity truly to understand the underlying target elements and their impact on media decision making.” (Data intermediary)
“I'm very suspicious when companies sell pre-segmented packages; data could come from any number of sources.” (Media agency)
“There tends to be transparency for single, independent metric sources but not for multiple sources.” (Television network)
Each data source within an integration comes with an inherent amount of error that end users should know about in order to gauge impact on marketing and advertising decision making. In the digital space, for example, the practice of high-volume cookie pooling can pose a data quality mystery as end users seek to understand information about
cookies' point of origination,
active lifespan,
recency,
whether the cookie came from a registered versus non-registered site,
original versus look-alike status, and
model selection criteria, among other factors (See Figure 5).
There are currently no industry audits or standards to ensure the quality of cookies or at least facilitate disclosure about their level of business vitality.
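A sketch of the kind of disclosure-driven screening end users describe, with hypothetical field names and thresholds: a cookie pool filtered on point of origination, active lifespan, and recency.

```python
# Illustrative cookie-pool screen; field names and thresholds are hypothetical.

from datetime import date, timedelta

cookies = [
    {"id": "c1", "origin": "registered_site", "created": date(2015, 9, 1), "last_seen": date(2015, 11, 20)},
    {"id": "c2", "origin": "lookalike_model", "created": date(2014, 1, 15), "last_seen": date(2015, 3, 2)},
]

def passes_screen(cookie: dict, today: date = date(2015, 12, 1)) -> bool:
    recent = (today - cookie["last_seen"]) <= timedelta(days=30)                 # recency
    alive = (cookie["last_seen"] - cookie["created"]) <= timedelta(days=365)     # plausible active lifespan
    registered = cookie["origin"] == "registered_site"                           # point of origination
    return recent and alive and registered

print([c["id"] for c in cookies if passes_screen(c)])  # ['c1']
```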
Data Intermediaries: Muscle in the Middle
The task of ingesting, cleaning, and formatting data across multiple sources can be daunting for many end users who do not possess the in-house staff or expertise to handle complex data integrations.
Data intermediaries, firms that sit between data suppliers and end users, often fill the end-user resource gap for prepping data. These companies will implement quality assurance measures and build them into integrated data products they market to the industry, or provide custom work-for-hire where they will simply prepare and format data for delivery to end users. Data intermediaries have a unique vantage point in that they have visibility into multiple third-party data supplier offerings and capabilities.
Data Harmonization: Yes, But Quality First
Bearing in mind the proliferation of data integration and the onset of automation in the marketing and advertising industry, end users supported the idea of a common set of names for consumer target segments. But many mentioned that the industry should first focus on data quality issues before standardization, citing the current concerns around consumer authentication, modeled targeting, and transparency.
Once these issues were addressed, end users felt, consistent nomenclature would make life simpler by helping to mitigate differences across data coming from multiple third-party firms. For example, “Men-25-54,” “Males aged 25-54,” and “M 25-54” would collapse into a single naming convention.
In addition to common names, it was felt there should be consistent target definition meanings, which likely would work well for standard age and demographics descriptors but be a bit more challenging for targets like “auto intenders” and “travel planners” due to the fluid and customized nature of how these consumer segments are created by third-party firms.
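As a small illustration of the nomenclature idea, the sketch below maps vendor-specific labels such as “Men-25-54,” “Males aged 25-54,” and “M 25-54” to one canonical name; the canonical format and parsing rule are assumptions, not a proposed industry standard.

```python
# Illustrative segment-name harmonization; the canonical format ("M25-54")
# and parsing rule are assumptions, not an industry standard.

import re

def canonical_segment(label: str) -> str:
    match = re.search(r"(men|males?|m)\D*(\d{2})\D+(\d{2})", label, re.IGNORECASE)
    if not match:
        return label  # leave unrecognized labels untouched
    low, high = match.group(2), match.group(3)
    return f"M{low}-{high}"

for raw in ["Men-25-54", "Males aged 25-54", "M 25-54"]:
    print(raw, "->", canonical_segment(raw))   # all map to M25-54
```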
Last, virtually everyone interviewed indicated that their company would be directly involved in any industry effort for data naming standardization, citing the 4As, Association of National Advertisers (ANA), Interactive Advertising Bureau (IAB), and the Advertising Research Foundation (ARF) as industry associations to help steer the process.
DATA SUPPLIERS' FEEDBACK
This section describes the quality assurance procedures data suppliers practice to address the data-quality issues posed by end users.
Consumer Authentication and Targeting: Top Customer Concerns
Data suppliers were asked to recount the core questions posed by end users regarding data quality. Their responses were consistent with feedback gathered during the end-user interviews: Consumer authentication and modeled targeting were high on the short list of data-quality questions customers ask of third-party firms. According to data suppliers, end users wanted to know, for example, that males ages 18 to 34 years had been accurately classified and gender-validated. Techniques used for modeled targeting, particularly the portion of actual versus estimated data used to create the targets, and recency of data also were on the most-asked list, the latter playing a core role in estimating which consumers are in market to purchase:
“Data source, validation. Is it representative, how do you know it's Men 18–34?” (Data firm F)
“What is underlying data used to build the modeled solutions and how are the models built?” (Data firm B)
“What goes into the modeled projections?” (Data firm I)
“How recent is the data? Where is it sourced from?” (Data firm B)
Consumer Cross-Checking To Address Authentication Issues
Consumer authentication is a core aspect of data quality. Misidentification or misclassifications of consumers at the very beginning of the integration process could have a rippling effect of error that compounds as each data source is combined with the next. For this reason, most data suppliers will run consistency checks to determine whether the same type of information about a consumer, for example, age or marital status, is in agreement across multiple sources; they also will validate residential address by frequently checking various sources to determine if there has been a recent move.
Cross-checks also will be made against gold-standard CRM data, if available to the data supplier.
On the digital side, data will be checked for valid domain names, e-mail addresses, search preferences, and recency and frequency measures; the goal is to ensure that electronic activity actually represents live consumers who recently have exhibited the desired in-market behavior. Perhaps the most common bridge for identifying the same consumer in the online, mobile, and offline world is a validated e-mail address, preferably checked against first-party CRM data and/or registrations from trusted websites. A repository of credible e-mail addresses enables data suppliers and data intermediaries to tie smartphones and tablets, for example, to households and consumers through a digital footprint, like hashed e-mail.
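A minimal sketch of the hashed e-mail bridge, with invented identifiers: the same normalized, hashed address ties a CRM record, a smartphone, and a tablet to one consumer without exposing the raw e-mail.

```python
# Illustrative hashed-e-mail linkage across devices; identifiers are invented.

import hashlib
from collections import defaultdict

def hash_email(email: str) -> str:
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

observations = [
    {"source": "advertiser_crm", "device": None,         "email": "Jane.Doe@example.com"},
    {"source": "mobile_app",     "device": "phone-123",  "email": "jane.doe@example.com "},
    {"source": "web_publisher",  "device": "tablet-456", "email": "JANE.DOE@EXAMPLE.COM"},
]

profiles = defaultdict(list)
for obs in observations:
    profiles[hash_email(obs["email"])].append(obs["source"])

for key, sources in profiles.items():
    print(key[:12], "linked sources:", sources)   # one hashed key links all three touch points
```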
As an example of how a data supplier might validate descriptive information about consumers (See Figure 6):
The data firm cross-checks Consumer A's age, gender, residential address, and income against six sources of information. The firm found that three of its 20 client CRM databases contained information about Consumer A, and these were supplemented with externally sourced data from syndicated survey research, automotive registrations, digital publisher registrations, and television viewing data. Although some data disparities surfaced regarding age, address, and income, there was enough agreement across the sources to conclude that Consumer A was a 47-year-old male living at 123 Smith Street with a household income of more than $125,000.
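The cross-checking logic in the Consumer A example can be sketched as a simple consensus rule across sources; the records below loosely mirror the narrative, and the majority-agreement rule is illustrative rather than any particular supplier's method.

```python
# Illustrative consensus check across sources for one consumer's age and gender;
# the source values echo the Consumer A narrative, but the rule is hypothetical.

from collections import Counter

source_reports = {
    "crm_client_1":      {"age": 47,   "gender": "M"},
    "crm_client_2":      {"age": 47,   "gender": "M"},
    "syndicated_survey": {"age": 45,   "gender": "M"},
    "auto_registration": {"age": 47,   "gender": "M"},
    "tv_viewing":        {"age": None, "gender": "M"},
}

def consensus(attribute: str) -> tuple:
    values = [r[attribute] for r in source_reports.values() if r[attribute] is not None]
    value, votes = Counter(values).most_common(1)[0]
    return value, votes / len(values)   # winning value and share of agreeing sources

print(consensus("age"))     # (47, 0.75)
print(consensus("gender"))  # ('M', 1.0)
```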
General purpose third-party data firms—those offering core consumer data across multiple categories—sometimes conduct studies that benchmark the quality of attributes such as
household size,
education,
home market value,
household income,
length of residence, and
gender and age.
In one such study, the range of accuracy across these attributes varied from a low of 60 percent to a high of 78 percent (See Figure 7) after removing records with missing data (See Appendix for calculation detail).
If accuracy rates were provided for each individual attribute, some scores likely would be much lower than the average range reported here. Typically, household income and gender present accuracy challenges: income because of its personal nature, and gender because much of the data are collected at the household level and must be attributed to individual household members.
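Because the Appendix calculation detail is not reproduced here, the following is a hedged reconstruction of a typical attribute accuracy rate—matches divided by records with the attribute populated—using invented counts.

```python
# Illustrative attribute-accuracy calculation: accuracy is computed only over
# records where the attribute is populated; all counts are invented.

def attribute_accuracy(matches: int, populated_records: int) -> float:
    """Share of populated records where the supplier's value agrees with the benchmark."""
    return matches / populated_records

total_records = 10_000
missing = 2_000                       # records dropped for missing data
populated = total_records - missing
matches = 5_600                       # supplier value agrees with the benchmark source

print(f"{attribute_accuracy(matches, populated):.0%}")  # 70%, within the 60-78% range reported
```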
Modeled Target Creation
Data suppliers were asked about the procedures used for developing segmentation/targeting models for deployment in look-alike targeting. Some data firms resisted this question, citing the need to maintain confidentiality due to the proprietary nature of the modeling techniques. Those who disclosed details did so in a very generic way.
In general, the most common approach was to identify the demographic and behavioral variables in their databases that exhibited the strongest relationship with the desired consumer activity. For example, a target description for those likely to take a European vacation during the next year might look something like this:
Traveled to Europe in past 18 months
Visited European travel sites in past 3 months
Searched online for European travel destinations in past 3 months
Read any travel magazine publications in past month
Has frequent flyer membership with European airline
Downloaded mobile travel apps
Searched mobile apps for European travel destinations in past 3 months
College education
Household income (HHI) of $125,000-plus
Attended graduate school
Own nondomestic automobile
Reside in the Northeast, Pacific, or Mid-Central census regions
Speak a non-English, European language.
Data suppliers generally have access to thousands of behavioral and demographic variables that can be used to generate a target descriptor list resembling the one above; the process essentially combs the database for the variables most strongly related to the desired consumer activity.
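A minimal sketch of that variable-combing step, using invented records and a simple incidence-gap score (actual suppliers use proprietary techniques): candidate variables are ranked by how strongly they separate known purchasers from non-purchasers.

```python
# Illustrative variable screening for a modeled target: score each candidate
# variable by the gap in incidence between purchasers and non-purchasers.
# The records and variables are invented.

consumers = [
    {"purchaser": 1, "visited_travel_site": 1, "owns_import_auto": 1, "college": 1},
    {"purchaser": 1, "visited_travel_site": 1, "owns_import_auto": 0, "college": 1},
    {"purchaser": 0, "visited_travel_site": 0, "owns_import_auto": 1, "college": 1},
    {"purchaser": 0, "visited_travel_site": 0, "owns_import_auto": 0, "college": 0},
]

def incidence(rows, variable):
    return sum(r[variable] for r in rows) / len(rows)

buyers = [r for r in consumers if r["purchaser"] == 1]
others = [r for r in consumers if r["purchaser"] == 0]

scores = {
    v: incidence(buyers, v) - incidence(others, v)
    for v in ("visited_travel_site", "owns_import_auto", "college")
}
for variable, gap in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(variable, round(gap, 2))   # the largest gaps mark the strongest target descriptors
```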
Validating Modeled Target Segments
Drawing from multiple streams of consumer touch-point data, suppliers have the opportunity to develop propensity models that identify consumers who are likely to buy. Most data suppliers cited the use of traditional “holdout” testing to determine how well the modeled target will perform versus the benchmark of CRM/first-party or other actual sales data (See Figure 9). If the modeled target does not meet the performance standards of the first-party benchmarks, the data supplier likely will test new combinations of variables to improve the model's effectiveness. It's important to note that end users would like to see more transparency of the holdout test results so they can be confident that the modeled targets are driving results.
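A sketch of the holdout idea under assumed scores and outcomes: hold back verified purchasers and non-purchasers, score them with the model, and compare the modeled target's purchase rate against the holdout baseline.

```python
# Illustrative holdout check: does the modeled target buy at a higher rate
# than the overall holdout population? Scores and outcomes are invented.

holdout = [
    # (model_score, actually_purchased) pairs from verified first-party data
    (0.91, 1), (0.84, 1), (0.78, 0), (0.66, 1),
    (0.40, 0), (0.35, 0), (0.22, 1), (0.10, 0),
]

threshold = 0.6
modeled_target = [bought for score, bought in holdout if score >= threshold]
everyone = [bought for _, bought in holdout]

target_rate = sum(modeled_target) / len(modeled_target)
baseline_rate = sum(everyone) / len(everyone)

print(f"modeled-target purchase rate: {target_rate:.0%}")    # 75%
print(f"holdout baseline rate:        {baseline_rate:.0%}")  # 50%
print(f"lift: {target_rate / baseline_rate:.2f}x")           # 1.50x
```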
Monitoring Data Fluctuations
The blending of multiple data sources requires keeping an eye out for unusual fluctuations from one reporting period to the next to ensure that integrated data set metrics are sound and stable (See Figure 10). To that end, most data suppliers indicated that they have alert systems set up that trigger when unusual swings in key metrics occur across standard time-period reporting. These alerts are made operational in quality assurance processing software and are activated when a metric falls outside the acceptable range of established normative benchmarks.
Data suppliers stated that once a flag is triggered through their automated systems, a manual investigation takes place to determine the reason for the variation. After vetting the anomaly to understand whether there was an actual change in consumer description or behavior versus some type of error, the data firm will adjust its estimates accordingly.
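The alerting concept can be sketched as follows; the metric history and the two-standard-deviation tolerance band are assumptions for illustration, not a supplier's actual thresholds.

```python
# Illustrative period-over-period consistency check; the metric history and
# the 2-standard-deviation tolerance band are assumptions for the example.

from statistics import mean, stdev

history = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]   # e.g., % of records flagged "in-market"
current = 15.7

center, spread = mean(history), stdev(history)
lower, upper = center - 2 * spread, center + 2 * spread

if not lower <= current <= upper:
    print(f"ALERT: {current} outside normative range [{lower:.2f}, {upper:.2f}] -> manual review")
else:
    print("within normal range")
```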
Independent Testing Of Data Supplier Solutions
In general, data suppliers indicated that they test their modeled targeting and segmentation solutions, using techniques such as holdout samples, to ensure that modeled target performance exceeds the norm (See Figure 11). End users who want to assess performance differences across multiple data suppliers have the opportunity to test and compare the various solutions.
One data intermediary interviewed for the study stated that media agencies are in a unique position to test data supplier offerings objectively: “Agencies can look across the providers with an impartial eye, have the best vantage point for evaluation.” Although media agencies hold a distinct position for evaluating data supplier offerings, they also must meet certain logistical and resource requirements before they can take advantage of that vantage point:
Analytic resources: staff time must be available to design and steward the testing process and results evaluation;
Number of data suppliers: usually dictated by advertiser client preference and user history, likely to number no more than two data supplier firms;
Synchronized campaign timing: ideally, testing across multiple providers should occur at the same time to control for differences in media mix, seasonality of response, special events, etc.;
Matched testing technique: design and execution of test should be the same across all data supplier solutions;
Multiple testing occasions: tests should be repeated to help identify changes in a data supplier's targeting or segmentation approach that may affect testing results.
CONCLUSIONS AND NEXT STEPS
The growing number of consumer data sources and the complexity of integrations across multiple consumer touch points pose a challenge for end users seeking to evaluate data quality. The primary issue is gaining a better understanding of the underlying quality of the data to inform how best to deploy them for advertising and marketing decisions.
The author of this study recommended the following steps to be taken by the marketing and advertising industry:
Establish Principles Of Data-Quality Disclosure
Following are guidelines for gathering the information necessary to understand the quality level of datasets and data integrations; a sketch of how an end user might capture these items as a checklist follows the roster. The purpose of these principles is not to prescribe any specific sources or approaches to data suppliers but rather to encourage richer transparency so that end users can make informed decisions about which data products and services meet their standards.
Tolerance of data precision and quality may vary across end users and within specific marketing and advertising questions they're looking to address, so clear transparency is needed to ascertain how close or far afield a data source or integration may be from fulfilling end-user needs.
Data Source
Very first point of origination, including intermediary specialty firms sub-contracted to provide feeds into commercial-ready data products
Data Collection Technique
Method of data gathering, for example, pixel tracking, scanning, sensor, internet survey, telephone survey and other monitoring
Data Integration Matching / Aggregation Method and Key(s)
Matching methodology, e.g., direct household-/device-/individual match, geographic cluster aggregation (ZIP+4, PUMS), etc.
Specific data fields used as matching keys to combine multiple datasets—e.g., name, physical address, e-mail address, IP address, device MAC address, software device identifier, ad identifier, etc.
Initial Cleansing, Processing And Formatting
Correction and/or removal of inaccuracies, misspellings, inconsistencies, contradictions, disparities, data entry mistakes, missing fields, etc.
Recency/Frequency of Data
Freshness of data, e.g., purchases made during the last week, month, quarter, etc.
Depth of activity, e.g., how often purchases were made during the last week, month, quarter, etc.
Consistency Over Time
Monitoring for significant changes over time that might surface due to new data sources, collection techniques, etc.
Consumer Authentication
“Truth” data source(s) to validate description of the current and correct household and/or household members (minus PII) in the dataset
Cross-checking systems that help establish the accurate description of homes and persons (minus PII) through multiple source validation; how they work, how often they are implemented, and at what level of data granularity
Propensity or Look-Alike Modeling Techniques
methodology description of modeling approach
underlying data sources
underlying metrics or variables
recency of data sources, metrics, and variables
sample size composition of modeled purchaser target versus actual purchaser target
Propensity or Look-Alike Modeling Validation
methodology description of validation approach
effectiveness of the modeled purchaser target versus the actual purchaser target
frequency of model validation.
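One way an end user might operationalize the roster above is as a structured checklist that a supplier completes; the field names below simply mirror the principles and are a sketch, not a proposed standard.

```python
# Illustrative disclosure checklist mirroring the principles above;
# the structure is a sketch, not an industry standard.

from dataclasses import dataclass, field

@dataclass
class DataQualityDisclosure:
    data_source: str                    # very first point of origination
    collection_technique: str           # e.g., pixel tracking, scanning, survey
    matching_keys: list = field(default_factory=list)    # e.g., hashed e-mail, device ID
    cleansing_steps: list = field(default_factory=list)  # initial cleansing and formatting
    recency: str = ""                   # e.g., "purchases within the last 30 days"
    authentication_sources: list = field(default_factory=list)
    modeling_method: str = ""           # propensity / look-alike technique description
    validation_method: str = ""         # e.g., holdout testing, frequency of re-validation

example = DataQualityDisclosure(
    data_source="retail loyalty-card transactions",
    collection_technique="point-of-sale scanning",
    matching_keys=["hashed e-mail"],
    recency="purchases within the last 30 days",
)
print(example.data_source, "|", example.recency)
```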
Request Transparency
End users should ask for disclosure of a list of items from the Principles of Data Quality Disclosure roster that meets the needs of their specific marketing and advertising scenarios.
Create Best Practice Standards
The advertising and marketing industry should draft acceptability guidelines for prepping and integrating multiple data sources.
Involvement of data suppliers is crucial for shaping guidelines.
ABOUT THE AUTHOR
Gerard Broussard is founder and principal of Pre-Meditated Media, LLC, an advertising research advisory firm in Norwalk, CT. He specializes in digital media, audience measurement, return-on-advertising-investment modeling, media strategy, and third-party data integration. Additionally, he has consulted in the areas of programmatic television, addressable television, automatic-content recognition, and mobile measurement. Broussard's career in media insights and analytics spans work in cable television and digital media, including 13 years at WPP/GroupM. He authored the current study on behalf of the Coalition for Innovative Media Measurement. He also is the author of A Primer for Defining and Implementing Big Data in the Marketing and Advertising Industry (Council for Research Excellence, 2014).
APPENDIX
Data Supplier Accuracy Rates
Third-Party Data Suppliers
Footnotes
Editors' Note:
When founded in 2009, the Coalition for Innovative Media Measurement (CIMM) was named for its mission: “to promote innovation in audience measurement for television and cross-platform media.” This New York City-based organization—a group of television content providers, media agencies, and advertisers—collaborates on research in its quest for “new methodologies and approaches to audience measurement.” In 2015, CIMM hired media consultant Gerard Broussard to interview data providers and end users to develop guidelines for assessing the data quality of various data enrichment offerings. Data providers increasingly are delivering purchasing, demographic, and lifestyle data for matching to digital and television usage behavior for segmentation, media targeting, and ROI analyses. Broussard's report, which CIMM published on its website last June, recommends greater disclosure and transparency to improve data quality and standardization of nomenclature and metric definitions to support the expected growth of programmatic media transactions. We trust you'll find this adapted excerpt both useful and inspiring.
1. “Marketers Question Quality of Ad-Targeting Data Providers.” The Wall Street Journal, February 23, 2015.
2. “Kraft Tackles Advertising's Data Integrity Issues.” Warc, January 20, 2015.
3. “Measuring the Business Value of Data Quality.” Gartner, October 10, 2011.
4. “Sparsity” is defined as the degree to which variables in a dataset exhibit a scattered or weak relationship.
© Copyright 2015 The ARF. All rights reserved.