“Bad data, like poor targeting, leads to missing the target”

Regardless of industry, most medium-to-large companies are actively discussing how to apply Big Data. There is a continual effort to design and build data infrastructure and to retain petabytes of consumable data covering nearly every detail of customer behavior. Leading practitioners of data collection and warehousing agree that data must be contextual, precise, timely, and appropriately maintained if the information is to be effective from both an operational and a risk perspective.

Somewhat lost in this seemingly frantic quest to capture “anything data” are the basic questions: why is the data being collected, what is it intended for, and, most importantly, is it “good quality data”?

In this blog, I share several thoughts on the real-life data quality challenges many organizations face in their quest to obtain “good quality data.”

Below are three examples where using data can become complicated and frustrating:

1. Customer Relationship Management (CRM) – “The Devil is in the Details”

Engaging prospective and existing clients through various communication channels is critical to the success of most business models. However, the ability to interact with customers through their “preferred channel” depends heavily on the precision of the client information collected by the platform owner. For example, a website can request a respondent’s cell phone number, email address, and street address, but the process is far more effective when that information is both collected and validated during the early stages of the customer relationship. Requesting these fields covers the “collection phase” of information gathering; the critical question concerns the “validation” step. If the information collected by the website proves inaccurate, it provides limited value.

Another example is the treatment of email format. Failing to validate this basic information largely nullifies the effort to collect it. One simple but effective solution is to validate the “Top Level Domain” (e.g. .com, .co, .us) to increase the likelihood that an email address is valid, or to take validation a step further by sending the customer a test email for “click on” verification. The same type of format-based validation can be implemented for phone numbers and addresses. How comprehensively one executes data collection and validation during those first critical steps is a key factor in the success of a CRM strategy.
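As a rough illustration of the format-based checks described above, here is a minimal Python sketch. The names KNOWN_TLDS, looks_like_valid_email, and normalize_us_phone are hypothetical, the short list of domains is a stand-in for a complete list, and the phone check assumes 10-digit US numbers. A format check only raises the likelihood that an address is usable; the “click on” test email remains the stronger confirmation.

```python
import re
from typing import Optional

# Hypothetical allow-list for illustration; a real system would use a full TLD list.
KNOWN_TLDS = {"com", "co", "us", "org", "net"}

# Deliberately loose pattern: a single "@", no whitespace, a dotted domain,
# and an alphabetic top-level domain captured in group 2.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@([^@\s.]+\.)+([A-Za-z]{2,})$")


def looks_like_valid_email(address: str) -> bool:
    """Return True when the address is well formed and ends in a known TLD."""
    match = EMAIL_PATTERN.match(address.strip())
    if match is None:
        return False
    return match.group(2).lower() in KNOWN_TLDS


def normalize_us_phone(raw: str) -> Optional[str]:
    """Return the 10 digits of a US number, or None if the format is not recognized."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses, "+"
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the US country code
    return digits if len(digits) == 10 else None
```

Running checks like these at the moment of entry lets the form ask the customer to correct a mistyped value while they are still on the page, rather than discovering the problem when a message bounces.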

2. Operational Strategies and Decision Support Data:  

To become a successful “data-driven” operation, the information must be accurate and timely on an ongoing basis. Many businesses center decision making on key performance indicators (KPIs) such as sales counts and volume or customer profitability, viewed over time or by geographical location. Correctly analyzing this information ultimately drives marketing and advertising strategies, leads to either expansion or contraction of product lines, or improves understanding of the cyclical nature of the business.

As businesses look towards expansion, their supporting operations require continuously gathered consumer information to refine strategies and realize significant growth. This requirement may include transaction data, such as the times at which orders are placed, overlaid with a staffing plan to ensure that orders are fulfilled within quality service standards and capacity limits. A business may additionally need to know which orders are placed directly by consumers (B2C) versus by other businesses (B2B). This type of information can drive hours of operation and product pricing. Identifying long-term information needs up front increases the likelihood that critical information is available at the right time in the product fulfillment process. It is critical to be able to establish and maintain metric-driven timelines, trends, and dimensions from historical data rather than waiting to begin collecting the data once it is needed.
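To make the idea of overlaying order times on a staffing plan concrete, here is a small Python sketch. Everything in it is invented for illustration: the sample orders, the channel labels “consumer” and “business”, the hourly capacity figures, and the function names. It only works if the order timestamp and channel were captured at the time of the transaction, which is the point of planning data needs up front.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (time the order was placed, sales channel).
orders = [
    (datetime(2024, 3, 1, 9, 15), "consumer"),
    (datetime(2024, 3, 1, 9, 40), "business"),
    (datetime(2024, 3, 1, 9, 55), "consumer"),
    (datetime(2024, 3, 1, 13, 5), "consumer"),
]

# Hypothetical staffing plan: hour of day -> orders the scheduled staff can fulfill.
capacity_by_hour = {9: 2, 13: 3}


def understaffed_hours(orders, capacity_by_hour):
    """Return {hour: shortfall} for hours where orders exceed planned capacity."""
    orders_per_hour = Counter(placed_at.hour for placed_at, _channel in orders)
    return {
        hour: count - capacity_by_hour.get(hour, 0)
        for hour, count in sorted(orders_per_hour.items())
        if count > capacity_by_hour.get(hour, 0)
    }


def orders_by_channel(orders):
    """Count orders per sales channel."""
    return Counter(channel for _placed_at, channel in orders)
```

With the sample data, the 9 o'clock hour receives three orders against a planned capacity of two, so it is the only hour reported as short.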

3. Application Development:  

Application development is frequently overlooked in terms of its critical impact on interpreting customer responses, data collection, and ultimately decision management. A typical misstep is to develop a website solely from an end-user perspective without considering other dimensions, such as capturing accurate customer responses, reporting findings for internal consumption, analyzing responses, and planning how the collected data will be accessed and used.

Considering what targeted information is required and how it will be used to run the business is critical to the ultimate success of the company. Imagine a hypothetical website with a simple introductory question: “How did you hear about us?” To many website developers, the obvious initial design captures the customer’s response as free text. But what if one customer responds “by word of mouth” while another enters the word “referral”? The two responses are similar, but your marketing or advertising teams may interpret them differently. You can add clarity by providing respondents with a “drop-down” list of options. This modification improves the chances that customer responses are standardized and easier to compile for analysis.

What happens if the term “referral” is added to the selection list, but your Customer Experience team concludes that many respondents do not know what “referral” means and that the choice “word of mouth” makes more sense? You decide to change the value in your drop-down list to “Word of Mouth”, but you will also need previous customer responses captured under “referral” to appear as “Word of Mouth”. This is why many developers create a “lookup” of some type in which each selection option is mapped to a unique numeric value. The website’s “data store” holds the numeric value, not the text appearing in the drop-down list. This simple solution allows you to change the label for numeric value 3 from “referral” to “word of mouth” without changing any historical data.
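The lookup approach can be sketched with Python’s built-in sqlite3 module. This is one possible layout under assumed names: the tables referral_source and response, their columns, and the sample rows are all hypothetical, and any relational data store would work the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Lookup table: one row per drop-down option.
    CREATE TABLE referral_source (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    -- Responses store only the numeric value, never the display text.
    CREATE TABLE response (
        customer TEXT NOT NULL,
        source_id INTEGER NOT NULL REFERENCES referral_source(id)
    );
    INSERT INTO referral_source (id, label) VALUES
        (1, 'Search engine'), (2, 'Advertisement'), (3, 'Referral');
    INSERT INTO response (customer, source_id) VALUES
        ('alice', 3), ('bob', 1), ('carol', 3);
    """
)


def labeled_responses(conn):
    """Join each stored numeric value to its current display label."""
    rows = conn.execute(
        "SELECT r.customer, s.label FROM response r "
        "JOIN referral_source s ON s.id = r.source_id ORDER BY r.customer"
    )
    return rows.fetchall()


# Relabel option 3; the rows in the response table are never touched.
conn.execute("UPDATE referral_source SET label = 'Word of Mouth' WHERE id = 3")
conn.commit()
```

After the single UPDATE, every historical response stored with value 3 is reported as “Word of Mouth”, and the same lookup table can supply the options shown in the drop-down list so the form and the reports never drift apart.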


Data Quality is one of those areas where “you get out of it what you put into it.” Focusing on data quality during the early stages of data collection will result in significant improvements across all aspects of your business.