Analysis

CPG data analytics, part 2


As mentioned in the preceding article, data analytics has quickly become indispensable for many sectors, including omnichannel CPG.

Machine learning and the data it generates can improve everything from micro-targeting to supply chains to the syndication of user-generated content. The insights that data analysis provides to brands are essential to staying competitive, but most manufacturers are not yet harnessing all that information effectively.

This article shows how CPGs can make good use of the information data analytics provides.

If you’ve not read the first article in this series, it’s a good introduction to this one. It introduces the topic and discusses data management, reliable data, the potential therein for CPGs, and the sources of data they typically use or require.


Establishing a data strategy

A data-driven culture that trickles down from the CEO is an essential first step to harnessing the potential of data. Once the buy-in of leaders is established, companies need to assess their existing data strategies (if they have any) and see what is and what is not working.

Most companies will have an idea of what isn’t working. To dig deeper into what’s needed, the next step is to be as clear as possible about the objective. Once the objective is established, it will be possible to determine whether or not data can meet that need.

Many CPGs try to meet these data needs internally, and usually they don’t succeed. Partnering with an experienced analytics solution provider is recommended. Such a provider can guide the CPG in creating a digital and analytics road map, refining use cases, and delivering measurable results.


Data lakes

One of the most effective ways of handling data is a data lake. Data lakes are data management platforms that hold, process, and analyze both structured and unstructured data. (Structured data is predefined and formatted, and as such is easily searchable. Unstructured data is stored in its native format, of which there can be many.)

There are several benefits to data lakes. One is that they can provide users with direct access to raw data without significant IT involvement. Another is that data can be collected now and put to new uses later.

Data lakes are also cost-effective because data is only reconfigured when it is needed.
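To illustrate what reconfiguring data only when needed can look like, here is a minimal Python sketch. It assumes raw records are kept as untouched JSON lines and are shaped into typed rows at read time. The record fields, retailer names, and the read_prices helper are all hypothetical and serve only as illustration.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no upfront schema).
RAW_EVENTS = [
    '{"sku": "A1", "retailer": "ShopCo", "price": "3.49", "review": "Great taste"}',
    '{"sku": "B2", "retailer": "MartCo", "price": "5.00"}',
    '{"sku": "A1", "retailer": "MartCo", "price": "3.79", "stock": "in"}',
]


def read_prices(raw_lines):
    """Shape raw records into (sku, retailer, price) rows only when asked."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        # Only the fields this use case needs are parsed and typed;
        # everything else stays untouched in the raw store.
        rows.append((record["sku"], record["retailer"], float(record["price"])))
    return rows


price_rows = read_prices(RAW_EVENTS)
print(price_rows)
```

A different use case, such as review analysis, could read the same raw lines with its own reader and never pay for the price parsing.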

As companies develop their data lakes, they must keep reviewing functional questions: how many concurrent data users do they need to be prepared for, how old are the development tools they use for the data, and are their analytic tools up to date?

Ideally, a data lake should be populated incrementally, with the highest-priority business uses first, so the company can tackle them as needed. At the same time, to be a productive source of growth, the data in the lake requires regular cleansing and movement.

Multiple teams can participate in the use of this method of data management. The collaboration inherent in the agile methodology is ideal for data lakes. It can help create a shared forward path for the data lake while instilling a data-friendly work environment.


Data lake development

Since data can drive incredibly impactful results, looking too long for the perfect data lake solution can be a mistake. Opportunities to deploy analytics programs that support digital sales, marketing, new product development, supply chain management, etc., could be lost in the rapidly evolving omnichannel environment.

That said, consult with a solution provider before creating a data lake if you don’t already have one. A data lake is not a solution to every analytics problem. Also, a typical problem facing companies is securing data scientists once the data lake is created. It’s ideal to onboard these key players as early as possible.

McKinsey has identified four stages of data lake development that companies typically go through:

Stage 1: A landing zone for raw data 

At this stage, the data lake begins to get populated separately from core IT systems. It functions as a low-cost, “pure capture” environment which is also scalable. The data lake is largely a reservoir for raw data that can be kept indefinitely or prepared for use in analytics. 

Stage 2: Data science environment 

Here, CPGs start to experiment with their data lake. Data scientists typically appreciate the rapid access to data and run experiments such as building prototypes for analytics programs.

Stage 3: Offloading to data warehouses

At the next level, data lakes start to get integrated with existing data warehouses. Due to the low storage costs of a data lake, companies can use “cold” or inactive data to generate insights.

Stage 4: Critical component of data operations

At this stage of development, much of the information that comes into the CPG goes through the data lake. The data lake is a central part of the data infrastructure and has replaced existing data marts or stores, enabling data as a service. Companies use the data lake to conduct advanced analytics or to deploy machine learning programs.


Data analytics challenges

The most common reason for a failure to manage data properly is neglecting to connect digital and analytics programs to the enterprise strategy.

Another common mistake is investing in analytics before thoroughly thinking through a strategy and use cases. Often companies aren’t precise about what a data lake will enable, or they invest in attempting to harmonize their existing tech stack, which quickly becomes outdated.


The next generation of CPG data analytics

For CPGs, digital shelf tracking data is the first generation of analytics. As it has become more commonplace, a second generation has emerged.

The second generation uses location-based data, as mentioned earlier, in combination with sales data. When location-based data (or comprehensive digital shelf data) is combined with sales data, it becomes causal data and is particularly powerful.
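As a rough illustration of combining the two sources, the Python sketch below joins digital shelf observations with sales figures on a shared key. The key (retailer, SKU, week), the lever names in_stock and search_rank, and all values are hypothetical assumptions, not data from any real retailer.

```python
# Hypothetical digital shelf observations: (retailer, sku, week) -> lever values.
shelf = {
    ("ShopCo", "A1", 1): {"in_stock": True, "search_rank": 3},
    ("ShopCo", "A1", 2): {"in_stock": False, "search_rank": 12},
    ("MartCo", "A1", 1): {"in_stock": True, "search_rank": 7},
}

# Hypothetical sales data keyed the same way: units sold.
sales = {
    ("ShopCo", "A1", 1): 120,
    ("ShopCo", "A1", 2): 15,
    ("MartCo", "A1", 1): 80,
}


def join_shelf_and_sales(shelf_obs, sales_obs):
    """Inner-join the two sources on (retailer, sku, week)."""
    joined = []
    # Only keys present in both sources yield a combined row.
    for key in sorted(shelf_obs.keys() & sales_obs.keys()):
        retailer, sku, week = key
        joined.append({
            "retailer": retailer,
            "sku": sku,
            "week": week,
            **shelf_obs[key],
            "units": sales_obs[key],
        })
    return joined


causal_rows = join_shelf_and_sales(shelf, sales)
for row in causal_rows:
    print(row)
```

Each combined row places the shelf conditions next to the sales outcome for the same retailer, product, and week, which is the shape that later analysis of levers would need.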

These analytics are called performance analytics because they can indicate which digital shelf levers are responsible for sales performance. Because the causal data varies by retailer, performance analytics show CPGs which causal levers to push to drive sales at individual retailers. They also have a predictive capability, which CPGs can use to forecast sales performance and then measure the forecast against actual sales.
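The forecasting idea can be sketched with a deliberately simple model. The Python example below assumes a single lever (in-stock rate, in percent) and fits one least-squares line per retailer. Real performance analytics would use many levers and richer models; the retailers, numbers, and function names here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical history per retailer: in-stock rate (%) vs. weekly units sold.
history = {
    "ShopCo": ([60, 70, 80, 90], [130, 150, 170, 190]),
    "MartCo": ([60, 70, 80, 90], [100, 105, 110, 115]),
}

# One fitted (intercept, slope) pair per retailer.
models = {name: fit_line(xs, ys) for name, (xs, ys) in history.items()}


def forecast(retailer, in_stock_pct):
    """Predict weekly units for a retailer at a given in-stock rate."""
    intercept, slope = models[retailer]
    return intercept + slope * in_stock_pct


# Compare a forecast with a (hypothetical) actual result.
predicted = forecast("ShopCo", 95)
actual = 195
print(f"ShopCo forecast: {predicted:.1f}, actual: {actual}, gap: {actual - predicted:.1f}")
```

In this made-up data, the fitted slope is steeper for ShopCo than for MartCo, which is the kind of retailer-specific difference that would suggest where pushing the in-stock lever matters more.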