The Future of Customer Data Platforms: Unbundling the Right Way

Soumyadeb Mitra

Founder and CEO of RudderStack

June 01, 2022

To understand the push to unbundle the CDP, it’s important to understand why we’re here in the first place. In part one of The Future of Customer Data Platforms, we unpacked the limitations of the traditional, bundled approach to the Customer Data Platform. Giving a nod to Chesterton’s Fence, we articulated why the bundled CDP came about and detailed why it’s not fit for modern use-cases. This analysis led us to the conclusion that it is time to unbundle the Customer Data Platform, but there’s more than one way to unbundle your CDP.

The key motivation for unbundling is to expose customer data so that data analysts and scientists can build more sophisticated use-cases such as marketing attribution, churn prediction, and product recommendations. Ideally, this would be done without having to worry about syncing customer data between your CDP and your cloud data warehouse. However, unbundling comes with trade-offs. Unbundle the wrong way, and you run the risk of creating more problems than you solve.

Here, we’ll unpack two different approaches to unbundling and make a case for a three-layer decoupling rather than a wholesale unbundling.

Unbundling the CDP the wrong way

As represented by the different colors in the diagram above, one tempting way to unbundle the CDP is to use a separate product for each of the boxes:

  • Streaming (and real-time transformations)
  • ETL
  • Warehouse Transformations
  • UI/Segmentation
  • Activation/rETL
  • Storage

This approach makes for a pretty diagram, but putting it into practice is a nightmare. For starters, procuring tools from multiple vendors creates an unnecessary management burden. You have to update configuration in multiple tools just to set up one end-to-end connection from a source to a cloud destination. This, in turn, leads to observability and debugging headaches. If something goes wrong, you’ll have to check multiple dashboards, and you could end up having to work with multiple support teams to fix the issue.

More importantly, these glued-together solutions do not support some critical data paths. Because the tools don’t talk to each other directly, they require the data warehouse as an intermediary (as a source or a destination), which means they cannot support real-time streaming use-cases. This is a deal breaker for any company that may implement real-time personalization now or in the future.

Also, because each vendor only sees a fraction of the customer data, critical transformations like identity stitching and creating the user table (“Customer 360”) require custom SQL/Python code. Bundled CDPs, by contrast, handle this non-trivial work out of the box.
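To give a sense of what that custom code involves, here is a minimal Python sketch of identity stitching using a union-find structure. It is an illustration under stated assumptions, not how any particular CDP implements it: the identifier names (anonymous_id, email, user_id) and the rule that identifiers seen on the same event belong to the same person are hypothetical simplifications.

```python
from collections import defaultdict


def stitch_identities(events):
    """Group identifiers that appear to belong to the same person.

    Assumption: each event is a dict mapping an identifier type
    (e.g. "anonymous_id", "email") to a value, and any identifiers
    that appear together on one event refer to the same person.
    """
    parent = {}

    def find(node):
        # Return the representative of the node's cluster.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for event in events:
        # Skip identifiers that are missing or empty on this event.
        ids = [(kind, value) for kind, value in event.items() if value]
        for node in ids:
            find(node)  # register every identifier, even if it stands alone
        for other in ids[1:]:
            union(ids[0], other)

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())
```

Given an anonymous visit, a later event that carries both the anonymous ID and an email, and a third event that links the email to a user ID, the function returns a single cluster containing all three identifiers. Real identity graphs also have to handle shared devices, conflicting merges, and identifier priority, none of which this sketch attempts.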

Unbundling the CDP the right way

To create the most robust and capable customer data stack, we believe unbundling is best accomplished by decoupling three layers:

  • Storage layer: By decoupling the storage layer, you eliminate data silos and avoid having different versions of the same customer data stored in different tools. This is the whole premise behind the warehouse-first approach.
  • Transformation layer: Providing flexibility around user-defined transformations is critical in enabling customers to unlock everything from basic use-cases (such as computing total_revenue) to advanced use-cases (such as computing predictive features).
  • Integration, Real-Time Transformation, and Activation layer: At a high level, “integration” refers to the movement of data from source to destination. Reverse ETL (or “activation”) is a subset of integration and refers to moving data from the warehouse to a downstream destination. Decoupling integration from activation creates challenges that typically surface when you want to add a new field to the customer record: if the two are decoupled, you have to make configuration changes manually across your entire stack. We include real-time transformations here because the data is transformed while it is “in-flight”.
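As a concrete example of the basic end of the transformation layer, the following Python sketch computes total_revenue per user from a list of events. The event shape is an assumption made for illustration: the event name "Order Completed" and the fields user_id and properties.revenue are hypothetical, and in practice this logic would more likely live in SQL inside the warehouse.

```python
from collections import defaultdict
from decimal import Decimal


def total_revenue_by_user(events):
    """Sum order revenue per user.

    Assumption: purchase events are named "Order Completed" and carry
    the amount in event["properties"]["revenue"].
    """
    totals = defaultdict(Decimal)  # missing users start at Decimal("0")
    for event in events:
        if event.get("event") != "Order Completed":
            continue  # ignore page views, clicks, and other events
        revenue = event.get("properties", {}).get("revenue")
        if revenue is None:
            continue  # skip orders with no recorded amount
        # Convert via str so float inputs do not carry binary rounding noise.
        totals[event["user_id"]] += Decimal(str(revenue))
    return dict(totals)
```

The result is one row per user, which is the kind of computed trait that ends up as a column on the user table. Predictive features follow the same pattern with a model in place of the sum.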

Keeping the Integration, Activation, and Real-Time Transformation layer interlinked, as shown in the diagram below, is the key.

Delivering Integration, Activation, and Real-Time Transformations on the same platform provides a number of benefits:

  • Seamless Integration into Marketing and Product Tools: You can integrate from every data source to every SaaS destination without having to set up configs across multiple tools.
  • Support for real-time and batch: Every CDP use-case is unlocked, including those requiring real-time integration into marketing and product tools.
  • Automated Identity Stitching and Customer 360: You get the promise of a complete user table without having to manage complex data models and pipelines that stitch data between different vendors. Because the platform understands the data models for every pipeline, it can deliver identity stitching and Customer 360 out of the box.
  • Single Observability Plane: Ensuring the health of your data pipelines is easier with a single observability plane for monitoring, alerting, and triaging issues. It reduces engineering overhead and streamlines the triage process, accelerating time to resolution.
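The configuration benefit can be sketched in a few lines of Python. This is a toy model, not RudderStack's API: the Pipeline class, its method names, and the field names are all hypothetical. It only illustrates the idea that when one platform owns both the real-time path and the activation (reverse ETL) path, a new field is configured once and both paths pick it up.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Hypothetical platform that owns both data paths."""

    # Maps a source field name to its destination field name.
    field_map: dict = field(default_factory=dict)

    def add_field(self, source_name, destination_name):
        # A single configuration change, shared by both paths below.
        self.field_map[source_name] = destination_name

    def _apply(self, record):
        # Keep only mapped fields that are present, renamed for the destination.
        return {
            dest: record[src]
            for src, dest in self.field_map.items()
            if src in record
        }

    def stream(self, event):
        """Real-time path: transform one event while it is in flight."""
        return self._apply(event)

    def activate(self, warehouse_rows):
        """Activation path: transform rows read from the warehouse."""
        return [self._apply(row) for row in warehouse_rows]
```

After a single call such as add_field("total_revenue", "LTV"), both stream and activate emit the new field. In a stack where streaming and reverse ETL come from separate vendors, the equivalent change has to be repeated in each tool's configuration.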

At RudderStack, we’re solving the integration pain-point regardless of source or destination, unconcerned with the pipeline nom-du-jour. We believe customers are best served by a transparent, flexible, and extensible integration solution that enables data teams to solve their most pressing pain-points (e.g., getting data from point A to point B) without losing sight of future business needs (e.g., predictive modeling).
