10+1 Reasons Why You Should Not Build a Customer Data Platform (CDP)

8.7.2019
9 min read

Thinking of building your own customer data platform? Have you considered the risks you will expose yourself to, the effort it will take, and how much time it will consume? If not, you need to read this!

Vojtech Kurka

CTO | Co-founder

This is not to say you don’t need a CDP. You probably do. If you are not sure why, read this article. This is simply a summary of our years of experience building one, distilled into a Q&A article that is hopefully as fun as it is informative. In essence, this is an article for anyone who wants to build a Customer Data Platform even though it is not their main product. So, let’s say you sell tires. Does that mean you need to build the e-commerce solution, advertising platform, emailing platform, or CRM yourself? Of course not. Why would you build a Customer Data Platform on your own, then?

Below are a few questions with answers, and some other notes for you, from the guys and girls who built a CDP.

1. Resources spent.

Sure, building a new product is fun. You ship a bunch of new features and your users (employees and co-workers) use them. All seems great until one day it is not, because you realize you spend more time fixing bugs than developing new features. It happens to all software. For the first few weeks or months, you spend 20% of your time on bug fixing and 80% on new features. In about a year, this flips: you spend 80% of your time and resources on bug fixing and 20% on new features. And you NEED to do that because your users rely on the product. This means keeping engineering resources locked on it. In other words, maintenance is expensive and hard to budget for, because all plans are optimistic. That cost only makes sense at scale, and if you are building an internal solution for your own company, surprise, surprise, you are never going to reach the scale that would make this economical.

2. Re-inventing the wheel.

When you build something from the ground up, you will run into the same problems as everyone before you. And if a problem is not covered by some open-source project or other good Samaritans, you are on your own. Having someone go through all those problems before you, and solve them with their own money, is priceless. When you buy a product, you don’t just pay for the software you see and use. You also pay for all the lessons, good and bad, that came with building it, so you don’t have to relive them and can get down to business right away. You save the most valuable asset of all - time!

3. Data model.

Every product needs a data model. I guarantee that the first version of the data model you create will not be the last; you will go through countless iterations, changing and breaking it. Speaking for Meiro, I think we are at version 367, but I can’t tell exactly. We lost track after version 200. For example, you will find out that you store too much data and need to reduce it. So you start thinking and arrive at: “Let’s not store zero data!” This means you will not store “0 number of transactions”; you simply omit the record from the database if the value is 0, NULL, or an empty string. You save some rows, but now your conditions do not work: a filter such as “fewer than 2 transactions” no longer matches customers that have no row at all. So yeah, been there, done that.
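To make the “zero data” trap concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table layout and the attribute name `transactions` are hypothetical, made up for illustration only; this is not Meiro’s actual data model.

```python
import sqlite3

# Hypothetical schema for illustration: one row per customer attribute value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE attributes (customer_id INTEGER, name TEXT, value INTEGER);
    INSERT INTO customers (id) VALUES (1), (2), (3);
    -- Customer 3 has zero transactions, so its row was omitted to save space.
    INSERT INTO attributes VALUES (1, 'transactions', 5), (2, 'transactions', 1);
""")

# Naive segment "fewer than 2 transactions": silently misses customer 3.
naive = [row[0] for row in conn.execute(
    "SELECT customer_id FROM attributes "
    "WHERE name = 'transactions' AND value < 2 ORDER BY customer_id"
)]

# One possible fix: start from all customers and treat a missing row as 0.
fixed = [row[0] for row in conn.execute(
    "SELECT c.id FROM customers c "
    "LEFT JOIN attributes a "
    "  ON a.customer_id = c.id AND a.name = 'transactions' "
    "WHERE COALESCE(a.value, 0) < 2 ORDER BY c.id"
)]

print(naive)  # [2]
print(fixed)  # [2, 3]
```

The fix works here, but every condition in the system now has to know which attributes are stored sparsely, which is exactly the kind of lesson that costs a data model iteration.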

4. Customer profile stitching and identity resolution.

It looks easy at the start: let’s just use foreign keys in the event data in some SQL joins. Nope, that does not work, because customer entities can evolve and there is no join in SQL with proper recursion. You start to write it in Python. You hit the wall with memory consumption. You optimize it for CPU, and now it runs too long because of I/O. You opt for a graph database, only to find your data is too big for that. Solved all of that? Welcome to probabilistic profile stitching!
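For a feel of what the deterministic part of the problem looks like, here is a small in-memory sketch that merges identifiers with a union-find (disjoint-set) structure. It is an illustration under simplifying assumptions, not a description of how Meiro does stitching: the function name `stitch_profiles` and the identifier format are made up, and everything fits in memory, which is precisely what stops being true with real data.

```python
from collections import defaultdict


def stitch_profiles(events):
    """Group identifiers into profiles.

    `events` is a list of identifier lists; identifiers that ever appear
    together in one event end up in the same profile, transitively.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, shortening paths as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for identifiers in events:
        if not identifiers:
            continue
        first = identifiers[0]
        find(first)  # register identifiers that appear alone
        for other in identifiers[1:]:
            union(first, other)

    profiles = defaultdict(set)
    for identifier in list(parent):
        profiles[find(identifier)].add(identifier)
    return sorted(sorted(members) for members in profiles.values())


events = [
    ["cookie:a", "email:x@example.com"],
    ["cookie:b", "email:x@example.com"],
    ["cookie:c"],
    ["cookie:b", "phone:123"],
]
print(stitch_profiles(events))
# [['cookie:a', 'cookie:b', 'email:x@example.com', 'phone:123'], ['cookie:c']]
```

Note how the last event pulls `phone:123` into a profile that was already merged through a shared email. That transitive merging is the recursion a plain join does not give you.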

5. Monitoring & Alerting.

When you build a customer data platform, you design the data model, integrate the data, and build the data pipelines as cron jobs. Now every other morning someone calls you to tell you there is “no fresh data in CDP”. And you are the last to know about it. You need bulletproof monitoring and alerting, not only for the performance of your infrastructure and the application itself, but also for data quality, including regression testing of the data.
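The simplest data-level check is a freshness check: for each source, compare the time of the last successful load with an allowed maximum age. The sketch below is a bare-bones illustration; the function name `stale_sources`, the source names, and the 24-hour threshold are all assumptions, and a real setup would also need to deliver the alert somewhere.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_loaded, max_age, now=None):
    """Return the names of sources whose last load is older than max_age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return sorted(
        name for name, loaded_at in last_loaded.items()
        if now - loaded_at > max_age
    )


# Hypothetical load timestamps, e.g. read from a pipeline metadata table.
now = datetime(2019, 7, 8, 9, 0, tzinfo=timezone.utc)
last_loaded = {
    "mailchimp": now - timedelta(hours=30),
    "web_events": now - timedelta(minutes=20),
    "crm": now - timedelta(hours=2),
}

print(stale_sources(last_loaded, timedelta(hours=24), now))  # ['mailchimp']
```

Run from the same scheduler as the pipelines, a check like this at least makes you the first to know, not the last.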

6. Integration updates.

If you are loading data from, say, MailChimp, everything works great until it does not. One day, the data stops coming. After reading the logs, you find out that the API changed and the old version is deprecated. And for each data source, this happens every 6 months. Good luck keeping up!

7. Security.

You build yourself a CDP with users and some permissions system, and you store ALL OF YOUR CUSTOMER DATA in it. Are you sure you do not have any SQL injection, CSRF, man-in-the-middle, XSS, and approximately a gazillion other vulnerabilities in your code? Are you ready to bet your business on that? Is your compliance and security officer? Ask yourself a question: who will have better security? A homemade product that only your company uses and that is developed by a small, internal team? Or a product used by hundreds of customers in dozens of industries, one that has gone through multiple security scans by banks, insurance companies, and others, not to mention external penetration tests? You know who my money is on.
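To show how little it takes, here is a minimal SQL injection sketch using Python’s built-in sqlite3 module. The `customers` table and the input string are made up for illustration; the point is only the difference between pasting user input into the SQL text and passing it as a bound parameter.

```python
import sqlite3

# Hypothetical customer table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("a@example.com", "Ann"), ("b@example.com", "Bob")],
)

# Input from a search box, crafted by an attacker.
user_input = "nobody@example.com' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = f"SELECT name FROM customers WHERE email = '{user_input}' ORDER BY name"
leaked = [row[0] for row in conn.execute(unsafe_sql)]

# Safer: the input is bound as a value and is never parsed as SQL.
safe = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE email = ? ORDER BY name", (user_input,)
)]

print(leaked)  # ['Ann', 'Bob']
print(safe)    # []
```

This is one vulnerability class out of the gazillion, and the fix has to be applied consistently in every query your team ever writes.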

8. The team.

I get that super talented and skilled developers with a background in data can do a lot, but we are talking about an enterprise product. You are going to need a software architect, data architect, data engineer, backend dev, frontend dev, tester, documentation writer, product owner, and perhaps a business stakeholder. Putting that team together sounds easy enough, but with the current lack of developers in the industry?

9. Documentation.

As your company is the only user of your product, there is no big incentive to write proper documentation. So you cut some corners to develop new features faster instead, because who likes admin, right? Then one, two, or three of your key people leave the company, because they are tired of the “bug fixing all the time” or “no time for new features” drill. You hire new guns, but they do not know the product and tell you: “This is a mess, we need a rewrite.” See you in a year. You have just lost a year that you could have spent doing actual work. At Meiro, we invest heavily in detailed documentation of everything we do. That is the only way to keep your product growing sustainably.

10. Data Science - Data and Science.

You know the famous saying “Garbage in, garbage out”, right? Well, with advanced analytics algorithms this is ten times more applicable. You will not develop a great algorithm without good data, and the more data you have, the merrier. Who will have more data for better algorithms? You as a single company, or a vendor servicing dozens of clients? That is the “Data” in Data Science. Now the Science part. Do you know who creates these special algorithms? Scientists. Are you sure you can keep them busy, motivated, and driven with only your own use cases? Are they going to be interested in working on only one problem in one domain? In this market, where great Data Scientists are paid in gold?

Bonus reason - Security

You are going to say that customer data is the most sensitive and valuable asset your company has, and I will agree 100%. You are also going to say that, because of that, you simply can’t allow a software vendor, an external company whose employees could at any point pivot, go away entirely, get acquired, go through a security incident, get hacked, or go through any one of another hundred nightmare scenarios, to handle your most strategic asset in the long term. I will also agree. You will probably also say that you don’t want to put your customer data into a location you have no direct control over, even if it is the cloud with the best safety track record. I will agree with that, too. Exactly for those reasons, we have built the Meiro Customer Data Platform in a way that allows our customers to install and run it in their own environment, wherever that might be: any cloud provider, any location. Even on-premise. We don’t even need access, and you control everything, including software updates.

Conclusion

My final advice:

You need to keep moving. If it ain’t your core business, buy it, don’t build it. Ask any senior exec, they will tell you the same. This is why we do not build databases, programming languages, message queues, UI kits, logging systems, documentation systems, version control systems, CMSs, CRMs, emailing platforms, communication tools, reverse proxies, web servers, data formats, authentication methods, wikis, or roadmap systems.

At Meiro, we build a Customer Data Platform. That’s what we do.

Disclaimer: I am the CTO of Meiro, a CDP vendor. As you can tell from the tone of this article, I have a very strong opinion, but throughout my career I have been fortunate to stand on both sides of the fence, so I have the needed perspective. I am also lucky to have assembled a world-class team to build our CDP, so I am biased. But I mean every single word I have written and I stand by it 100%. If you beg to differ, I would love to talk to you. You can reach me at v@meiro.io

Ready to take your personalization game to the next level?

Unleash the full potential of your customer data. Let’s talk!

Vojtech Kurka

As nerdy as they come, V holds the R&D fort in Brno. He is all about data engineering, analytics, and data processing technology. When he is not doing that, he is usually obsessing about coffee or motorbikes.