What is the Big Deal about MDM + Big Data?

Mark Beyer of Gartner spoke about the emerging relationship between Big Data and MDM at the IBM Big Data, Integration and Governance Forum.  His session had a provocative title – “Big MDM?” – i.e., is MDM big data?  Several interesting discussions with the audience emerged.

He highlighted the two major use cases for MDM and Big Data integration – extracting master data from big data, and using MDM as a ‘starting point’ as you mine big data (see this recent blog post – http://bit.ly/HOJJFI).  Organizations could potentially accelerate their initial MDM implementations by extracting master data from previously untapped big data sources.  For example, a company may want to analyze SEC filing documents for risk exposure, to understand its corporate customers, their financial health, and the key individuals at those companies.  Implicit in that analysis is the notion of master data: you have to determine unique records for organizations and people, and the relationships among them.  The danger in big data projects lies in not recognizing the requirement for MDM – and treating data quality, matching, and storing unique records as a “one off” tactical task.  It isn’t.
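To make the first use case a little more concrete, here is a minimal sketch of collapsing organization names mentioned in documents into candidate master records. Everything in it is an assumption for illustration: the suffix list, the function names, and the sample mentions are hypothetical, and exact matching on a normalized name is a deliberately crude stand-in for the probabilistic matching a real MDM product would apply.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc", "plc"}


def normalize_org_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def build_master_records(mentions):
    """Group raw mentions under one normalized key per candidate organization."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize_org_name(mention)].append(mention)
    return dict(groups)


# Made-up mentions, as they might be pulled out of filing documents.
mentions = ["Acme Corp.", "ACME Corporation", "Acme, Inc.", "Globex LLC", "Globex"]
master = build_master_records(mentions)

for key, variants in master.items():
    print(key, "<-", variants)
```

Five raw mentions collapse into two candidate organizations here. The point is less the matching rule than where the result lives: if those unique records are thrown away after one analysis, the same matching has to be redone for the next one.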

The second use case features MDM as a starting point, or as Mark described it, a “search index” for big data.  Start with master data concepts and then analyze new sources of data for specific master data records.  Don’t analyze all customers, analyze the most valuable ones.  Don’t analyze all of your products, analyze the most profitable ones.  This may initially be expressed entirely as an “analytics” requirement from business owners.  I recently visited with the CIO of an entertainment and betting company whose CEO set a direction to “analyze social media to understand potential online bets their customers might make.”  Wait a second – what does that mean?  Which customers?  What constitutes a “betting event”?  And how will you respond in time to capture that opportunity?  That company realized they didn’t have the answer to the first and most fundamental question – who are their customers?  There’s no point in analyzing all available social media feeds and then determining who your customers are.  There are 2 billion internet users globally.  How many customers do you have – fewer than 2 billion?  Doesn’t it make sense to start the other way around?  Know what you’re looking for before you start looking.
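Here is a rough sketch of what “MDM as a search index” could look like in practice. The Customer fields, the value threshold, and the shape of the posts are all hypothetical, and a real feed would need far more robust identity matching than a lookup on a social handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """A simplified master record (hypothetical fields)."""
    customer_id: str
    social_handle: str
    lifetime_value: float


def build_search_index(customers, min_value):
    """Index only the most valuable customers by lowercased social handle."""
    return {
        c.social_handle.lower(): c
        for c in customers
        if c.lifetime_value >= min_value
    }


def match_posts(posts, index):
    """Yield (customer_id, text) only for posts written by indexed customers."""
    for post in posts:
        customer = index.get(post["author"].lower())
        if customer is not None:
            yield customer.customer_id, post["text"]


# Made-up master data and feed, for illustration only.
customers = [
    Customer("C001", "@alice", 5200.0),
    Customer("C002", "@bob", 150.0),
]
posts = [
    {"author": "@Alice", "text": "Who is going to win the derby this weekend?"},
    {"author": "@stranger", "text": "Nice weather today."},
    {"author": "@bob", "text": "Thinking about the cup final."},
]

index = build_search_index(customers, min_value=1000.0)
matches = list(match_posts(posts, index))
print(matches)
```

Only the post from the high-value customer survives the filter. The feed is scanned once against a small index built from master data, rather than analyzed in full and matched to customers afterwards.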

In both scenarios, the initial big data analytics project may require master data, but many organizations do not realize it.

Here are 3 clues that indicate when you should integrate MDM and Big Data:

  1. You are searching for and matching the same entity types over and over – If your big data project requires you to know whether a social media blogger is a customer, and you will run this same determination every time an interesting social media post is detected, then you have a master data problem – you need to know your customers.
  2. You are performing targeted analysis, not an aggregate analysis – When you are looking for particular product feedback to respond to isolated incidents vs. general sentiment towards your brand, or you are looking for a particular customer’s multi-channel service experiences vs. tracking the general service levels, then you have a master data problem – you need to know specific customers and products in order to guide your big data analysis. 
  3. You want to combine the analysis of multiple master data domains from new big data sources – If your big data use case involves matching multiple data domains and gleaning new insights from big data sources, you likely have an MDM requirement.  For example, telecommunications companies are increasingly interested in mobility – understanding the location of mobile devices and the potential implications (selling new products, proactive service alerts, etc).  In order to realize that use case, the telco will need to understand unique accounts, devices, customers, households, and locations.  That’s a multi-domain MDM problem to be sure, and MDM can be a great starting point for big data analytics. 

The answer to the question in Mark’s provocative title, “Big MDM?”, was no: there isn’t an MDM technology that stores all big data and there shouldn’t be – that’s what a big data platform is for.  But there is absolutely a need to integrate the two, and it’s often overlooked.  Big data analytics will certainly encompass the most important concepts within your organization – customers, products, prospects, accounts, locations, suppliers, among others.  Those are all master data concepts – and therefore MDM is a good starting point for many big data projects.

Don’t reinvent the wheel.  Make sure that big data leverages your existing enterprise technologies, MDM being just one of them.


About David Corrigan

I’ve spent my entire career helping clients utilize emerging technology to solve their customer data problems. I've always enjoyed solving abstract problems. I've worked with hundreds of companies to utilize new technology, plan and drive to a roadmap, and evangelize and drive momentum for their information projects. During the day, I work on product strategy and marketing for @InfoTrellis, and I'm busy trying to disrupt the customer data and analytics market so that organizations can finally understand every single one of their customers. After hours, I like to take photographs, read, write, practice yoga, or watch soccer - Manchester United and Toronto FC are my teams of choice. Follow me on Twitter @DCorrigan or on LinkedIn at http://ca.linkedin.com/pub/david-corrigan/3/aa3/92.

One response to “What is the Big Deal about MDM + Big Data?”

  1. dborean says :

    This and your other posts on MDM – Big Data are excellent and to the point.

    MDM integration with Big Data is similar to MDM integration with an Enterprise Data Warehouse. In an EDW you want your key dimensions such as Customer, Product, Supplier, etc., to be cleansed and high quality; otherwise, how can you trust your analytics? Therefore, it is valuable (and common) to feed master data to an EDW, and likewise to feed insight back to MDM.

    Similarly when required, master data must feed big data platforms as described in your posts. Quite often, I would expect that master data is a subset of big data.

    There is one additional point I’d like to make, extending on your posts. There is more and more talk about the fact that big data must be treated as an enterprise asset with proper governance and controls. We must embrace best practices from MDM and DQM and apply them to Big Data. Call it “Big Data Management” if you will.

    Often we use the term “ENTERPRISE Data Warehouse” because through hardships we’ve learned that we shouldn’t have many data warehouses (as it results in data duplications, poor quality, inconsistencies, etc). The data in the warehouse is an enterprise asset and it has many uses (through data marts, for example).

    We are in the early stages of adopting big data and we have the opportunity to learn from the past and get it right the first time: NOT create big data silos, ignore data quality, or move forward with no governance.
