Does your data have a passport?
A passport does many things – it establishes unique identity, origin, prior destinations, and ultimately it helps determine entry to the destination. For a traveler, it is the primary document to establish trust.
Only 1 in 3 business executives trust the information they use to make decisions. 1 in 3. Many research papers cite data growth as a challenge, but perhaps the greater challenge is understanding and trusting the growing number of data sources. The growth in sources, points of integration, and complexity further drives the need for a “data passport”. Does your data have a valid passport?
Here are 4 questions of trust that a data passport will answer.
1 – Where did this data come from?
The data passport must identify origin, one of the pillars of trust. Business users may trust an address from a billing system over a marketing system. You’d think this would be easy – you could simply ask where the data came from in a certain report. But … who would you ask? IT? And is IT really staffed to handle “passport questions”? Think about a real passport; if a customs officer had to call your country’s consulate to verify your identity, how efficient would that be? Just like a real traveler, data needs to carry its passport with it.
2 – Where has this data been?
Like a real passport, a ‘data passport’ needs stamps that show where the data has been on its trip. In other words, it must document the transformations (standardization, verification, matching) and combinations (combining data from system A with system B) that have occurred since the data left its point of origin. This is another pillar of trust: if the data has been standardized and verified, the business user will trust it more.
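To make the idea of stamps concrete, here is a minimal sketch in Python. The class names (Stamp, PassportedValue), the apply method, and the sample address are all hypothetical, invented for illustration; they are not taken from any particular product. The point is only that each transformation leaves a stamp behind while the origin stays unchanged.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass(frozen=True)
class Stamp:
    """One entry in the passport: which step ran and what it did."""
    step: str
    detail: str


@dataclass(frozen=True)
class PassportedValue:
    """A value that carries its origin and its stamps with it (hypothetical)."""
    value: Any
    origin: str
    stamps: List[Stamp] = field(default_factory=list)

    def apply(self, step: str, detail: str, fn: Callable[[Any], Any]) -> "PassportedValue":
        # Return a new value with the transformation applied and a stamp added.
        # The origin is carried forward untouched.
        return PassportedValue(fn(self.value), self.origin, self.stamps + [Stamp(step, detail)])


# A sample address that originates in the billing system.
address = PassportedValue("  12 main st. ", origin="billing")
address = address.apply("standardize", "trim and upper-case", lambda s: s.strip().upper())
address = address.apply("verify", "non-empty check", lambda s: s)

for stamp in address.stamps:
    print(f"{address.origin}: {stamp.step} ({stamp.detail})")
```

Returning a new object from each step, rather than mutating in place, means an earlier version of the value keeps the passport it had at that point in the trip.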
3 – What is the data profile?
Real passports have notes, and customs officials also maintain notes on an individual. Have they previously not claimed goods? Have they travelled to watch-list countries? The same concept applies to data. Establish a quality profile for each point of origin (source system). Source profiles are a pillar of trust, and help you combine data from multiple sources (survivorship); when the same data comes from more than one source, which one do you trust?
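As a rough illustration of survivorship, the sketch below picks the surviving value by source profile. The trust scores, the source names, and the survive function are assumptions made up for this example; a real quality profile would be derived from profiling the source systems rather than hard-coded.

```python
# Hypothetical quality profile: a trust score per source system.
SOURCE_TRUST = {"billing": 0.9, "marketing": 0.6}


def survive(candidates, trust=SOURCE_TRUST):
    """Pick the surviving (source, value) pair from competing candidates.

    Empty values are ignored; among the rest, the value from the most
    trusted source wins. Unknown sources get a score of 0.0.
    """
    usable = [(source, value) for source, value in candidates if value]
    if not usable:
        return None
    return max(usable, key=lambda pair: trust.get(pair[0], 0.0))


winner = survive([("marketing", "5 Elm St"), ("billing", "12 Main St")])
print(winner)
```

A single score per source is the simplest possible rule. Scores could just as well be kept per attribute, so that billing wins for addresses while marketing wins for email preferences.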
4 – Admitted for entry?
Ultimately a passport helps customs determine who may enter a country. The same is true of a data passport: it determines which data should be permitted for entry, and usage, in enterprise applications and data warehouses. But this is where the analogy breaks down. Once you enter a country, you put your passport away. Each citizen you encounter decides whether to trust you without seeing your passport. A data passport, however, has a life after ‘crossing the border’. It should be surfaced to each and every business user to foster trust. When a business user is reading a report, running an analytic query, or viewing data in an enterprise application, they should be able to access the data passport. Business people want to know where the data came from and what happened to it. Trust isn’t established at delivery. Trust is established during usage, when a business user accesses the data and decides whether they trust it. That is why a data passport needs to be a living document, accessible on demand to end users.
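The entry decision itself can be sketched as a simple rule check. Everything here is an assumption for illustration: the admit function, the dictionary layout of the passport, and the particular stamps required. The one design point worth noting is that the check returns its reasons, so a rejection can be explained to the business user instead of silently dropping the data.

```python
def admit(passport, trusted_origins, required_stamps):
    """Decide whether data may enter, and say why not if it may not.

    `passport` is a dict with an "origin" string and a "stamps" list
    (a hypothetical layout chosen for this sketch).
    """
    reasons = []
    if passport.get("origin") not in trusted_origins:
        reasons.append("unknown or untrusted origin")
    missing = [s for s in required_stamps if s not in passport.get("stamps", [])]
    if missing:
        reasons.append("missing stamps: " + ", ".join(missing))
    return (not reasons, reasons)


admitted, reasons = admit(
    {"origin": "marketing", "stamps": ["standardized"]},
    trusted_origins={"billing"},
    required_stamps=["standardized", "verified"],
)
print(admitted, reasons)
```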
A data passport may be an interesting analogy, but is it anything more than a concept (one that requires a lot of manual work)? It is. Technology can help you manage the data passport and improve trust in information. Data integration and quality technology discovers, profiles, cleanses, and delivers data. With integrated metadata management, it can record the point of origin, the transformations and combinations, the glossary of business terms, and the delivery of data. It can also surface that ‘data passport’ through service APIs (SOA), so that business users can consume it on demand.
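What “surfacing the passport on demand” might look like can be sketched as a lookup that returns the recorded metadata as JSON. The in-memory PASSPORTS store, the field name, the record layout, and the get_passport function are all hypothetical; in practice the record would come from a metadata repository behind a service API, not from a dictionary in code.

```python
import json

# Hypothetical metadata store: one passport record per business field.
PASSPORTS = {
    "customer.address": {
        "business_term": "Customer mailing address",
        "origin": "billing",
        "stamps": ["standardized", "verified", "matched"],
        "delivered_to": ["data warehouse"],
    }
}


def get_passport(field_name):
    """Return the passport for a field as a JSON string.

    This stands in for the body of a service response; a missing
    record is reported explicitly rather than returned as empty.
    """
    record = PASSPORTS.get(field_name)
    if record is None:
        return json.dumps({"error": "no passport on file", "field": field_name})
    return json.dumps(record, indent=2)


print(get_passport("customer.address"))
```

A report or application could call such a lookup whenever a user asks where a value came from, which keeps the passport available at the moment of use rather than only at delivery.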
A data passport is the foundation for information governance. If the fundamental promise of governance is “Business users will trust information”, then a data passport, or enterprise metadata management, is a mandatory requirement for any governance initiative.
Join me on an upcoming webcast, where I will present with Gartner’s Eric Thoo on Data Integration Styles: Choosing an Approach to Match Your Requirements.
Register here http://bit.ly/KLhZ7J for the Gartner webcast on June 13
About David Corrigan
I’ve spent my entire career helping clients utilize emerging technology to solve their customer data problems. I've always enjoyed solving abstract problems. I've worked with hundreds of companies to utilize new technology, plan and drive to a roadmap, and evangelize and drive momentum for their information projects. During the day, I work on product strategy and marketing for @InfoTrellis, and I'm busy trying to disrupt the customer data and analytics market so that organizations can finally understand every single one of their customers. After hours, I like to take photographs, read, write, practice yoga, or watch soccer - Manchester United and Toronto FC are my teams of choice. Follow me on Twitter @DCorrigan or on LinkedIn at http://ca.linkedin.com/pub/david-corrigan/3/aa3/92.