You Can’t Forget What You Can’t Remember

In order to forget something, first you need to remember it.  That simple premise will cause organizations a great deal of pain as consumer privacy legislation takes effect.

The concern about consumer data privacy is at an all-time high.  70% of Europeans are concerned about the reuse of their personal data.[1]  86% of Americans are concerned with data collection from internet browsing and how it is used to generate personalized banner advertisements.[2]   Their primary concern is how that data may be used for other purposes, or packaged and resold to other entities.  With data breaches and issues such as the NSA’s collection of private data making headlines each week, it’s no wonder that consumer sensitivity is heightened.

This will present a very large problem for companies, because lawmakers are starting to take action.  The European Union announced changes to the 1995 Data Protection Directive to take effect starting in 2014.[3]  It contains one very logical and innocent-looking directive, “the right to be forgotten,” which means that upon request from a consumer, an organization must delete all of that consumer’s personal data.  That sounds simple.  It’s actually a wildly complex problem, because of the premise above: you cannot forget what you cannot remember.  And most organizations aren’t particularly good at remembering their customers.
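To make the premise concrete, here is a minimal Python sketch of why forgetting depends on remembering.  It assumes a hypothetical in-memory registry (the class name, system names, and record ids are all invented for illustration) that records where each customer’s data lives.  A deletion request can only be honored for data the registry already knows about.

```python
from collections import defaultdict


class CustomerDataRegistry:
    """Hypothetical index of which systems hold records for each customer."""

    def __init__(self):
        # customer_id -> {system name -> set of record ids}
        self._locations = defaultdict(lambda: defaultdict(set))

    def remember(self, customer_id, system, record_id):
        """Note that `system` holds `record_id` for this customer."""
        self._locations[customer_id][system].add(record_id)

    def forget(self, customer_id):
        """Return everything that must be deleted, then drop the customer.

        An unknown customer yields an empty result: data that was never
        registered cannot be found, so it cannot be deleted.
        """
        found = self._locations.pop(customer_id, {})
        return {system: sorted(ids) for system, ids in found.items()}


registry = CustomerDataRegistry()
registry.remember("cust-42", "crm", "contact-1")
registry.remember("cust-42", "billing", "invoice-9")
registry.remember("cust-42", "billing", "invoice-3")

to_delete = registry.forget("cust-42")   # records to erase in each system
unknown = registry.forget("cust-99")     # never registered, nothing found
```

In a real organization the hard part is populating such a registry across every system that holds customer data, which is exactly the “remembering” problem described above.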


[1] Forrester Research.  EU Regulations And Public Opinion Shift The Scope Of Data Governance, by Henry Peyret, October 17, 2013

[2] Perfect Storm For Behavioral Advertising: How The Confluence Of Four Events In 2009 May Hasten Legislation (And What This Means For Companies Which Use Behavioral Advertising), by Susan E. Gindin

[3] Forrester Research.  EU Regulations And Public Opinion Shift The Scope Of Data Governance, by Henry Peyret, October 17, 2013

No Security? No Data! Big problem ….

Protecting and securing sensitive big data is necessary to ensure data is shared for new forms of analysis.  Before the owners of that data will share it (yes, political silos still exist, and yes, individuals still feel they own data and can say no to sharing it), they want to ensure it is adequately protected, especially if they are the ones in the cross-hairs if that data is misused.

At the Data Governance Financial Services Conference last week in New York, I spoke on the issue of Confidence in Big Data.  And boy, did that topic ever resonate with the audience.  I spoke with a Chief Data Officer who said confidence was really the main issue she deals with – governance is all about confidently ensuring that her business users trust and protect their information.  A head of governance approached me to discuss confidence in customer data; they were struggling to ensure they were confident in accurately identifying customers and households as the basis for big data analytics.   There were a lot of common themes that came out of my discussions – customer data and big data, rapid integration of new data and business user self-service, how to visually display data confidence to business audiences …. but one issue dominated the conversations – privacy and security.

Ensuring privacy and security for big data, or any data for that matter, is always a top concern. Why?  Well, someone might go to jail if sensitive data is exposed.  Or face compliance fines.  That’s always a compelling reason to act.  But I heard something different at this conference.  One Chief Data Officer described it this way – “Imagine you want to buy a new car and safety and security is your top concern.  10 years ago you could always decide to add a security device or alarm after you buy the car.  But now, you want a system integrated with the ignition.  And for safety you want front and side curtain airbags – you’re never going to install those after the fact.  So the issue becomes a non-starter – you’ll only buy a car with the features already integrated.  The same thing is happening at our firm.  Security is a pre-requisite for big data.  If we can ensure data security for sensitive information, that project will be approved over one that lacks security.  It’s a non-starter for big data and analytics – no security, no data.”

That certainly makes sense.  Data security is as fundamental to sharing big data for new analysis as policing is to a healthy and thriving society and economy.  It’s a basic pre-requisite.  And it offers an interesting twist on the reason to worry about privacy and security.  If you want to share big data freely and combine it in new and interesting ways in new technologies such as Hadoop or NoSQL, then you need to ensure it is protected.  Big data is by definition sensitive data: it’s important information about your customers, your products, your suppliers.  That data must be masked when it’s appropriate to do so (good rule of thumb: if the actual data value isn’t relevant for the analysis, mask it).  It must be monitored to ensure that internal users aren’t accessing it inappropriately.
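The masking rule of thumb above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the field names, the sample record, and the salt value are invented, and a salted SHA-256 hash stands in for whatever masking technique a real data-masking product would apply.

```python
import hashlib


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a stable token derived from a salted hash.

    Assumption: a truncated SHA-256 digest is good enough for this demo.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def mask_record(record: dict, needed_fields: set) -> dict:
    """Keep the fields the analysis needs; mask every other field."""
    return {
        key: value if key in needed_fields else mask_value(str(value))
        for key, value in record.items()
    }


# Hypothetical customer record; only location and spend matter to the analysis.
customer = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "postal_code": "10001",
    "total_spend": 1520.75,
}
masked = mask_record(customer, needed_fields={"postal_code", "total_spend"})
```

Because the same input always produces the same token, masked records can still be joined or counted by customer.  Note that hashing of this kind is pseudonymization rather than full anonymization, so the salt needs protecting too.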

Before embarking on a new big data and analytics project, make sure you’ve taken care of the fundamentals.  Make sure you can adequately protect and secure sensitive data before you ask a data owner to share it.

For tips on how to protect and secure big data, check out this ebook – Top Tips for Securing Big Data Environments

Perfect is the Enemy of Good when Building Big Data Confidence

Confidence in big data is highly variable.  Some data sources have inherent uncertainty.  So why shouldn’t you spend as much time as needed to make big data perfect?  Time.  You simply don’t have enough time to sort out every data irregularity, every ambiguity, every incomplete attribute.  And for many big data use cases, you don’t need to.  That’s why perfect is the enemy of good.  In the era of big data, governance has evolved to first diagnose the usage, then prescribe the appropriate amount of governance.  So the objective is not to make the data perfect for every possible usage up front; it’s to make it good enough for the use case at hand.

Tony Baer of Ovum explains this in more detail in his blog post.

For more information on building big data confidence check out IBM Big Data Hub

