Showing posts with label MDM. Show all posts

Friday, December 9, 2016

The Top 7 Reasons for Data Governance

In the Age of Big Data, many people might think that the practice of Data Governance is a thing of the past – nothing could be further from the truth. Data Governance has often been misunderstood or underappreciated and relatively few organizations have taken the time and made the investment to integrate it into their enterprise processes. So, there are actually several questions that need to be answered here:
  1. Does the de-normalization of data through exploitation of Big Data technologies discount the need for Data Governance?
  2. Why isn’t Data Governance more widespread if it indeed still has value?
  3. What is the value proposition behind Data Governance? (what are the 7 reasons why you need it)
We’ll tackle these questions one at a time.
1 - Does Big Data Require Governance?
The immediate expectation in response to this question might be – well no - as Governance seems to represent the formal and complex approach used for both RDBMS and OLAP data structures. But does this make sense per se? Classifying a data model as normalized, star schema or a de-normalized Big Table doesn’t necessarily impact the nature of the data attributes themselves. In other words, we still need to understand that data regardless of where or how it is housed – we still need to know where it comes from, who owns it, where it goes, how it is transformed and so on. If we want the data to be valid, accurate and managed across a lifecycle, Governance is still needed. The technology itself does nothing to prevent us from experiencing a ‘Garbage in / Garbage out’ situation. The adoption of new technology doesn’t imply the need to discard common sense.
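To make that point concrete, here is a minimal sketch in Python. It assumes a hypothetical `GovernedAttribute` record (the class, field names and sample systems are illustrative, not taken from any governance product) and shows that the same governance questions apply whether an attribute lives in a normalized table or a de-normalized Big Table.

```python
from dataclasses import dataclass, field


@dataclass
class GovernedAttribute:
    """Governance facts about one data attribute (all field names are illustrative)."""
    name: str
    source_system: str          # where the data comes from
    owner: str                  # who is accountable for it
    consumers: list[str] = field(default_factory=list)        # where it goes
    transformations: list[str] = field(default_factory=list)  # how it is changed

    def gaps(self) -> list[str]:
        """Return the governance questions that are still unanswered."""
        missing = []
        if not self.source_system:
            missing.append("source_system")
        if not self.owner:
            missing.append("owner")
        if not self.consumers:
            missing.append("consumers")
        return missing


# The same kind of attribute described twice: once in a normalized table,
# once in a Big Table. The storage model differs; the questions do not.
rdbms_email = GovernedAttribute(
    name="customer.email",
    source_system="crm",
    owner="sales-operations",
    consumers=["billing", "marketing"],
    transformations=["lowercase"],
)
big_table_email = GovernedAttribute(
    name="events.customer_email",
    source_system="",
    owner="",
)

print(rdbms_email.gaps())
print(big_table_email.gaps())
```

The second record reports every question as open, which is the 'Garbage in / Garbage out' risk expressed as data: nothing about the Big Table itself tells us who owns the attribute or where it came from.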
2 – Why isn’t Data Governance More Accepted?
This is a tougher question and in fact can’t easily be broken down into a single reason. Some of the most common reasons include:
  • To govern data you first have to understand it holistically, and that initial assessment / analysis is generally the hardest part – and is often why things don’t progress beyond that point (as many of these assessments simply never get completed).
  • Often times, all Governance within an organization may be lacking because of the perception that some processes can’t be Agile and just hold things back or slow them down too much. While there is some truth to that, there is also truth in the lesson learned innumerable times that bypassing that Governance causes tremendous impacts later (to cost, efficiency and the ability to deliver and maintain capability).
  • Sometimes the tools get confused with the practice. While there are a number of great data governance tools available, sometimes they become an obstacle in themselves (e.g. some may be considered too expensive, others too complicated or perhaps there might be too many in the mix). The reality is that a lot of Data Governance can occur before or even sometimes without making that investment. It is the practice and not the software used to facilitate the practice that really matters.
3 – Why do Most Enterprises Need Data Governance?
Here are 7 good reasons that tend to represent the more or less universal value proposition:
  • Data Governance Reduces Enterprise Complexity – At first, as I alluded to earlier, the impression here might be the opposite. But one only needs to consider a highly typical data Use Case to see how Governance cuts right through complexity. Perhaps the number one integration issue I’ve seen faced over the past twenty years pretty much everywhere is the proliferation of similar or even the same data across multiple systems (this can include both multiple databases and reporting platforms). This quickly leads to all sorts of confusion and ultimately costs more to manage as long as it stays, well, confused. Governance tackles this type of problem at its core, by first designating authoritative systems and then more strictly controlling the use or reuse of such data. This can translate into business rules across the stack and often results in the elimination of both redundant data elements and duplicate systems.
  • Data Governance Enhances Security – How, one might ask, does it do that? Well, precisely through some of what I’ve already mentioned, including an assessment and classification of what data assets your enterprise has as well as determination of rules and architectural requirements for safeguarding both Data at Rest and Data in Motion. All of this starts with and becomes part of Data Governance. And if we think a little deeper about it, this is only logical when we consider that Data Assets are a primary target of most major Cyber-Attacks. To protect your enterprise, you must first know what’s in it and secondly you must have the ability to control the flow of that information.
  • Data Governance is the best 1st Step for Integration – Almost every integration challenge is at its heart a data challenge. How we transport data, transform data and keep everything aligned is to a large part dependent on how well we understand that data. Messaging / Middleware / API Frameworks / EDI / SOA / EAI – you name it - it’s all about the data. Once integration is in place, it must be governed. Data interfaces (through messaging or other similar mechanisms) are actually one of the most pragmatic initial places where Data Governance can be instituted.
  • Data Governance Enables more Sophisticated Capabilities – such as MDM – Master Data Management is an example of a valuable enterprise capability that simply couldn’t exist if some level of Governance weren’t in place. To deploy MDM, an organization has to understand its core business entities and how they relate to attributes and be able to control them in a consistent manner. Every MDM solution I’ve ever seen either has Data Governance built in or relies on some other existing Data Governance process. MDM is not the only capability dependent on Governance though.
  • Data Governance is Critical to Achieving an Effective Analytics Solution – The last thing any organization wants is to be getting different answers to the same or similar questions. Data Governance not only helps to de-conflict issues at the data level – it can be used to de-conflict entire solutions. In other words, data governance helps drive consolidation of reporting and reporting architectures as well as the source systems underneath them.
  • Data Governance can Impact the Bottom Line – Having Data Governance can make your enterprise more effective, not just from an IT perspective, but from the Business perspective as well. I’ve seen many organizations reduce duplicate systems and eliminate conflicting data and experience immediate results. The amount of benefit is dependent on how many systems can be consolidated or turned off and how improving data accuracy will impact whatever the business mission of the organization may be – but in almost every case – these types of benefits will be realized to some degree.
  • Data Governance is often the Keystone upon which more Effective Enterprise Governance is Built – It is a great place to start if no Governance is in place, or an even better place to expand if there are already some pockets of Governance deployed. Since Data tends to be a cross-cutter, both organizationally and architecturally – it can become the foundation for a wider Governance framework.
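The first reason in the list above (designating authoritative systems, then controlling reuse) can be sketched in a few lines of Python. Everything here is an assumption for illustration: the inventory, the system names and the `redundant_copies` helper are hypothetical, and the sketch assumes every data element already has one designated authoritative system.

```python
# Hypothetical inventory: which systems currently store each data element.
inventory = {
    "customer_address": ["crm", "billing", "data_warehouse"],
    "product_price": ["erp", "web_store"],
    "employee_id": ["hr"],
}

# Hypothetical governance decision: one authoritative system per element.
authoritative = {
    "customer_address": "crm",
    "product_price": "erp",
    "employee_id": "hr",
}


def redundant_copies(inventory, authoritative):
    """Map each element to the non-authoritative systems that also hold it."""
    report = {}
    for element, systems in inventory.items():
        master = authoritative[element]
        extras = sorted(s for s in systems if s != master)
        if extras:
            report[element] = extras
    return report


report = redundant_copies(inventory, authoritative)
for element, systems in report.items():
    print(element, "->", systems)
```

The resulting report is the starting list of candidates for consolidation: each entry is a copy that either needs a business rule tying it back to the authoritative system or a plan for retiring it.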
In my experience, even in the organizations that didn’t fully implement Data Governance, the elements which were deployed provided obvious and immediate value. The current technology trends tend to point to a heightened need for Governance rather than the other way around, especially with the massive levels of adoption of Hybrid Cloud capability. I’ll talk about that in an upcoming post.

Copyright 2016, Stephen Lahanas

Sunday, November 16, 2014

Why Most Organizations need a Data Strategy

One of the most important tasks that a Data Architect is often asked to help with is the creation of an Enterprise Data Strategy. But why is Data Strategy so important, and what exactly does it consist of? And lastly, why is this a task that a Data Architect should be leading or supporting?
So, what is a Data Strategy? Let's review what it isn't first…
  • A Data Strategy is not a list of generic principles or obvious statements (such as "Data is an Enterprise Asset")
  • A Data Strategy is not merely a laundry list of technology trends that might somehow influence the organization in coming years
  • A Data Strategy is not a vague list of objectives without a clear guiding vision or path for actualization.
  • A Data Strategy is not merely the top level vision either; it can expand into critical data domains such as Business Intelligence and eventually represent a family of strategies.
Now we will attempt to define what an Enterprise Data Strategy really is:
Enterprise Data Strategy is the comprehensive vision and actionable foundation for an organization's ability to harness data-related or data-dependent capability. It also represents the umbrella for all derived domain-specific strategies, such as Master Data Management, Business Intelligence, Big Data and so forth.
The Enterprise Data Strategy is:
  • Actionable
  • Relevant (i.e. contextual to the organization, not generic)
  • Evolutionary (i.e. it is expected to change on a regular basis)
  • Connected / Integrated - with everything that comes after it or from it
This definition helps to understand what Data Strategy is; so now we need to understand why most organizations need one. Here are a few of the reasons why…
  1. Without a centralized vision and foundation, different parts of the enterprise will view data-related capabilities differently. This inevitably leads to duplication of both data and data systems across the organization and thus makes it quite difficult to determine the 'truth' of one's data and will also drive up costs.
  2. The Data Strategy provides the basis for all enterprise planning efforts connected to data-related capability.
  3. The Data Strategy is the tool that allows for unification of Business and IT expectations for all enterprise data-related capabilities. The more detailed and comprehensive it is, the better the chance that both sides will fully understand each other.
  4. There is no better place to define the metrics or service level expectations that should apply across the enterprise.
  5. This is the best place to explain thoroughly how management of enterprise data can be leveraged to support organizational mission objectives or processes.
A Typical Enterprise Data Strategy includes the following components:
  • A definition of what types of data or information need to be managed from an enterprise perspective (and yes this ought to be fairly specific).
  • A determination in regards to roles (organizational) in terms of who owns what data or data systems.
  • A mission statement in relation to exploitation of data assets. So, we've taken for granted here that these are enterprise assets - what's important is understanding how they ought to be used.
  • Initial or top level expectations for enterprise-wide service level metrics (for data systems and data quality).
  • Introductory versions of all domain-level or specific sub-strategies, such as Information/Data Governance, EDW, MDM, Content Management, Big Data etc.
  • Top level planning decisions or expectations for making those designs.
  • Identification of key enterprise challenges and anticipated design decisions.
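As one illustration of the service level metrics mentioned in the list above, here is a small Python sketch of a data quality expectation. The `completeness` function, the target values and the sample rows are all hypothetical; a real strategy would define its own metrics and thresholds.

```python
def completeness(rows, column):
    """Fraction of rows where the column has a non-empty value."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


# Hypothetical enterprise-wide expectations set out in the strategy.
TARGETS = {"email": 0.90, "postal_code": 0.75}

# Hypothetical sample of customer records.
customers = [
    {"email": "a@example.com", "postal_code": "43004"},
    {"email": "", "postal_code": "43004"},
    {"email": "c@example.com", "postal_code": "43004"},
    {"email": "d@example.com", "postal_code": None},
]

for column, target in TARGETS.items():
    score = completeness(customers, column)
    status = "meets" if score >= target else "misses"
    print(f"{column}: {score:.2f} ({status} target {target:.2f})")
```

Stating the expectation as a number is what makes it actionable: both Business and IT can see the same score against the same target.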
Now we're ready to address why Data Architects are typically involved in creating and executing Enterprise Data Strategy. Data Architects are specialists within the larger field of IT Architecture. While some have wider architecture experience, others do nothing but work with data and data systems. Data Architects make good candidates for helping to craft Enterprise Data Strategy because they are typically charged with defining all existing and future data related systems capability. Architects often also have a good deal of experience working directly with business stakeholders and thus help to ensure both business and IT perspectives are taken into consideration while crafting the Data Strategy.
There are in fact few other roles qualified to lead this type of an effort. While CTOs, CIOs or CDOs (Chief Data Officers) might qualify to lead such a task, often times they are stretched too thin to focus on building the comprehensive Strategies necessary to make a real difference for the organization. The Data Architect can typically dedicate their full attention to this task and have the full support of all necessary resources (including the CXO level personnel) to ensure that the necessary analysis, negotiation and planning goes into the Data Strategy so it can be relevant and ultimately successful.

Copyright 2014, Stephen Lahanas

Monday, September 1, 2014

Understanding Master Data Management

I was speaking with someone before on the topic of Master Data Management (MDM) - it occurred to me not too long after we began the conversation that we more than likely had differing perspectives as to what Master Data Management really meant. That's inspired me to write this post to talk a little bit about how better to understand MDM.

MDM is one of the key focal or practice areas for Data Architects today, although the overall success rate for Master Data Management is not as high as it should be given the number of years it has been an industry standard solution. There are many reasons for this and we will address those in greater depth in future posts.

MDM, The Core Concept
Let's start with what Master Data is not. Master Data is not:
  1. Meta-data, which is a description of data (or "data about data", as it's commonly called).
  2. Ontology, Taxonomy or Vocabulary - Master data can be derived from these but is not in itself a formal semantic construct.
  3. Software Tool - ultimately, Master Data is technology-agnostic; it is a logical construct which can be defined through various modeling tools and realized through a variety of data management software solutions. At the point where Master Data becomes tightly coupled with any one software tool or any one modeling technique it will likely lose a great deal of its potential value to the enterprise.
So, then what is it? How would we characterize what can become Master Data and what cannot?
  1. It may be considered "data of record" or an authoritative data source, but this is not always the case. Data of record implies that there is a system of record with sanctioned data elements that are not meant to be repeated throughout the enterprise across other systems. Or this might refer to data entities which are determined to be unique and authoritative across the enterprise regardless of their current use (in a system).
  2. Master data is reference data, sort of, if we consider that reference data is a definitive set of element definitions or entities associated with any particular business domain, sub-domain or problem space. In this capacity, Master Data may serve multiple roles, including discovery, registry or repository access, and data dictionary foundation.
  3. Benchmark - this is a critical consideration; any data entities defined as Master Data elements within an enterprise are unlikely to remain unchanged or unmodified. Eventually there will be variations of Master Data sets, these variations must be tracked back to their source and there must also be a mechanism whereby others in the enterprise can understand where, why and how those modifications occurred 'atop' the core sets of Master Data. Thus the Master Data is a baseline or benchmark wherein the data chain of custody can be managed or tracked. 
  4. It can be a canonical data model or data exchange model - this is important in cases where the core data architecture has not yet been designed or deployed, or in cases where it is anticipated that there will be a major or radical transformation of the existing architecture to a new one. The model can contain Master Data elements or sets within it.
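The "benchmark" idea in the list above (a baseline whose variations can be traced back to who changed them and why) can be sketched as follows. This is a minimal Python illustration under my own assumptions: the `MasterRecord` and `Change` classes and the sample values are hypothetical and do not correspond to any particular MDM product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """One recorded modification layered on top of the master baseline."""
    attribute: str
    new_value: str
    changed_by: str   # who made the change
    reason: str       # why it was made


@dataclass
class MasterRecord:
    """A baseline entity plus the ordered history of variations applied to it."""
    entity_id: str
    baseline: dict
    history: list = field(default_factory=list)

    def apply(self, change: Change) -> None:
        # The baseline is never overwritten; variations are only appended.
        self.history.append(change)

    def current(self) -> dict:
        """Replay the history over the baseline to get the present view."""
        view = dict(self.baseline)
        for change in self.history:
            view[change.attribute] = change.new_value
        return view

    def custody(self, attribute: str) -> list:
        """Return who changed an attribute and why, oldest first."""
        return [(c.changed_by, c.reason) for c in self.history if c.attribute == attribute]


customer = MasterRecord("C-001", {"name": "Acme Corp", "country": "US"})
customer.apply(Change("name", "Acme Corporation", "billing-team", "legal name update"))

print(customer.baseline["name"])
print(customer.current()["name"])
print(customer.custody("name"))
```

Keeping the baseline immutable and appending variations is only one possible design, but it makes the chain of custody a by-product of normal use rather than a separate audit exercise.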
MDM in Today's Implementations
Much of what we refer to now as MDM solutions has been born out of previous product solutions that were described as meta-data management solutions. For many of the MDM solutions on the market, the "repository or registry" architectural construct / pattern is how this capability is harnessed. Another architectural approach related to MDM might be referred to as the middleware design - this extends MDM into data transport and is focused on supporting accurate message translation. And of course there are solutions that combine both aspects.

One of the most important aspects of MDM is identifying what constitutes Master and Reference Data

Perhaps we can consider that there are at least two philosophical approaches to MDM, plus a hybrid that combines them:
  • Passive MDM - This is most closely aligned to the original Meta-data management solutions with a central repository to support discovery and high level data reconciliation.
  • Active MDM - This is most closely aligned with solutions stemming from EAI, Middleware, ETL based solutions where data reconciliation rules are being applied at multiple levels and in more detail.
  • Hybrid MDM - Both of the approaches above are relatively weak in dealing with bi-directional reconciliation focused heavily on transactional systems (it is much easier to reconcile historical data from multiple sources than real-time data from multiple sources). Hybrid MDM applies both previous techniques and new ones to tackle the most problematic use cases.
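The difference between the passive and active approaches above can be sketched in Python. This is only an illustration under stated assumptions: the registry contents, the `discover` and `reconcile` functions, and the "prefer the highest-priority source" rule are all hypothetical, and real products apply far richer rules.

```python
# Passive style: a central registry that only answers "where does this entity live?"
registry = {
    "customer": ["crm", "billing"],
    "product": ["erp"],
}


def discover(entity):
    """Return the systems known to hold an entity; no data is moved or changed."""
    return registry.get(entity, [])


# Active style: reconciliation rules applied to the records themselves.
# Hypothetical rule: for each attribute, prefer the value from the highest-priority system.
SOURCE_PRIORITY = ["crm", "billing"]


def reconcile(records_by_system):
    """Merge per-system records into one view using the source priority order."""
    merged = {}
    # Walk from lowest to highest priority so that higher-priority values overwrite.
    for system in reversed(SOURCE_PRIORITY):
        for attribute, value in records_by_system.get(system, {}).items():
            if value is not None:
                merged[attribute] = value
    return merged


records = {
    "crm": {"name": "Acme Corporation", "phone": None},
    "billing": {"name": "ACME CORP", "phone": "555-0100"},
}

print(discover("customer"))
print(reconcile(records))
```

A hybrid solution would use the registry to find the candidate sources and then apply reconciliation rules to them, which is where the harder real-time, bi-directional cases come in.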
It is clear to anyone who has worked with database development and data-system integration that having the ability to reconcile data sources adds tremendous value to the enterprise, helping to improve performance, integrity and overall efficiency. Being able to automate some of this using COTS tools is even more appealing. However, there is still a set of processes which ultimately takes precedence here if one is to deploy a successful MDM solution. The enterprise data governance approach must be defined first, the data and business environment must be modeled, and if ownership of data sets is to be handed off to user groups (either fully or partially) the impact to both governance and model maintenance must be considered and mitigated in advance.

As we've discovered with nearly every IT technology and product over the past 40 years - implementation without process or architectural considerations leads to many issues, often more issues than existed before the technology was introduced. MDM is no exception. The most important thing to consider here is that deployment of MDM software can significantly impact or influence both solution performance and integrity, but proceeding without working through the implicit architectural / enterprise issues is risky.



Copyright 2014,  Stephen Lahanas






#ITarchitectureJournal
#StephenLahanas
#Semantech-Inc