Data complexity is a ubiquitous problem. The sprawl of data stores across on-premises data centres and multiple clouds continues to grow as thousands of new data sources emerge and expand, increasing data complexity daily. The complexity extends to analytical systems: the days of a simple data warehouse and a few data marts for query and reporting are long gone. Today, multiple data warehouses still exist, but new analytical platforms optimized for different analytical workloads continue to emerge, including streaming data platforms, Hadoop, cloud storage with Spark as a service, graph databases and even edge devices. The number of data structures and files within individual data stores is also rising. From a data access perspective, this complexity forces the average business analyst to connect to multiple data stores to reach the data they need. The question is: can this complexity be simplified? The answer, of course, is yes.
One method of doing this is to create an enterprise data marketplace, which acts as an internal ‘shop window’ for ready-made data and analytical assets that people can pick up and use in ‘last-mile’ processing to deliver value. The way in which this can be achieved is by:
- Establishing a centralised (e.g., cloud storage, Hadoop Distributed File System) or logical (an organised set of data stores) data lake within which to ingest raw data from internal and external data sources
- Creating an information supply chain whereby information producers can work together in project teams to:
- Curate this data to produce trusted data products (assets) that can be stored in one or more trusted data stores
- Produce trusted analytics to process, score and cluster data
- Publishing trusted, curated data and analytical assets in an information catalogue which acts as an enterprise data marketplace.
This is shown in Figure 1 below:
An information supply chain is a process for curating data products. From an organisational perspective, information producers curate ‘business ready’ data products, and information consumers shop for that business ready data and use it to deliver value. The aim is to produce trusted, commonly understood, ready-made data products that can be reused in multiple places. The benefit is that information consumers save considerable time because they do not have to prepare all the data they need from scratch; instead, ready-made data products are already fit for consumption. Examples include customer data, product data and order data. Having data ‘ready to go’ therefore shortens the time to value and reduces costs.
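As a concrete sketch of the marketplace idea, a published data product might carry catalogue metadata along these lines. All names here are illustrative assumptions, not taken from any specific catalogue product:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A curated, 'business ready' data asset published in the information catalogue."""
    name: str          # business-friendly name, e.g. "Customer"
    description: str   # what the asset contains and how it was curated
    owner: str         # the information-producer team accountable for it
    location: str      # the trusted data store holding the asset
    tags: list = field(default_factory=list)  # search terms for consumers 'shopping' for data

# Information producers publish entries; information consumers search by business terms.
catalogue = [
    DataProduct(
        name="Customer",
        description="Deduplicated, standardised customer master data",
        owner="customer-data-team",
        location="trusted-zone/customer",
        tags=["customer", "master data"],
    )
]

hits = [p.name for p in catalogue if "customer" in p.tags]
print(hits)  # ['Customer']
```

The point of the sketch is that reuse hinges on discoverability: a consumer finds the ready-made asset by business terms rather than rebuilding it from raw data.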
However, we can go further. What if you could simplify access to those trusted data assets in one or more data stores by using a logical data connector to create reusable, high-value business views of that trusted data? One way to do this is with a solution like the Magnitude Gateway platform. Its Intelligent Adapters materialize business views on top of data sources, which are then accessed via a standards-based universal connector. A business view defines a subset of the data within a trusted data asset using common business data names. Tools and applications can then connect to one or more of these reusable business view assets to access the ready-made trusted data within them, gaining rapid connectivity to relevant trusted data without needing to know what structures the data is stored in, or to wade through data that is irrelevant to a particular business need. A number of business views can be made accessible via a single universal connector. Business views help business users find the data they need because they immediately see relevant, reusable data subsets described in business terms; the mapping from business view to physical data is hidden by Magnitude Gateway.
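The business-view idea can be illustrated with an ordinary SQL view, here run against SQLite via a standard DB-API connection. This is a simplified stand-in for what a gateway product does, not its actual mechanism, and the table and column names are invented:

```python
import sqlite3

# An in-memory database stands in for a trusted data store.
conn = sqlite3.connect(":memory:")

# Physical structure: cryptic column names an analyst should not need to know.
conn.execute(
    "CREATE TABLE cust_mstr_t (c_id INTEGER, c_nm TEXT, c_seg_cd TEXT, c_churn_flg INTEGER)"
)
conn.executemany(
    "INSERT INTO cust_mstr_t VALUES (?, ?, ?, ?)",
    [(1, "Acme Ltd", "ENT", 0), (2, "Globex", "SMB", 1)],
)

# Business view: a reusable subset described in common business data names.
conn.execute("""
    CREATE VIEW customer AS
    SELECT c_id     AS customer_id,
           c_nm     AS customer_name,
           c_seg_cd AS segment
    FROM cust_mstr_t
    WHERE c_churn_flg = 0   -- only active customers matter for this business need
""")

# Consumers query the view; the mapping to physical columns stays hidden.
rows = conn.execute("SELECT customer_name, segment FROM customer").fetchall()
print(rows)  # [('Acme Ltd', 'ENT')]
```

The design point is the same one made above: the consumer sees `customer_name` and `segment`, never `c_nm` or `c_seg_cd`, and irrelevant rows are filtered out before the data is ever exposed.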
The combination of a data lake, an information supply chain, an enterprise data marketplace and a universal connector offering multiple business views therefore looks like a powerful one.
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant, he specializes in business intelligence, analytics, data management and big data. With over 37 years of IT experience, he has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited – the inventors of the Relational Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing Director of DataBase Associates.