Gartner 2017 Market Guide for Data Preparation

Data preparation, the most time-consuming task in analytics and BI, is evolving from a self-service activity to an enterprise imperative. We profile 28 data preparation tools that data and analytics leaders can consider to accelerate agile data preparation for a range of distributed content authors.

Overview

Key Findings

  • The market for data preparation has now evolved from tools supporting only self-service use cases into platforms that enable data and analytics teams to build agile and searchable datasets at an enterprise scale for distributed content authors.
  • Most vendor offerings support data profiling, data exploration, transformation, modeling and curation, and metadata support. More than 80% of the vendors surveyed embed some data cataloging features and offer varying degrees of machine-learning capabilities.
  • The market is crowded with a range of choices, from stand-alone specialists to vendors that embed data preparation as a capability in analytics and BI, data science, or enterprise data integration platforms. Although these tools accelerate the shift toward broadly deployed modern analytics and data science, they can introduce multiple versions of the truth if left unmanaged.

Recommendations

Data and analytics leaders modernizing their data management and analytics strategies should:
  • Develop a deployment strategy for data preparation to enhance user understanding of data, reduce data preparation efforts and increase agility. Evaluate vendors based on capabilities, integration points, pricing and roadmaps.
  • Create a formal process for vetting and reusing models developed by business users, for operationalizing data preparation flows, and for incorporating them into the enterprise data integration workflow, as warranted. Recognize that, while data preparation tools can be used for an increasing number of new data integration use cases, they do not yet replace the need for enterprise data integration solutions for all requirements.
  • Investigate your data preparation vendors' roadmaps for current or planned support of extended data preparation capabilities that improve the interactive experience, facilitate timely insights and enhance enterprise readiness. Examples include the inclusion of data science libraries, more intuitive data preparation workflows, improved governance, collaboration, machine learning and cataloging.

Strategic Planning Assumptions

By 2020, data preparation tools will be used in more than 50% of new data integration efforts for analytics.
By 2023, machine-learning-augmented master data management (MDM), data quality, data preparation and data catalogs will converge into a single modern enterprise information management (EIM) platform used for the majority of new analytics projects.
By 2019, data and analytics organizations that provide agile, curated internal and external datasets for a range of content authors will realize twice the business benefits of those that do not.
Details: full report provided by Tamr.
