Posts

Showing posts with the label DWH

Top 9 Data Modeling Tools & Software 2021

Data modeling is the process of creating a visual representation of an entire information system, or portions of it, to convey the connections between data points and structures. The objective is to portray the types of data used and stored within the system, the ways the data can be organized and grouped, the relationships among these data types, and their attributes and formats. Data modeling uses abstraction to better understand and represent the flow of data within an enterprise-level information system.

The types of data models include conceptual, logical, and physical data models. Database and information system design begins with the creation of these data models.

What is a Data Modeling Tool?

A data modeling tool enables quick and efficient database design while minimizing human error. Data modeling software helps craft a high-performance database, generate reports that can be useful for stakeholders and create data de...

The Future of Data Engineering

Data engineering’s job is to help an organization move and process data. This generally requires two different systems, broadly speaking: a data pipeline and a data warehouse. The data pipeline is responsible for moving the data, and the data warehouse is responsible for processing it.

I acknowledge that this is a bit overly simplistic. You can do processing in the pipeline itself by doing transformations between extraction and loading with batch and stream processing. The “data warehouse” now includes many storage and processing systems (Flink, Spark, Presto, Hive, BigQuery, Redshift, etc.), as well as auxiliary systems such as data catalogs, job schedulers, and so on. Still, I believe the paradigm holds.

The industry is working through changes in how these systems are built and managed. There are four areas, in particular, where I expect to see shifts over the next few years.

Timeliness: From batch to realtime
Connectivity: From one:one bespoke integrations to many:many
Cen...

Forrester Wave Cloud Data Warehouse, Q4 2018

Evaluated Vendors And Inclusion Criteria

Forrester included 14 vendors in the assessment: Alibaba, AWS, Exasol, Google, Hortonworks, Huawei, IBM, MarkLogic, Micro Focus, Microsoft, Oracle, Pivotal, Snowflake, and Teradata. Each of these vendors has (see Figure 1):

A comprehensive CDW offering. Key components of the CDW include the provisioning, storing, processing, transforming, and accessing of data. The CDW should provide features to secure data, enable elastic scale, provide high availability and disaster recovery options, support loading and unloading of data, and provide various data access tools.

A standalone data warehouse service running in the public cloud. Vendors included in this evaluation provide a CDW service that organizations can implement or use independent of analytics, data science, and visualization tools. The service should not be technologically tied to or bundled with any particular application or solution.

Data warehouse use cases. The CDW service shoul...