Posts

Showing posts from August, 2021

Announcing Databricks Serverless SQL

Databricks SQL already provides a first-class user experience for BI and SQL directly on the data lake, and today, we are excited to announce another step in making data and AI simple with Databricks Serverless SQL. This new capability for Databricks SQL provides instant compute to users for their BI and SQL workloads, with minimal management required and capacity optimizations that can lower overall cost by an average of 40%. This makes it even easier for organizations to expand adoption of the lakehouse for business analysts who are looking to access the rich, real-time datasets of the lakehouse with a simple and performant solution. Under the hood of this capability is an active server fleet, fully managed by Databricks, that can transfer compute capacity to user queries, typically in about 15 seconds. The best part? You only pay for Serverless SQL when users start running reports or queries. Organizations with business analysts who want to analyze data in the data lake with t...

2021 Gartner Magic Quadrant for Data Integration Tools

Strategic Planning Assumptions: Through 2022, manual data management tasks will be reduced by 45% through the addition of machine learning and automated service-level management. By 2023, AI-enabled automation in data management and integration will reduce the need for IT specialists by 20%.

Cost-Efficient Open Source Big Data Platform at Uber

In this blog post, we shared efforts and ideas in improving the platform efficiency of Uber’s Big Data Platform, including file format improvements, HDFS erasure coding, YARN scheduling policy improvements, load balancing, query engines, and Apache Hudi. These improvements have resulted in significant savings. In addition, we explored some open challenges like analytics and online colocation, and pricing mechanisms. However, as the framework outlined in our previous post established, platform efficiency improvements alone do not guarantee efficient operation. Controlling the supply and the demand of data is equally important, which we will address in an upcoming post. As Uber’s business has expanded, the underlying pool of data that powers it has grown exponentially, and thus become ever more expensive to process. When Big Data rose to become one of our largest operational expenses, we began an initiative to reduce costs on our data platform, which divides challe...

30 ways to leave your data center: key migration guides, in one place

One of the challenges with cloud migration is that you’re solving a puzzle with multiple pieces. In addition to a number of workloads you could migrate, you’re also solving for challenges you’re facing, the use cases driving you to migrate, and the benefits you’re looking to gain. Each organization’s puzzle will likely get solved in its own unique way, but thankfully there is plenty of guidance on how you can migrate common workloads in successful ways. In addition to working directly with our Rapid Assessment and Migration Program (RAMP), we also offer a plethora of self-service guides to help you succeed! Some of these guides, which we’ll cover below, are designed to help you identify the best ways to migrate, which include meeting common organizational goals like minimizing time and risk during your migration, identifying the most enterprise-grade infrastructure for your workloads, picking a cloud that aligns with your organization’s sustainability goals...