Why Apache Beam? A Google Perspective
When we made the decision (in partnership with data Artisans, Cloudera, Talend, and a few other companies) to move the Google Cloud Dataflow SDK and runners into the Apache Beam
incubator project, we did so with the following goal in mind: provide
the world with an easy-to-use, but powerful model for data-parallel
processing, both streaming and batch, portable across a variety of
runtime platforms. Now that the dust on the initial code drops is
starting to settle, we wanted to talk briefly about why this makes sense
for us at Google and how we got here, given that Google hasn’t
historically been directly involved in the OSS world of data processing.
Details: https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
Why does this make sense for Google?
Google is a business, and as such, it should come as no surprise there’s a business motivation for us behind the Apache Beam move. That motivation hinges primarily on the desire to get as many Apache Beam pipelines as possible running on Cloud Dataflow. Given that, it may not seem intuitive to adopt a strategy of opening the platform up to other runners. However, it’s quite the contrary. Opening up the platform yields many benefits:
- The more runners Apache Beam supports, the more attractive it becomes as a platform
- The more users adopt Apache Beam, the more of them might want to run it on Google Cloud Platform
- The more folks we get involved in developing Apache Beam, the more we can push forward the state of the art in data processing