Posts

Showing posts with the label Docker

Turbocharging Analytics at Uber with Data Science Workbench

Millions of Uber trips take place each day across nearly 80 countries, generating information on traffic, preferred routes, estimated times of arrival/delivery, drop-off locations, and more that enables us to facilitate better experiences for users. To make our data exploration and analysis more streamlined and efficient, we built Uber’s data science workbench (DSW), an all-in-one toolbox for interactive analytics and machine learning that leverages aggregate data. DSW centralizes everything a data scientist needs to perform data exploration, data preparation, ad-hoc analyses, model exploration, workflow scheduling, dashboarding, and collaboration in a single-pane, web-based graphical user interface (GUI). Leveraged by data science, engineering, and operations teams across the company, DSW has quickly scaled to become Uber’s go-to data analytics solution. Current DSW use cases include pricing, safety, fraud detection, and navigation, among other foundational elements of the trip experi...

What Open Source Software Do You Use?

To gather insights on the current and future state of open source software (OSS), we talked to 31 executives. This is nearly double the number we speak to for a research guide, and we believe it reiterates the popularity of, acceptance of, and demand for OSS. We began by asking, "What open source software do you use?" As you would expect, most respondents are using several versions of open source software. Here's what they told us: Apache: Apache Cassandra, Elassandra (Elasticsearch + Cassandra), Spark, and Kafka (as the core tech we provide through our managed service) are the big ones for us. We find that the governance arrangements and independence of the Apache Foundation make a great foundation for strong open source projects. 95% of what we do with big data is open source. We use Apache Hadoop and contribute back to grow skills and expertise. We use so much that it would be impossible to list. The core of our software is based on Apache So...