Posts

Showing posts with the label scikitlearn

Turbocharging Analytics at Uber with Data Science Workbench

Millions of Uber trips take place each day across nearly 80 countries, generating information on traffic, preferred routes, estimated times of arrival/delivery, drop-off locations, and more that enables us to facilitate better experiences for users. To make our data exploration and analysis more streamlined and efficient, we built Uber’s data science workbench (DSW), an all-in-one toolbox for interactive analytics and machine learning that leverages aggregate data. DSW centralizes everything a data scientist needs to perform data exploration, data preparation, ad-hoc analyses, model exploration, workflow scheduling, dashboarding, and collaboration in a single-pane, web-based graphical user interface (GUI). Leveraged by data science, engineering, and operations teams across the company, DSW has quickly scaled to become Uber’s go-to data analytics solution. Current DSW use cases include pricing, safety, fraud detection, and navigation, among other foundational elements of the trip experi...

Best Machine Learning Tools

The best-trained soldiers can’t fulfill their mission empty-handed. Data scientists have their own weapons: machine learning (ML) software. There is already a cornucopia of articles listing reliable machine learning tools with in-depth descriptions of their functionality. Our goal, however, was to get feedback from industry experts. That’s why we interviewed data science practitioners (gurus, really) about the tools they choose for their projects. The specialists we contacted have various fields of expertise and work at companies such as Facebook and Samsung. Some of them represent AI startups (Objection Co, NEAR.AI, and Respeecher); some teach at universities (Kharkiv National University of Radioelectronics). The AltexSoft data science team joined the discussion, too. If you’re looking for a particular type of tool, just skip to your sector of interest:
    Languages used in machine learning
    Data analytics an...

Machine Learning algorithms and libraries overview

A nice, brief overview of some machine learning algorithms that highlights their strengths and weaknesses. It covers the "big 3" machine learning tasks, which are by far the most common ones:
    Regression
    Classification
    Clustering
Details: https://elitedatascience.com/machine-learning-algorithms

Here are also some observations on the top five characteristics of ML libraries that developers should consider when deciding which library to use:

Programming paradigm
    Symbolic: Spark MLlib, MMLSpark, BigDL, CNTK, H2O.ai, Keras, Caffe2
    Imperative: scikit-learn, auto-sklearn, TPOT, PyTorch
    Hybrid: MXNet, TensorFlow

Machine learning algorithms
    Supervised and unsupervised: Spark MLlib, scikit-learn, H2O.ai, MMLSpark, Mahout
    Deep learning: TensorFlow, PyTorch, Caffe2 (image), Keras, MXNet, CNTK, BigDL, MMLSpark (image and text), H2O.ai (via the deepwater plugin)
    Recommendation system: Spark MLlib, H2O.ai (via the sparkling-water plugin), Mah...
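
The symbolic versus imperative distinction in the programming-paradigm list can be illustrated with a small sketch. This is a toy in plain Python, not the API of any library named above: the names `imperative_affine`, `Var`, `Add`, `Mul`, `evaluate`, and `graph` are all hypothetical and exist only for illustration. In the imperative style each operation runs as soon as it is written; in the symbolic style the computation is first described as a graph and only evaluated later, once values are bound to its placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Union


# Imperative style: each statement runs immediately on concrete values.
def imperative_affine(x: float, w: float, b: float) -> float:
    product = w * x       # evaluated right away
    return product + b    # evaluated right away


# Symbolic style: describe the computation as a graph of nodes first,
# then evaluate the graph later by binding values to the placeholders.
@dataclass(frozen=True)
class Var:
    name: str  # placeholder with no value yet


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Add, Mul]


def evaluate(expr: Expr, env: Dict[str, float]) -> float:
    """Walk the graph recursively and compute its value for the given bindings."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Mul):
        return evaluate(expr.left, env) * evaluate(expr.right, env)
    raise TypeError(f"unsupported node: {expr!r}")


# Building the graph performs no arithmetic; it only records w * x + b.
graph = Add(Mul(Var("w"), Var("x")), Var("b"))

if __name__ == "__main__":
    print(imperative_affine(2.0, 3.0, 1.0))
    print(evaluate(graph, {"w": 3.0, "x": 2.0, "b": 1.0}))
```

Because the symbolic graph exists as data before any numbers are supplied, a library can inspect or optimize it ahead of execution, whereas the imperative version is easier to step through and debug line by line. Hybrid libraries offer both modes.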