Posts

Showing posts with the label AI

Emerging Architectures for Modern Data Infrastructure

As an industry, we’ve gotten exceptionally good at building large, complex software systems. We’re now starting to see the rise of massive, complex systems built around data – where the primary business value of the system comes from the analysis of data rather than from the software directly. We’re seeing rapid impacts of this trend across the industry, including new roles, shifts in customer spending, and new startups providing infrastructure and tooling around data. In fact, many of today’s fastest-growing infrastructure startups build products to manage data. These systems enable data-driven decision making (analytic systems) and drive data-powered products, including those built with machine learning (operational systems). They range from the pipes that carry data, to storage solutions that house data, to SQL engines that analyze data, to dashboards that make data easy to understand – from data science and machine learning libraries, to automated data pipe...

The DataOps Landscape

Data has emerged as an essential foundational asset for all organizations. Data fuels major initiatives such as digital transformation and the adoption of analytics, machine learning, and AI. Organizations that are able to tame, manage, and unlock their data assets stand to benefit in myriad ways, including better decision-making and operational efficiency, better fraud prediction and prevention, better risk management and control, and more. In addition, data products and services can often lead to new or additional revenue. As companies increasingly depend on data to power essential products and services, they are investing in tools and processes to manage those operations. In this post, we describe these tools as well as the community of practitioners using them. One sign of the growing maturity of these tools and practices is that a community of engineers and developers is beginning to coalesce around the term “DataOps” (data operations). Our conver...

The Growing Importance of Metadata Management Systems

As companies embrace digital technologies to transform their operations and products, many are using best-of-breed software, open source tools, and software as a service (SaaS) platforms to rapidly and efficiently integrate new technologies. This often means that data required for reports, analytics, and machine learning (ML) resides on disparate systems and platforms. As such, IT initiatives in companies increasingly involve tools and frameworks for data fusion and integration. Examples include tools for building data pipelines, data quality and data integration solutions, customer data platforms (CDPs), master data management, and data markets. Collecting, unifying, preparing, and managing data from diverse sources and formats has become imperative in this era of rapid digital transformation. Organizations that invest in foundational data technologies are much more likely to build solid foundational applications, ranging from BI and analytics to machine learn...

Data Observability Ushers In A New Era Enabling Golden Age Of Data

Have we entered the Golden Age of Data? Modern enterprises are collecting, producing, and processing more data than ever before. According to a February 2020 IDG survey of data professionals, average corporate data volumes are increasing by 63% per month; 10% of respondents even reported that their data volumes double every month. Large companies are investing heavily to transform themselves into data-driven organizations that can quickly adapt to the fast pace of a modern economy. They gather huge amounts of data from customers and generate reams of data from transactions. They continuously process data in an attempt to personalize customer experiences, optimize business processes, and drive strategic decisions.
The Real Challenge with Data
In theory, breakthrough open-source technologies such as Spark, Kafka, and Druid are supposed to help just about any organization benefit from massive amounts of customer and operational data, just as they benefit Facebook, Apple, Google, Microso...
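To get a feel for what the survey figures above compound to, here is a small arithmetic sketch. The rates and horizon are taken from the excerpt; everything else (function names, the one-year window) is illustrative.

```python
# Compound monthly data-volume growth, illustrating the survey figures
# quoted above (63% per month on average; some respondents doubling).
def growth_factor(monthly_rate: float, months: int) -> float:
    """Multiplicative growth after `months` at a fixed monthly rate."""
    return (1 + monthly_rate) ** months

# 63% per month sustained for a year multiplies volumes by roughly 350x.
print(f"63%/month over 12 months: {growth_factor(0.63, 12):,.0f}x")

# Doubling every month (100% growth) gives 2^12 = 4096x in a year.
print(f"100%/month over 12 months: {growth_factor(1.0, 12):,.0f}x")
```

Even if real growth rates taper off, the exercise shows why storage and processing costs dominate data-platform planning at these rates.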

AIOps Platforms (Gartner)

AIOps is an emerging technology that addresses something I’m a big fan of – improving IT operations. So I asked fellow Gartner analyst Colin Fletcher for a guest blog on the topic… Roughly three years ago, it was looking like many enterprise IT operations leaders would put themselves in the precarious role of “the cobbler’s children” by forgoing investment in Artificial Intelligence (AI) to help them do their work better, faster, and cheaper. We were hearing from many IT ops leaders who were building incredibly sophisticated Big Data and Advanced Analytics systems for business stakeholders, yet were themselves using rudimentary, reactive red/yellow/green lights and manual steps to run the infrastructure required to keep those same systems up and running. Further, we’re all now familiar in our personal lives with dynamic recommendations from online retailers, search providers, virtual personal assistants, and entertainment services. Talk about a paradox! Now I...
The Forbes Cloud 100
Once an outsider category, cloud computing now powers every industry. Look no further than this year’s Forbes Cloud 100 list, the annual ranking of the world’s top private cloud companies, where this year's standouts are keeping businesses surviving—and thriving—from real estate to retail, data to design. Produced for the fifth consecutive year in partnership with Bessemer Venture Partners and Salesforce Ventures, the Cloud 100 recognizes standouts in tech’s hottest category from small startups to private-equity-backed giants, from Silicon Valley to Australia and Hong Kong. The companies on the list are selected for their growth, sales, valuation and culture, as well as a reputation score derived in consultation with 43 CEO judges and executives from their public-cloud-company peers. This year’s new No. 1 has set a record for shortest time running atop the list. Database leader Snowflake takes the top slot, up from No. 2 last year and just hours before graduating from the list by g...

The unreasonable importance of data preparation

We know data preparation requires a ton of work and thought. In this provocative article, Hugo Bowne-Anderson provides a formal rationale for why that work matters, why data preparation is particularly important for reanalyzing data, and why you should stay focused on the question you hope to answer. Along the way, Hugo introduces how tools and automation can help augment analysts and better enable real-time models. In a world focused on buzzword-driven models and algorithms, you’d be forgiven for forgetting about the unreasonable importance of data preparation and quality: your models are only as good as the data you feed them. This is the garbage in, garbage out principle: flawed data going in leads to flawed results, algorithms, and business decisions. If a self-driving car’s decision-making algorithm is trained on traffic data collected only during the day, you wouldn’t put it on the roads at night. To take it a step further, if such an algorithm is trained in an environment with car...
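The "garbage in, garbage out" point above can be made concrete with a minimal, hypothetical cleaning step: validate raw records and discard malformed rows before they reach any model. The field names and sample values are invented for illustration.

```python
# A minimal sketch of pre-model data preparation: coerce types and drop
# records that fail to parse, rather than letting bad rows reach a model.
raw_records = [
    {"speed_kmh": "42", "hour": "14"},   # valid daytime reading
    {"speed_kmh": "n/a", "hour": "9"},   # unparseable measurement
    {"speed_kmh": "88", "hour": None},   # missing timestamp
    {"speed_kmh": "35", "hour": "23"},   # valid nighttime reading
]

def clean(records):
    """Keep only records whose fields parse; coerce strings to numbers."""
    cleaned = []
    for r in records:
        try:
            cleaned.append({"speed_kmh": float(r["speed_kmh"]),
                            "hour": int(r["hour"])})
        except (TypeError, ValueError):
            continue  # discard the malformed row instead of guessing a value
    return cleaned

print(clean(raw_records))  # only the two well-formed rows survive
```

Real pipelines would log or quarantine the rejected rows and check distributional properties (such as day-versus-night coverage in the excerpt's example), not just parseability.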

Technical Guide to Ocean Compute-to-Data

With the v2 Compute-to-Data release, Ocean Protocol provides a means to exchange data while preserving privacy. This guide explains Compute-to-Data without requiring deep technical know-how. Private data is data that people or organizations keep to themselves. It can mean any personal, personally identifiable, medical, lifestyle, financial, sensitive or regulated information.
Benefits of Private Data
Private data can help research, leading to life-altering innovations in science and technology. For example, more data improves the predictive accuracy of modern Artificial Intelligence (AI) models. Private data is often considered the most valuable data because it’s so hard to get at, and using it can lead to potentially big payoffs.
Risks of Private Data
Sharing or selling private data comes with risk. What if you don’t get hired because of your private medical history? What if you are persecuted for private lifestyle choices? Large organizations that have massive datasets know their d...

Researchers love PyTorch and TensorFlow

In a recent survey—AI Adoption in the Enterprise, which drew more than 1,300 respondents—we found significant usage of several machine learning (ML) libraries and frameworks. About half indicated they used TensorFlow or scikit-learn, and a third reported they were using PyTorch or Keras. I recently attended an interesting RISELab presentation delivered by Caroline Lemieux describing recent work on AutoPandas and automation tools that rely on program synthesis. In the course of her presentation, Lemieux reviewed usage statistics the team had gathered on different deep learning frameworks and data science libraries. She kindly shared some of that data with me, which I used to draw this chart: The numbers are based on simple full-text searches of papers posted on the popular e-print service arXiv.org. Specifically, they reflect the number of papers that mention (in a full-text search) each of the frameworks. Using this metric, the two most popular deep learning frameworks among resear...

Gartner Hype Cycle for Emerging Technologies, 2019

The Gartner Hype Cycle highlights the 29 emerging technologies CIOs should experiment with over the next year. Today, companies detect insurance fraud using a combination of claim analysis, computer programs and private investigators. The FBI estimates the total cost of non-healthcare-related insurance fraud to be around $40 billion per year. But a maturing emerging technology called emotion artificial intelligence (AI) might make it possible to detect insurance fraud based on audio analysis of the caller.
Some technologies will provide “superhuman capabilities”
In addition to catching fraud, this technology can improve customer experience by tracking happiness, more accurately directing callers, enabling better diagnostics for dementia, detecting distracted drivers, and even adapting education to a student’s current emotional state. Though still relatively new, emotion AI is one of 21 new technologies added to the Gartner Hype Cycle for Emerging Technologies, 2019. Original article ...

Top 10 AI Jobs, Salaries and Cities 2019

We discovered that machine learning engineer job postings had the highest percentage of AI and machine learning keywords this year (as they did in 2018). Machine learning engineers develop devices and software that use predictive technology, such as Apple’s Siri or weather-forecasting apps. They ensure machine learning algorithms have the data they need to process, and they analyze huge amounts of real-time data to make machine learning models more accurate. While machine learning engineer jobs still have the largest number of postings containing the relevant keywords, they made up a greater percentage of those postings in 2018 (94.2%, versus 75% in 2019). Many of the jobs requiring AI skills in 2019’s top 10 were nowhere to be found on 2018’s list — such as deep learning engineer, appearing for the first time in second place. Deep learning engineers develop programming systems that mimic brain functions, among other tasks. These engineers are key players in three rapidly growing fie...