Posts

Showing posts from November, 2018

AGILE-IoT: More Than Just Another IoT Project

The AGILE-IoT project (www.agile-iot.eu), co-funded by the Horizon 2020 programme of the European Union, aims to address this concern by providing a solution based on four main pillars. Agnosticity: depending on the technical background of platform users, of their organisation, or of the software components they have to (re-)use, we cannot predict the programming language of the solution. It might be built with a combination of languages, some compiled (e.g., C, C++), others translated into intermediate languages (e.g., Java, Python), and still others interpreted (e.g., JavaScript). If users choose a platform because of the programming language(s) it supports, they may limit their options for developing their solution. AGILE-IoT, by leveraging a micro-service-based architecture, supports all the programming languages a platform user might require to implement their solution. Openness: Lots of platforms are provided by

IEEE IoT - Nine IoT Predictions for 2019

By 2020, the Internet of Things (IoT) is predicted to generate an additional $344B in revenues, as well as to drive $177B in cost reductions. IoT and smart devices are already improving performance metrics at major US-based factories. They are in the hands of employees, covering routine management issues and boosting their productivity by 40-60% [1]. The following list of predictions (Figure 1) explores the state of IoT in 2019, covering IoT's impact on many aspects of business and technology, including digital transformation, blockchain, AI, and 5G. Read full article >>>

Getting started with Apache Airflow

In this post, I am going to discuss Apache Airflow, a workflow management system developed by Airbnb. Earlier I had discussed writing basic ETL pipelines in Bonobo. Bonobo is cool for writing ETL pipelines, but the world is not all about writing ETL pipelines to automate things. There are other use cases in which you have to perform tasks in a certain order, once or periodically. For instance: monitoring cron jobs; transferring data from one place to another; automating your DevOps operations; periodically fetching data from websites and updating the database for your awesome price comparison system; data processing for recommendation-based systems; machine learning pipelines. The possibilities are endless. Before we move on to implement Airflow in our systems, let's discuss what Airflow actually is and its terminology. What is Airflow? From the website: Airflow is a platform to programmatically author, schedule and monitor workflows. Us
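The core abstraction behind Airflow is the DAG: a set of tasks plus the dependencies between them, executed so that every task runs only after its upstream tasks. As a rough, framework-free sketch of that idea (plain Python, not the Airflow API; the task names are made up for illustration):

```python
# Minimal sketch of the DAG idea behind Airflow: tasks with dependencies,
# executed in topological order. Plain Python, not the Airflow API.
# Assumes the dependency graph is acyclic (Airflow enforces this).

def topological_order(deps):
    """deps maps each task name to the set of tasks it depends on."""
    order, done = [], set()

    def visit(task):
        if task in done:
            return
        for upstream in deps.get(task, ()):  # run dependencies first
            visit(upstream)
        done.add(task)
        order.append(task)

    for task in deps:
        visit(task)
    return order

# Hypothetical ETL-style pipeline: extract -> transform -> load,
# plus a report step that needs the loaded data.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

print(topological_order(pipeline))  # each task appears after its dependencies
```

In Airflow proper, the same shape is declared with operators and dependency arrows (`t1 >> t2` or `set_upstream`), and the scheduler then takes care of ordering, retries, and backfills.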

Understanding How Apache Pulsar Works

I will be writing a series of blog posts about Apache Pulsar, including some Kafka vs Pulsar posts. First up, though, I will be running some chaos tests on a Pulsar cluster, as I have done with RabbitMQ and Kafka, to see what failure modes it has and its message loss scenarios. I will try to do this by exploiting either design defects, implementation bugs, or poor configuration on the part of the admin or developer. In this post we'll go through the Apache Pulsar design so that we can better design the failure scenarios. This post is not for people who want to understand how to use Apache Pulsar, but for those who want to understand how it works. I have tried to write a clear overview of its architecture in a way that is simple and easy to understand. I appreciate any feedback on this write-up. Claims The main claims that I am interested in are: guarantees of no message loss (if recommended configuration applied and your whole data center doesn't burn to the ground) strong

Forrester Wave Cloud Data Warehouse, Q4 2018

Evaluated Vendors And Inclusion Criteria Forrester included 14 vendors in the assessment: Alibaba, AWS, Exasol, Google, Hortonworks, Huawei, IBM, MarkLogic, Micro Focus, Microsoft, Oracle, Pivotal, Snowflake, and Teradata. Each of these vendors has (see Figure 1): A comprehensive CDW offering. Key components of the CDW include the provisioning, storing, processing, transforming, and accessing of data. The CDW should provide features to secure data, enable elastic scale, provide high availability and disaster recovery options, support loading and unloading of data, and provide various data access tools. A standalone data warehouse service running in the public cloud. Vendors included in this evaluation provide a CDW service that organizations can implement or use independent of analytics, data science, and visualization tools. The service should not be technologically tied to or bundled with any particular application or solution. Data warehouse use cases. The CDW service shoul

Facebook Marketplace powered by artificial intelligence

Facebook Marketplace was introduced in 2016 as a place for people to buy and sell items within their local communities. Today in the U.S., more than one in three people on Facebook use Marketplace, buying and selling products in categories ranging from cars to shoes to dining tables. Managing the posting and selling of that volume of products with speed and relevance is a daunting task, and the fastest, most scalable way to handle that is to incorporate custom AI solutions. On Marketplace’s second anniversary, we are sharing how we use AI to power it. Whether someone is discovering an item to buy, listing a product to sell, or communicating with a buyer or seller, AI is behind the scenes making the experience better. In addition to the product index and content retrieval systems, which leverage our AI-based computer vision and natural language processing (NLP) platforms, we recently launched some new features that make the process simpler for both buyers and sellers. Multimodal