Apache Superset in the Production Environment

Visualizing data builds a deeper understanding of it and speeds up analysis. There are several mature paid products on the market. Recently, I explored an open source product named Apache Superset, which I found to be a very promising entry in this space. Some prominent features of Superset are:

A rich set of data visualizations.

An easy-to-use interface for exploring and visualizing data.

The ability to create and share dashboards.



After reading about Superset, I wanted to try it. Superset is a Python-based project, so it can be installed easily with pip, but I decided to set it up as a Docker container instead. The Apache Superset GitHub repository contains code for building and running Superset as a container. Since I wanted to run Superset in a fully distributed manner while changing as little of that code as possible, I limited myself to a few targeted modifications so that it could run in multiple modes.

Below is a list of the specific changes and enhancements made to the code.

Different versions of the Superset image can be built from the same code.

Superset configurations can be easily edited and mounted into the container, with no need to rebuild the image.

We can use asynchronous query execution through Celery-based executors and manage the workers through the Flower UI.
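
To make the last two points concrete, here is a minimal sketch of what a mounted superset_config.py might contain for asynchronous queries. It is an illustration under stated assumptions, not the exact file from my setup: the Redis broker, the REDIS_HOST and REDIS_PORT environment variables, the default host name "redis", and the helper functions redis_url and make_celery_config are all hypothetical. The CELERY_CONFIG setting and the lowercase attribute names follow recent Superset documentation; older releases used different names, so check the version you build.

```python
import os


def redis_url(env, db):
    """Build a Redis URL from environment-style settings (names are assumptions)."""
    host = env.get("REDIS_HOST", "redis")  # hypothetical container host name
    port = env.get("REDIS_PORT", "6379")
    return f"redis://{host}:{port}/{db}"


def make_celery_config(env):
    """Return a Celery settings class for Superset's async query workers."""

    class CeleryConfig:
        # Queue that the Superset web app and the Celery workers share.
        broker_url = redis_url(env, 0)
        # Module holding Superset's SQL Lab tasks, so workers can load them.
        imports = ("superset.sql_lab",)
        # Where workers store task state and results.
        result_backend = redis_url(env, 1)

    return CeleryConfig


# Superset reads this module-level name from superset_config.py.
CELERY_CONFIG = make_celery_config(os.environ)
```

Because the values come from the environment, the same mounted file can serve the web server and the worker containers without rebuilding the image. Flower would be pointed at the same broker URL to monitor the workers.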
