The State of Open-Source Data Integration and ETL

Open-source data integration started 16 years ago with Talend. Since then, the whole industry has changed. Let's compare the different actors.

Open-source data integration is not new. It started 16 years ago with Talend. But since then, the whole industry has changed. Snowflake, BigQuery, and Redshift have changed how data is hosted, managed, and accessed, while making it easier and a lot cheaper. The data integration industry has evolved as well.

On the one hand, new open-source projects emerged, such as Singer.io in 2017. These made more data integration connectors accessible to more teams, even though using them still required a significant amount of manual work.

On the other hand, data integration was made accessible to more teams (analysts, scientists, business intelligence teams). Companies like Fivetran benefited from Snowflake’s rise, empowering non-engineering teams to set up and manage their data integration connectors themselves, so they can work with the data autonomously.

But even with this progress, a large majority of teams still build their own connectors in-house. The build vs. buy decision still leans strongly toward build. That’s why we think it’s time to take a fresh look at the landscape of open-source technologies around data integration.
