Project Hop - Exploring the future of data integration


Project Hop was announced at KCM19 in November 2019, and the first preview release has been available since April 10th. We’ve been posting about it on our social media accounts, but what exactly is Project Hop? In this post, we'll look at what Project Hop is, why the project was started and why know.bi wants to go all in on it.


What is Project Hop?
As the project’s tagline says, Project Hop intends to explore the future of data integration. We take that quite literally. We’ve seen massive changes in the data processing landscape over the last decade (the rise and fall of the Hadoop ecosystem, just to name one). All of these changes need to be supported and integrated into your data engineering and data processing systems.

Apart from these purely technical challenges, the data processing life cycle has become a software life cycle. Robust and reliable data processing requires testing, a fast and flexible deployment process and a strict separation between data and metadata.

Project Hop wants to be your go-to tool for data processing. Our main goals are:

  • Open source: this is to state the obvious. In this day and age, an innovative software platform has to be built on open standards and open source software, which leaves open source as the only viable option.
  • Visual design: data processes need to be easy to design, easy to test, easy to run and easy to deploy. We believe that visually designing data processes greatly increases developer productivity. Although visually designed, all of our work items can be managed like any other piece of software: version control, testing, CI/CD and documentation are all first-class citizens in the Hop platform. Let’s put the prejudice to rest: visually designed code is code and can be handled just like any other type of code.
  • Metadata driven: a strict separation of data and metadata allows you to design data processes regardless of the data itself.
  • Runtime agnostic: design once, run anywhere. We’re all working to solve data problems, not Spark, Flink, Airflow or any other engine-specific problems. We want you to be able to design a data process and run it on any engine you want.
  • Pluggable: all of the components in the Hop platform should be pluggable. As a developer, this makes it easy to add new functionality. As a system administrator, it gives you full control over the functionality you want to allow in your systems. As a data designer, it gives you full control to pick and choose the functionality you want to use.
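
To make the last three goals a bit more concrete, here is a minimal sketch in Python. It is not Hop's actual API or file format; every name in it (REGISTRY, plugin, PIPELINE, run_streaming, run_batch) is made up for illustration. It only shows the general idea: the pipeline is described as metadata, transforms are plugins looked up by name, and two toy "engines" run the same definition.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transform = Callable[[Iterable[Row], dict], Iterator[Row]]

# Hypothetical plugin registry: transforms are looked up by name, so new ones
# can be added without changing the engines below.
REGISTRY: Dict[str, Transform] = {}


def plugin(name: str):
    """Register a transform function under the given type name."""
    def register(fn: Transform) -> Transform:
        REGISTRY[name] = fn
        return fn
    return register


@plugin("filter")
def filter_rows(rows, options):
    # Keep only rows whose field value is at least the configured minimum.
    field, minimum = options["field"], options["min"]
    return (r for r in rows if r[field] >= minimum)


@plugin("rename")
def rename_field(rows, options):
    # Rename one field and leave all other fields untouched.
    old, new = options["from"], options["to"]
    return ({(new if k == old else k): v for k, v in r.items()} for r in rows)


# The pipeline itself is pure metadata: it names steps and options,
# but contains no data and no engine-specific code.
PIPELINE: List[dict] = [
    {"type": "filter", "options": {"field": "amount", "min": 100}},
    {"type": "rename", "options": {"from": "amount", "to": "total"}},
]


def run_streaming(pipeline: List[dict], rows: Iterable[Row]) -> List[Row]:
    """Toy engine 1: chain lazy generators and pull rows through one by one."""
    stream: Iterable[Row] = iter(rows)
    for step in pipeline:
        stream = REGISTRY[step["type"]](stream, step["options"])
    return list(stream)


def run_batch(pipeline: List[dict], rows: Iterable[Row]) -> List[Row]:
    """Toy engine 2: materialise the full result after every step."""
    batch = list(rows)
    for step in pipeline:
        batch = list(REGISTRY[step["type"]](batch, step["options"]))
    return batch
```

Both toy engines accept the same PIPELINE definition, which is the "design once, run anywhere" idea in miniature. Adding a new step type means registering one more function, without touching either engine.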
