Posts

Showing posts with the label VectorH

Actian VectorH architecture and Amazon Redshift benchmark papers

Actian Vector in Hadoop (VectorH for short) is a new SQL-on-Hadoop  system  built  on  top  of  the  fast  Vectorwise  analytical database system.  VectorH achieves fault tolerance and storage scalability by relying on HDFS, and extends the state-of-the-art in SQL-on-Hadoop systems by instrumenting  the  HDFS  replication  policy  to  optimize  read  locality. VectorH integrates with YARN for workload management, achieving a high degree of elasticity.  Even though HDFS is an append-only filesystem, and VectorH supports (update-averse) ordered tables, trickle updates are possible thanks to Positional Delta Trees (PDTs), a differential update structure that can be queried efficiently.  The paper describes the changes made to single-server Vectorwise to turn it into a Hadoop-based  MPP  system,  encompassing  workload  management, parallel  query  optimizat...