Posts

Showing posts from September, 2016

Random Data Generator by Hortonworks

A useful tool for generating large amounts of random data for demos, for exploring new tools, and for performance benchmarking. https://github.com/jgalilee/data Install the data package with go get github.com/jgalilee/data. It includes two sub-packages: the transactions package and the points package.

Synchronizing Clocks In a Cassandra Cluster

The Problem (part 1)

Cassandra is a highly distributable NoSQL database with tunable consistency. What makes it highly distributable also makes it, in part, vulnerable: the whole deployment must run on synchronized clocks. Given how crucial this is, it is surprising how little the literature covers it. Where it is covered, the advice is usually just to install an NTP daemon on each node, which, if followed blindly, leads to really bad consequences. You will find blog posts by users who got burned by clock drift. The first installment of this two-part series covers how important clocks are and how bad they can be in virtualized systems (like Amazon EC2) today.

Details: https://blog.rapid7.com/2014/03/14/synchronizing-clocks-in-a-cassandra-cluster-pt-1-the-problem/

Solutions (part 2)

Some disadvantages of off-the-shelf NTP installations, and how to overcome them.

Details: https://blog.rapid7.com/2014/03/17/synchronizing-clocks-in-a-cassandra-cluster
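
To make the clock dependency concrete, here is a minimal sketch of timestamp-based "last write wins" conflict resolution, the general approach Cassandra uses to reconcile conflicting writes to the same cell. This is an illustration only, not Cassandra's code: the names `Write`, `node_clock`, and `resolve` are hypothetical, the skew values are made up, and tie-breaking between equal timestamps is simplified away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp_us: int  # timestamp taken from the writing node's local clock


def node_clock(true_time_us: int, skew_us: int) -> int:
    """Hypothetical node clock: the true time plus that node's skew."""
    return true_time_us + skew_us


def resolve(a: Write, b: Write) -> Write:
    """Last write wins: keep the write carrying the higher timestamp.

    Simplification: on equal timestamps this sketch just returns b.
    """
    return a if a.timestamp_us > b.timestamp_us else b


# Node A's clock runs 500 ms fast; node B's clock is accurate (assumed values).
SKEW_A_US = 500_000
SKEW_B_US = 0

# "old" is written first (true time 1.0 s) through node A,
# "new" is written 200 ms later (true time 1.2 s) through node B.
first = Write("old", node_clock(1_000_000, SKEW_A_US))
second = Write("new", node_clock(1_200_000, SKEW_B_US))

# Because A's clock is ahead, the earlier write carries the larger timestamp.
winner = resolve(first, second)
print(winner.value)
```

With synchronized clocks (both skews at zero) the later write wins as expected. With the 500 ms skew assumed here, the earlier write silently overrides the later one, which is the kind of failure the linked posts warn about.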