Apache ZooKeeper's poison packet
The leader election and failure detection mechanisms are fairly mature, and typically just work… until they don’t. Four different bugs resulted in random cluster-wide lockups. Two of those bugs lay in ZooKeeper, and the other two were lurking in the Linux kernel. This is the story. https://www.pagerduty.com/blog/the-discovery-of-apache-zookeepers-poison-packet/