Posts

Showing posts from February, 2015

Adopting Microservices at Netflix: Lessons for Architectural Design

In some recent blog posts, we’ve explained why we believe it’s crucial to adopt a four‑tier application architecture in which applications are developed and deployed as sets of microservices. It’s becoming increasingly clear that if you keep using development processes and application architectures that worked just fine ten years ago, you simply can’t move fast enough to capture and hold the interest of mobile users who can choose from an ever‑growing number of apps. Switching to a microservices architecture creates exciting opportunities in the marketplace for companies. For system architects and developers, it promises an unprecedented level of control and speed as they deliver innovative new web experiences to customers. But at such a breathless pace, it can feel like there’s not a lot of room for error. In the real world, you can’t stop developing and deploying your apps as you retool the processes for doing so. You know that your future success depends on transitio...

8 ways to replace HDFS

Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it’s not without flaws. Ironically, one of Hadoop’s biggest shortcomings now is also one of its biggest strengths going forward: the Hadoop Distributed File System. Within the Apache Software Foundation, HDFS is always improving in terms of performance and availability. Honestly, it’s probably fine for the majority of Hadoop workloads that are running in pilot projects, skunkworks projects or generally non-demanding environments. And technologies such as HBase that are built atop HDFS speak to its versatility as a storage system even for non-MapReduce applications. But if the growing number of options for replacing HDFS signifies anything, it’s that HDFS isn’t quite where it needs to be. Some Hadoop users have strict demands around performance, availability and enterprise-grade features, while others aren’t keen on its direct-attached storage (DAS) architecture. Concerns arou...

HBase Bulk Loading

Apache HBase is all about giving you random, real-time, read/write access to your Big Data, but how do you efficiently get that data into HBase in the first place? Intuitively, a new user will try to do that via the client APIs or by using a MapReduce job with TableOutputFormat, but those approaches are problematic, as you will learn below. Instead, the HBase bulk loading feature is much easier to use and can insert the same amount of data more quickly. This blog post will introduce the basic concepts of the bulk loading feature, present two use cases, and propose two examples: https://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/