HDP-2.0 Beta

Hot on the heels of the Apache Hadoop 2.1 beta comes the beta of Hortonworks Data Platform 2.0, which uses those Hadoop 2.1 binaries as the foundation of the platform.


HBase 0.96 is in there, which is the protobuf 2.5-compatible version, as is Hadoop 2.1-beta. It won't be appreciated by the outside world how traumatic the upgrade of protocol buffers from 2.4 to 2.5 was, but it was, believe me. The wire format may be designed to be forward- and backward-compatible, but the protobuf.jar files, and the Java code generated by protoc, are neither. Everything had to be moved forwards to protobuf 2.5 in one nearly-simultaneous move, which, for us coding against Hadoop branch-2.1 and HBase 0.96, meant a few days of suffering. Or more to the point: a few days holding our versions locked down. On slide 7 of my HBase HUG Hoya talk I give a single line to build HBase for Hoya "without any changes".

As the audience may recall, I digressed into the fact that this only worked on specific versions of Hadoop 2.1 and, if you had Maven 3.1 installed, only by explicitly skipping any site targets when building the tarballs. Even then, I didn't go near how to upgrade protoc on Ubuntu 12.x.
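For anyone caught by the same mismatch, here is a minimal sketch of a pre-build sanity check: it confirms that the protoc on the path is a 2.5.x release before any code gets generated. It is not something from the talk or from either project's build. The function names and the `REQUIRED` constant are made up for illustration; the only thing it relies on is that `protoc --version` prints a line of the form `libprotoc 2.5.0`.

```python
import re
import subprocess

# Assumption for illustration: the major.minor version the whole stack moved to.
REQUIRED = (2, 5)


def parse_protoc_version(output):
    """Extract a (major, minor, patch) tuple from `protoc --version` output."""
    match = re.search(r"libprotoc\s+(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("unrecognised protoc version output: %r" % output)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def matches_required(version, required=REQUIRED):
    """True when the leading components of `version` equal `required`."""
    return tuple(version[:len(required)]) == tuple(required)


def check_protoc(protoc="protoc"):
    """Run the given protoc binary and fail loudly on a version mismatch."""
    raw = subprocess.check_output([protoc, "--version"])
    version = parse_protoc_version(raw.decode("utf-8", "replace"))
    if not matches_required(version):
        raise RuntimeError(
            "protoc %d.%d.%d found, but %s.x is needed"
            % (version + (".".join(str(part) for part in REQUIRED),))
        )
    return version
```

Calling `check_protoc()` at the start of a build script would stop the build early with a readable message, rather than leaving you to discover the mismatch from generated Java code that won't compile against the protobuf jar.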

Anyway, it's done, we can now all use protobuf 2.5 and pretend it wasn't an exercise in shared suffering that went all the way up the stack, all a week before the 2.1 beta came out.

As I was in the US at the time all this took place, I really got to appreciate how hard it is to get a completely tested Hadoop stack out the door -identifying integration problems between versions -and addressing them, testing for scale and performance on our test clusters. Everyone in QA, development and project management was working really hard to get this all out the door, and they deserve recognition as well as a couple of days' rest. It also gives me a hint of how hard it must be for a Linux vendor to get that Linux stack out the door -as things are even more diverse there. It shows why RHEL own the enterprise Linux business -and why they don't rush to push out leading-edg

Returning to the HDP-2 beta, there's a lot of stuff in the product: now that YARN is transforming a Hadoop cluster into a data-centric platform for executing a whole set of applications, things are getting even more exciting: download HDP-2.0 beta and see.

As for me, I note a mention of Hoya in the blog posting, as well as the way HBase appears above and not alongside YARN, which is a bit of a premature layout -either way it means I am going to be busy.

Elephants have right of way
