The Hive User Group held its semi-regular meeting last week at Hortonworks’ office.
This was a jam-packed evening, as the Stinger initiative has started to deliver some of its first fruits. Gunther’s session on Hive 0.11 was the longest talk of the night and generated some of the liveliest discussion in the room.
A Webex recording of this meeting has been made available by Hortonworks. I offer the time marks as a public service for anyone wanting to review it:
- +0:04:41: “Merge of HCatalog and Hive” by Alan Gates
- +0:13:30: “The Hive Security Situation” by Carl Steinbach
Carl proposed a generalized AccessServer as part of HiveServer2’s future.
- +0:47:15: “New cool stuff in Hive 0.11” by Gunther Hagleitner
Lotsa details on what’s just around the corner in 0.11. The vectorization (or, arguably less confusingly, columnization) feature is showing great promise.
- +1:33:30: “Optimizing Hive Queries” by Owen O’Malley
Owen shared a lot of the internals of the ORC file format, including its great compression.
- +2:10:28: “The Hive Marketplace” by George Chow
This was open source collaboration in action. I shared our performance benchmarks of HiveServer2 from late last year. I had previously shared these privately with a few individuals, including Carl. Carl was gracious enough to step up and explain his findings: that the slowdown was possibly due to a poor implementation of union in Thrift. (Update: refer to JIRA 3746.) I think we’re en route to a solution real soon now.