It took Strata three years to outgrow a hotel. I had to consult my Google calendar to confirm that it was only three years ago, in September 2011, that O’Reilly held the first Strata conference at the Marriott Marquis. By comparison, the 2014 event was at the spacious Javits Center. The difference is not subtle.

Strata continues to offer a well-rounded set of sessions, ranging from top-of-the-stack concerns like D3.js (my pick for the Wednesday AM tutorial) to updates on core technology (see Julian Hyde’s Calcite) to critical evaluations (Greg Rahn’s survey of SQL-on-Hadoop benchmarks). Unfortunately, O’Reilly continues to pare back the amount of content available on its free YouTube channel, so you will have to buy the video set to get to the good stuff.

Here are my observations on the key trends:

1. Apache Spark and Databricks are clearly at centre stage. Of the five day-one keynote speakers from product companies, three highlighted their integration with Spark. If you haven’t looked into Spark already, what are you waiting for? Databricks had a number of sessions across various tracks, and they were not sponsored sessions, so it’s clear that the organizing committee recognizes its significance too.

2. SQL-on-Hadoop is a given. Sessions covered nearly every current option, including Impala, Spark SQL, and Hive (with Tez). The only omissions were Hawq and Drill, and even those were mentioned in Greg’s session or appeared in MapR’s booth. Greg’s session is spot-on and is a must-see for anyone building or evaluating SQL-on-Hadoop technology.

3. Hive, the granddaddy of all SQL-on-Hadoop, is more vigorous than ever. The Hive meetup on Wednesday night saw a record seven presenters. Rohit Dholakia and I presented early findings from our research into Hive query result set compression. Expect more on this subject in the coming months.

4. The established vendors (Oracle, SAP) are present, though in a curious reversal they’re the sideshow on the show floor. Of course, this is the premier Hadoop event, so this is not surprising. But it is telling that the establishment sees fit to show up.

The 5,500 in attendance occupied about two floors of Javits. The sprawl of Javits necessitated a lot of fast walking between sessions. Data from my Jawbone UP says that I walked 48% more steps over the three days in 2014 than in 2013 (46,491 steps vs. 31,360 steps). I did miss the Data Sensing Lab from past years, which collected environmental data (temperature, sound, etc.) from across the venue.

Be sure to catch the San Jose event in February.