A new year (2014) always brings hope and optimism. The world of data, at least on the consumer end (e.g., smart home, IoT), will be fun to watch with CES just around the corner. The recent debates on the value, hype, and reality of Big Data have at least one data point now. Let’s just declare 2013 the peak of “Big Data” so that we can get past the hyperbole and dig into the challenges ahead in making data work.


The last event I attended in 2013 was the inaugural Spark Summit. It was a smaller event, reflecting the early stage of commercialization. Astute observers may recognize that Spark and Shark have appeared at previous Hadoop events and that I have mentioned them as technologies to watch. I think 2014 will be Spark’s breakout year. (Refer to Thomas Dinsmore’s post from just before Christmas for a good rundown on Spark.)


The founding of Databricks this past September is a vote of confidence in what Spark and Shark offer. Some may question the need for yet another SQL-on-HDFS technology, but the problem of commodity scaled-out query execution is far from solved. Consider 2013 the beginning of the Cambrian explosion for SQL-on-HDFS. We still have much to look forward to in 2014:

  • Drill 1.0
  • Spark in CDH5