Data Science

So the second Strata conference has come and gone. (Actually, the event consists of three sub-conferences: JumpStart, Summit and Conference; but I digress…) The general tone of the event was excitement. The presenters, a mix of vendors and practitioners, were understandably excited and proud about the self-named field of "Data Science". Several intersecting trends have brought us to this point. Commoditization of computing power, proliferation of sensors, and low-cost manufacturing together are ushering in the age of Big Data. It is now possible to record just about any data stream you can imagine, and social networks are the obvious driver of much of this development.

O'Reilly did a great job of bringing together a varied group. I found Monica Rogati's session on her experience with LinkedIn's exploding user base and DJ Patil's session on Data Science particularly engrossing for their mix of narrative, technology, and war stories. Unfortunately, only DJ's session is available on O'Reilly's YouTube channel.


I stopped by the Tableau booth to check out the capabilities of the free public version of Tableau. It is limited to sourcing data from a text file. I enquired casually about the possibility of connecting to NoSQL databases. Currently there is no such connectivity, but apparently something is in the works. When I spoke with DataStax's Ben Coverston, he pointed out that Cassandra CQL is accessible via JDBC. Cassandra 1.1 is coming out RSN. I'm planning to look into how to hook up Tableau to Cassandra directly. Drop me a line if you're interested in this.