Connect Microsoft SSIS to Hadoop Hive

November 12, 2013 in Big Data, Cloudera, Data Access, Datastax, Hadoop, Hortonworks, Intel, Interoperability, MapR, Microsoft, ODBC, Qubole, Relational Database Connectivity, SQL Server

As Hadoop and Hive gain traction in the marketplace, many people are looking to connect Microsoft SSIS to Hadoop Hive. Microsoft SSIS (SQL Server Integration Services) is a tool for moving data between Microsoft SQL Server and other data sources. Simba has built the industry-standard ODBC driver for Hive: it is the one that comes with all major Hadoop distributions, including Cloudera, DataStax, Hortonworks, Intel, MapR, Microsoft, and Qubole. Matt Masson has written a great blog post entitled “Using Hive ODBC from SSIS”. In the post, Matt explains how to get the Hive ODBC driver, how to set up the DSN, and how to configure your data flow. It is definitely worth a read if you are using Hadoop, Hive, SSIS, and SQL Server. By the way, since Simba's driver is the industry-standard ODBC driver for Hive, this approach works with all major Hadoop distributions, not just HDInsight.
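
Once the DSN is set up, it can be handy to sanity-check it outside SSIS before building the data flow. The sketch below is not taken from Matt's post; it only shows how a DSN-based ODBC connection string is typically assembled from the standard `DSN`, `UID`, and `PWD` keywords. The DSN name `HiveDSN`, the user name `hadoop`, and the helper function name are hypothetical placeholders, so substitute whatever you configured.

```python
def build_odbc_connection_string(dsn, uid=None, pwd=None):
    """Assemble a DSN-based ODBC connection string.

    DSN, UID and PWD are standard ODBC connection-string keywords.
    The values passed in below are illustrative assumptions only.
    """
    parts = [("DSN", dsn)]
    if uid is not None:
        parts.append(("UID", uid))
    if pwd is not None:
        parts.append(("PWD", pwd))
    # Each keyword=value pair is terminated with a semicolon.
    return "".join(f"{key}={value};" for key, value in parts)


# "HiveDSN" and "hadoop" are hypothetical; use the names from your own DSN setup.
conn_str = build_odbc_connection_string("HiveDSN", uid="hadoop")
print(conn_str)

# Optional check, assuming the third-party pyodbc package and a Hive ODBC
# driver are installed on this machine (not required to run the code above):
#   import pyodbc
#   connection = pyodbc.connect(conn_str, autocommit=True)
#   print(connection.cursor().execute("SHOW TABLES").fetchall())
```

If the optional check lists your Hive tables, the driver and DSN are probably working, and any remaining problems are more likely to sit in the SSIS package configuration itself.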