Connect Microsoft SSIS to Hadoop Hive
November 12, 2013 by Amyn Rajan in Big Data Cloudera Data Access Datastax Hadoop Hortonworks Intel Interoperability MapR Microsoft ODBC Qubole Relational Database Connectivity SQL Server
As Hadoop and Hive gain traction in the marketplace, many people are looking to connect Microsoft SSIS to Hadoop Hive. Microsoft SSIS (SQL Server Integration Services) is a tool for moving data between Microsoft SQL Server and other data sources. Simba has built the industry-standard ODBC driver for Hive; it is the one that comes with all major Hadoop distributions, including Cloudera, DataStax, Hortonworks, Intel, MapR, Microsoft, and Qubole. Matt Masson has written a great blog post entitled “Using Hive ODBC from SSIS”. In the post, Matt explains how to get the Hive ODBC driver, how to set up the DSN, and how to configure your data flow. It is definitely worth a read if you are using Hadoop, Hive, SSIS, and SQL Server. By the way, since the Simba driver is the industry-standard ODBC driver for Hive, this will work for all major Hadoop distributions, not just HDInsight.
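Before wiring a DSN into an SSIS data flow, it can help to confirm that the DSN itself connects to Hive. The following is a minimal sketch in Python, not something taken from Matt's post: it assumes the third-party pyodbc package is installed and that a DSN has already been configured. The DSN name `HiveDSN`, the credentials, and both helper function names are hypothetical.

```python
def build_dsn_connection_string(dsn, uid=None, pwd=None):
    """Build an ODBC connection string that refers to a DSN by name."""
    parts = ["DSN=" + dsn]
    if uid is not None:
        parts.append("UID=" + uid)
    if pwd is not None:
        parts.append("PWD=" + pwd)
    return ";".join(parts)


def list_hive_tables(dsn, uid=None, pwd=None):
    """Connect through the DSN and return the table names Hive reports.

    Requires pyodbc and a working DSN, so it is not called at import time.
    """
    import pyodbc  # third-party; imported lazily so the helper above works without it

    # autocommit=True avoids asking the driver to open a transaction.
    conn = pyodbc.connect(build_dsn_connection_string(dsn, uid, pwd), autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")  # HiveQL statement listing tables
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()
```

Calling `list_hive_tables("HiveDSN")` on a machine where that DSN exists should return the tables in the default Hive database; if it fails, the problem is in the driver or DSN setup rather than in the SSIS package.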