This demo walks you through the steps required to set up the Simba Hive ODBC driver and connect Microsoft Power BI Designer to a Hive data source. You will also see a short demo of the Power BI Designer data cleanup features and the use of the native Hive Query Language (HiveQL).

Sample connection strings

  • SQL-92 translation:
  • Native HiveQL:
  • Please refer to the Apache Hive ODBC Driver User Guide for a description of the driver configuration options you will need to create your own connection string.
  • You can also define a system DSN using the ODBC Administrator and refer to it by name in your connection string. For example, for a DSN named “HIVE32”:
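
As a rough illustration of the DSN approach, an ODBC connection string is a sequence of KEY=value pairs separated by semicolons, and a DSN-based string generally needs only the DSN key. The sketch below is not taken from the demo: the helper name build_connection_string is hypothetical, and the only value it uses from the text above is the DSN name HIVE32.

```python
def build_connection_string(options):
    """Join key/value pairs into an ODBC-style 'KEY=value;' string.

    Hypothetical helper for illustration; driver-specific keys are
    described in the driver's user guide and are not shown here.
    """
    return "".join(f"{key}={value};" for key, value in options.items())


# Refer to the system DSN by name (assumes a DSN called HIVE32 exists).
dsn_string = build_connection_string({"DSN": "HIVE32"})
print(dsn_string)
```

Any other settings (host, port, authentication and so on) would then come from the DSN definition itself rather than from the connection string.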

Links from the demo

Dataset Preparation

The 1-million-row MovieLens dataset was used for the demo. Please refer to the following page for the additional steps used to prepare the dataset for import (changing the delimiters and adding an occupations table).

If you use a different MovieLens dataset, you may need to pre-process the files differently and adjust the Hive schema definition.
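
To give an idea of what the delimiter change involves: the MovieLens 1M files separate fields with a double colon (::), while Hive's default delimited text format works with a single-character field delimiter. The following is a minimal sketch of one way to do the conversion, not the exact steps used for the demo. The function name convert_delimiter, the choice of a tab as the new delimiter, and the two sample rows are assumptions made for illustration.

```python
import io


def convert_delimiter(src, dst, old="::", new="\t"):
    """Copy lines from src to dst, replacing the field delimiter.

    Returns the number of lines written. The tab default is an
    assumption; use whatever delimiter your Hive schema declares.
    """
    count = 0
    for line in src:
        dst.write(line.rstrip("\r\n").replace(old, new) + "\n")
        count += 1
    return count


# Illustrative rows in the users.dat layout:
# UserID::Gender::Age::Occupation::Zip-code
sample = io.StringIO("1::F::1::10::48067\n2::M::56::16::70072\n")
out = io.StringIO()
rows = convert_delimiter(sample, out)
print(out.getvalue(), end="")
```

For real files, the same function can be called with two handles opened via open(), one for reading the original .dat file and one for writing the converted copy. Whatever delimiter you pick must not occur inside any field value and must match the one declared in the Hive table definition.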

Hive schema definition

Bash script to upload/refresh the dataset in Hadoop