With the introduction of the Amazon Redshift service, data analysts get access to a fully managed petabyte-scale data warehouse service without the headaches of its provisioning and administration.
In this blog post and accompanying video, I will walk you through the simple steps to connect QlikView Desktop to Amazon Redshift using the Simba ODBC driver for Redshift and explore the popularity of movies by audience demographics using the open source Movielens dataset.
There are free evaluation versions for all four prerequisites.
Installation and Configuration
Before we can connect to the data and build a visualization, we need to get information about the Redshift service endpoint. Login to your AWS console and open the Redshift console. Start the cluster if necessary and load the data using any Postgres-compatible client. You may need to allow connections to your Redshift cluster from your current IP by adding it to the Security rules. Copy the name of the Redshift cluster endpoint and note the name of the database.
Now head over to Simba Redshift ODBC driver page and download the free evaluation version. Follow the instructions in the driver Installation and Configuration guide to install the driver and evaluation license.
Data source connections are created using the ODBC Data Source Administrator accessible from the Control Panel. Start the ODBC Administrator and click the “Add…” button to create a new DSN. Select Simba Amazon Redshift ODBC driver from the list and click OK. You will be presented with the DSN Setup dialog.
Give your DSN a name (e.g. Demo) and in the server/port fields provide the endpoint you copied from the Redshift cluster configuration. Enter the name of the database holding the demo data.
Finally, click “Test” button to verify connectivity. If you get the “SUCCESS” message, you are ready to go.
Analyzing Movie Popularity
Start QlikView desktop and create a new file. Click the edit script icon (Ctrl+E) and copy/paste the attached data load script. You may need to modify the name of the DSN to match the one you created. The script will load all Movielens tables from Redshift and rename the foreign key columns to allow QlikView to identify the relationships between the tables.
Click the “Load” icon (Crl+R) to load the data into QlikView and select the fields you want to use in your visualization. Follow the attached video to create a chart and associated filters needed to identify top rated movies for an audience of particular demographics.
For more information on connecting other ODBC-compliant applications like Tableau, Excel or Power BI, using Simba Redshift ODBC Driver see our Quickstart Guide.
Now it’s your turn!
Download a free evaluation version of Simba Redshift ODBC driver and start exploring!