Extracting data from databases and data files
The data within an organization is usually stored in relational databases and data files. Extracting data is the first step towards creating a data model. The following section demonstrates the steps to extract data from an MS Access database and a delimited (.CSV
) file. The procedure to extract data from other relational databases is the same as the process for extracting data from MS Access.
The dataset that we will use is available publicly and covers information about routes and fares of various transport systems in Hong Kong. The original data files have been downloaded from (https://data.gov.hk/) website. This dataset can also be obtained from the Packt Publishing website.
The data connections in the Qlik Sense data load editor save shortcuts leading to commonly used data sources, such as databases and data files. The following types of connections exist in Qlik Sense:
- ODBC database connection
- OLEDB database connection
- Folder connection
- Web file connection
This recipe deals with the ODBC, OLEDB, and Folder connections. The web file connection will be dealt with in a separate recipe.
Getting ready…
The dataset required for this recipe that is downloaded from the Packt Publishing website comes in a zipped folder called as QlikSenseData
. Extract all the files from this zipped folder and save them on the hard drive at a desired location.
If you are connecting to the database using Open Database Connectivity (ODBC) then:
- Install the relevant ODBC drivers on your system.
Note
For the sake of our exercise, we need the MS Access drivers. The system DSN connection can be set up through the ODBC administrator under the Administrative Tools in Control Panel.
- While setting up the ODBC connection, select the
ROUTE_BUS.mdb
file as the Data Source from theQlikSenseData
folder. - Name the ODBC DSN connection as
HongKong Buses
. - Create a new Qlik Sense application and open the data load editor.
- Click on the Create New Connection and select ODBC.
- Select HongKong Buses under System DSN.
- Name the data connection as
Qlik Sense CookBook ODBC
. - The following image shows the details we enter in the Create new connection (ODBC) window:
If you are connecting to the database using OLE DB connectivity, we can directly set this up through the editor:
- Open the data load editor in Qlik Sense.
- Click on the Create New Connection and select OLE DB.
- Select the Microsoft Jet 4.0 OLE DB Provider (32 Bit) driver from the provider drop-down list.
- Insert the Data Source file path, which in our case will be the path for the
ROUTE_BUS.mdb
file in theQlikSenseData
folder. - Name the data connection as
QlikSense CookBook OLE DB
. - The following image shows the details we enter in the Create new connection (OLE DB) window:
If you are extracting the data from a data file, such as .CSV
, perform the following steps:
- Open the data load editor in Qlik Sense.
- Click on Create New Connection and select Folder.
- Select the location of the
QlikSenseData
folder which contains our data files. Alternatively, one can directly enter the path of the source folder under Path. - Name the data connection as
Qlik Sense CookBook Data
. - The following image shows the details we enter in the Create new connection (folder) window:
- Once the connections are created in the Qlik Sense library, they will be seen as a list under Data connections in the data load editor, as shown in the following screenshot:
How to do it…
If you are working with an ODBC or an OLEDB data connection, follow the steps:
- Insert the relevant data connection string to the script by clicking on Insert connection string, as shown in the following screenshot:
- Next, click on Select data under Data connections to view and extract data from the
ROUTE
table in the MS Access database, as shown: - The preview of the
ROUTE_BUS.mdb
table will look like the following. The fields in the table can be excluded or renamed while working in the Preview window, as shown in the following screenshot: - Click on Insert Script in the Preview window. This will insert the connection string as well as load the statement to the script. Make sure that you delete the duplicate
LIB CONNECT TO 'Qlik Sense CookBook ODBC';
statement from your script. - Load the data in your application by clicking on the
button.
Keep the Close when successfully finished option checked in the data load progress window. If the data is loaded successfully, then the window automatically closes or else the error encountered is highlighted.
- On a similar note, in order to test the Qlik Sense data files, Click on the Select data option under the Qlik Sense CookBook Data connection.
- Next, select the
ROUTE_GMB.csv
file from theQlikSenseData
folder and load it in the application. - The preview of the
ROUTE_GMB.csv
table will look like the following screenshot. Make sure that you select Embedded field names under Field names. Note that the Delimiter in this case is automatically set to Comma. - Insert the script and then save and load it.
How it works…
The LIB CONNECT TO
statement connects to a database using a stored data connection from the Qlik Sense library; thus, acting as a bridge between our application and the data source.
There's more…
This recipe aimed at extracting data from common data sources, such as RDBMSs and data files. Qlik Sense can also extract data from web files and custom data sources such as SAP. We will see this in the forthcoming section.
See also…
- Creating a Master Library from the Data model viewer