Importing Raw Data

This section gives an overview of the Raw Data panel of your data pod. It explains how to load your source data into Raw Tables, where it is stored in raw data packages for later processing.

To begin, select your source and follow the steps indicated below:

Getting started

Start your data pod:

Data import from Sources takes place in the Raw Data panel.

Here, the raw data from your data sources is loaded into entities called Raw Tables.

An overview of your Raw Tables is displayed on the left-hand side of the Raw Data panel:

Depending on the type of data contained in the Raw Table, each table has a different icon.

Existing tables

To view or modify an existing Raw Table, open its Raw Table editor from the list of available tables in the left-hand panel of Raw Data.

Use the search bar to locate a specific Raw Table from your list of available tables.

Click on the edit icon located next to each table name to proceed.

New tables

To create a new Raw Table, go to Raw Data and click on +Table:

Now you have the option to specify your data source. The currently available source options, as already discussed in Sources, are an IoT router and web data.

Click on Create Raw Table next to your selected data source type:

This action will open the Raw Table editor where you can enter your specifications and click on Apply to finalize. Depending on the type of source data to be loaded, each Raw Table type has its own Raw Table editor.

Below you will find more details about the editors for IoT routers and web data.

Once you have created a Raw Table for your data upload, the new table will be instantly added to your list of available tables.

Raw Tables

Raw Tables serve as buffers that accept and store data packages for later processing.

When a data package is loaded into a Raw Table, it is assigned a unique package_ID. This ID is stored in the special column package_ID of the Raw Table.

The Raw Table of each import type has its own symbol:

Raw Tables are differentiated from other table types via the prefix S_.

All Raw Tables available within the data pod are arranged on a time axis tracing the contained data packages over time.

Here you manage your data import packages and the Raw Tables containing these data packages.

You also have a graphical overview of all imports arranged on a time axis.

Your existing Raw Tables are displayed on the left-hand side of the time axis.

Each Raw Table has a right-hand side section where a gray stripe represents each contained package. These Raw Tables store imported packages of the same structure for later processing.

You can scroll to zoom and drag to pan:

To inspect the contents and properties of a data package, hover over the stripe representing the data package arranged on the time axis. You get information about the file type, package ID, number of rows, load date, and the scope settings of the package.

To inspect the contents of a Raw Table, hover over the Raw Table icon to display the hover menu and click on the eye symbol Content. The contents of the table are then displayed in the lower-screen area as follows:

You can edit the settings of each Raw Table by clicking on the Raw Table symbol. Clicking on the Raw Table symbol displays a hover menu with details about the table size and an Edit option:

Click on Edit to open the editor panel for Raw Tables.

Which Raw Table editor is displayed, and which functions it offers, depends on the data source type of the Raw Table you are creating or editing.

Below you will find more details about the individual Raw Table editors.

Below is an overview of the process of loading data into a Raw Table, based on the different available import sources.

Data import from an IoT router

Go to Raw Data to load the data from your IoT router into a Raw Table.

You have two options:

Load data into a new Raw Table

To load the data into a new Raw Table, click on the button at the upper left-hand corner:

A pop-up window will display the list of available data sources:

From the list of available data sources, go to IoT and click on the corresponding Create Raw Table button.

Now type in a name for your new Raw Table and hit Create:

This action will display the Raw Table editor for IoT connections.

Load data into an existing Raw Table

To load the data into an existing Raw Table, click on the Raw Table icon to display the hover menu.

Then click Edit from the hover menu to display the Raw Table editor for IoT connections.

The Raw Table editor for IoT connections will look like this:

The uppermost section contains the IoT symbol and the name of the Raw Table created for this data source.

You have an overview section containing information about the data source currently in use, the source type, as well as the current state of the connection (connected/stopped).

Below the overview, the editor contains an Automation section, a Pull Rule section where you specify how your data is to be pulled from the IoT source, and the sections Table Columns, Retention Settings, and Custom Table Description.

Automation

The Automation card is where you activate the schedule for pulling data from the IoT source into the Raw Table:

Click on the checkbox Activate Schedule and hit Apply to activate the pull schedule.

Once you activate the schedule, you will start receiving data on the topic that has been specified in the pull rule.

Note that the source has to be connected for you to receive data on a given topic.

Pull Rule

The Pull Rule card is where you specify a rule that determines how your data is to be pulled from the IoT source.

Here you select the time intervals at which data should be pulled:

You need to provide the URL and the topic to which you want to subscribe.

In addition, you can set up incremental loads by specifying the time interval in seconds as well as the maximum number of rows for your load.

In the Max Rows field, you can determine after how many rows a data package should be created and inserted into the specified Raw Table.

In the Time Seconds field, you can specify after how many seconds a data package should be created and inserted into the specified Raw Table.

A data package is inserted as soon as one of the two limits, Max Rows or Time Seconds, is reached, whichever comes first. The History field requests the data history for the specified number of rows upon connection.
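
The interplay of Max Rows and Time Seconds can be illustrated with a minimal Python sketch. It is not the product's implementation: the class name PackageBuffer, the injectable clock, and the choice to start the timer with the first buffered row are assumptions made for illustration. In this simplified form the time limit is only checked when a row arrives.

```python
import time
from typing import Any, Callable, List, Optional


class PackageBuffer:
    """Collects incoming rows and closes a package when a limit is reached."""

    def __init__(self, max_rows: int, time_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_rows = max_rows
        self.time_seconds = time_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._rows: List[Any] = []
        self._started: Optional[float] = None

    def add(self, row: Any) -> Optional[List[Any]]:
        """Buffer one row; return the finished package if a limit was hit."""
        now = self._clock()
        if self._started is None:
            # Assumption: the timer starts with the first buffered row.
            self._started = now
        self._rows.append(row)
        rows_full = len(self._rows) >= self.max_rows
        time_up = now - self._started >= self.time_seconds
        if rows_full or time_up:  # whichever comes first
            package, self._rows, self._started = self._rows, [], None
            return package
        return None
```

With max_rows=3 and time_seconds=10, the third row closes a package immediately, while a package holding only two rows is closed by the first row that arrives after ten seconds have passed.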

Click on Apply to confirm your changes.

The loaded data package will be displayed immediately on your Raw Tables dashboard in Raw Data.

Hover over the loaded package to access the hover menu. Click on Content from the package hover menu to display the contents of the loaded data package.

Clicking on Remove and confirming will irreversibly remove the loaded data package from your table.

Table Columns

Here you can change the column names and data types of your Raw Table:

The Raw Table columns are derived from the original columns of the import with which you have created the table.

You can also add columns whose values are derived from the original columns.

If you add a new column, you can provide an SQL expression on how to derive the contents of the new column from the already existing columns.

If the Raw Table already contains data packages, the new column is generated for these packages as soon as you hit the Apply button.

Depending on the size of the Raw Table, saving may take some time. Each new data package loaded in the Raw Table will also get the new derived column populated as per the provided SQL expression.

Data types can only be changed into other compatible data types.

The SQL transform function is used to transform the raw text from a column in a source package into the appropriate data type.

Derived columns are particularly useful for IoT import streams, where two columns arrive with the JSON array and JSON data types: you can add extra columns that extract the relevant fields from the JSON.

For more information on the expressions you can use in the SQL transformation fields, you can refer to the PostgreSQL documentation.
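
As an illustration of a derived column, suppose a Raw Table has a text column named payload that holds a JSON object with a temperature field. Both names are hypothetical. In PostgreSQL, an expression such as (payload::json ->> 'temperature')::numeric extracts that field as a number; whether the SQL transformation field accepts exactly this form is an assumption. The Python sketch below mirrors what such an expression computes for each row.

```python
import json
from typing import Any, Dict, List, Optional

# Hypothetical PostgreSQL expression for the derived column. The column name
# "payload" and the key "temperature" are assumptions for illustration.
DERIVED_COLUMN_SQL = "(payload::json ->> 'temperature')::numeric"


def derive_temperature(row: Dict[str, Any]) -> Optional[float]:
    """Python equivalent of the expression above for a single row."""
    value = json.loads(row["payload"]).get("temperature")
    # A missing key yields None, just as ->> yields NULL in PostgreSQL.
    return None if value is None else float(value)


def add_derived_column(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the rows with the derived column appended."""
    return [{**row, "temperature": derive_temperature(row)} for row in rows]
```

Rows whose JSON lacks the field receive an empty value rather than causing an error, which matches how the PostgreSQL operator behaves for missing keys.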

Note that the contents of an existing column can only be changed if you change the data type as well.

Use the red bin icon to remove a column.

Retention Settings

Here you determine for how long a loaded data package should be kept in your Raw Table:

The data packages in a Raw Table will be retained in that table until deleted manually.

For this reason, it is advisable to prevent the Raw Table from growing indefinitely. Especially in high-frequency load scenarios (e.g. if a new data package is loaded every five minutes), it is recommended to retain only a certain limited number of packages.

In the Retention Settings, you can restrict the number of packages in a Raw Table by either age or number.

If you restrict by the number of packages, the oldest packages will be removed from the Raw Table if the number of packages exceeds the predefined maximum.

If you enter a setting of zero, no packages will be removed from the Raw Table.

You can also select both retention options: the maximum number of packages and the maximum age of packages in days. Packages are then removed as soon as either the maximum age or the maximum number of packages is exceeded.
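
The combined effect of the two retention options can be sketched in Python. This is a minimal illustration, not the product's implementation: the Package type and the apply_retention function are hypothetical, and treating zero as "no limit" for the age setting as well as for the package count is an assumption.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Package(NamedTuple):
    package_id: int
    loaded_at: datetime


def apply_retention(packages: List[Package], max_packages: int,
                    max_age_days: int, now: datetime) -> List[Package]:
    """Return the packages to keep; a setting of 0 disables that limit."""
    kept = sorted(packages, key=lambda p: p.loaded_at)  # oldest first
    if max_age_days > 0:
        cutoff = now - timedelta(days=max_age_days)
        kept = [p for p in kept if p.loaded_at >= cutoff]
    if max_packages > 0:
        kept = kept[-max_packages:]  # drop the oldest surplus packages
    return kept
```

For nine packages loaded on nine consecutive days, a maximum of three packages keeps the three newest, while a maximum of five packages combined with a maximum age of two days keeps only the packages from the last two days.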

Custom Table Description

The last section of the table editor allows you to add a custom description for your table. Click on Apply to confirm your changes.

Delete and Empty Out

The Raw Table editor allows you to perform Delete and Empty out actions on your current import.

Delete will permanently remove your entire table and its contents.

Empty out removes all table contents but keeps the table itself.

You only receive IoT data if the connection is activated, at least one subscription for that connection is activated, and if a client publishes data on the subscribed topic.

Data import from a web API

Go to Raw Data to load your web data into a Raw Table.

You have two options:

Load data into a new Raw Table

To load the data into a new Raw Table, click on the button at the upper left-hand corner:

A pop-up window will display the list of available data sources:

From the list of available data sources, go to WEB and click on the corresponding Create Raw Table button.

Now type in a name for your new Raw Table and hit Create:

This action will display the Raw Table editor for web connections.

Load data into an existing Raw Table

To load the data into an existing Raw Table, click on the Raw Table icon to display the hover menu:

Click on Edit from the hover menu to display the Raw Table editor for web connections.

In both scenarios, loading data into a new Raw Table and loading data into an existing Raw Table, the Raw Table editor for web connections will be displayed next.

The Raw Table editor for web connections will look like this:

The uppermost section contains the web symbol and the name of the Raw Table created for this data source.

You have a short overview section containing information about the data source currently in use, the source type, as well as the current state of the connection (active/inactive).

Below the overview, the editor contains an Automation section, a Pull Rule section where you specify how your data is to be pulled from the web source, and the sections Table Columns and Custom Table Description.

Automation

The Automation card is where you activate the schedule for pulling data from the web source into the Raw Table:

Here you can set the regular time intervals to pull data from your data source.

You can set an interval (a whole number) at which your data should be pulled and select a frequency from the dropdown menu. The frequency determines whether the interval is counted in seconds, minutes, hours, days, or weeks.
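
How an interval and a frequency combine into a pull schedule can be sketched in Python. The unit labels in FREQUENCY_UNITS and both function names are hypothetical, and the actual scheduler may behave differently, for example in how it aligns the first pull.

```python
from datetime import datetime, timedelta
from typing import List

# Hypothetical labels for the frequency dropdown values.
FREQUENCY_UNITS = {
    "second": timedelta(seconds=1),
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def pull_period(interval: int, frequency: str) -> timedelta:
    """Time between two pulls, e.g. interval 15 with 'minute' is 15 minutes."""
    if interval < 1:
        raise ValueError("interval must be a positive whole number")
    return interval * FREQUENCY_UNITS[frequency]


def next_pulls(start: datetime, interval: int, frequency: str,
               count: int) -> List[datetime]:
    """The next `count` pull times after `start` (assumed alignment)."""
    period = pull_period(interval, frequency)
    return [start + i * period for i in range(1, count + 1)]
```

An interval of 2 with an hourly frequency, starting at midnight, yields pulls at 02:00, 04:00, and 06:00 under these assumptions.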

Click on the checkbox Activate Schedule and hit Apply to activate the automation. Once the schedule is active, you will start receiving data.

Pull Rule

The Pull Rule card is where you specify a rule that determines how your data is to be pulled from the web source:

You need to provide the URL.

In addition, you can use the SQL query tool to generate results from existing tables that can be passed as URL parameters.
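
Passing query results as URL parameters can be pictured with the following Python sketch. It assumes that each result row of the SQL query becomes one request and that the column names are used as parameter names; neither detail is confirmed here. The function name and the example URL are hypothetical.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_request_urls(base_url: str,
                       query_rows: List[Dict[str, Any]]) -> List[str]:
    """Build one URL per query result row (an assumed mapping)."""
    parts = urlsplit(base_url)
    existing = parse_qsl(parts.query)  # keep parameters already in the URL
    urls = []
    for row in query_rows:
        # Column names become parameter names; values are URL-encoded.
        query = urlencode(existing + list(row.items()))
        urls.append(urlunsplit(parts._replace(query=query)))
    return urls
```

For example, the rows {"city": "Berlin", "days": 3} and {"city": "New York", "days": 1} combined with https://api.example.com/v1/weather?units=metric produce two URLs, with the space in the second city name encoded as a plus sign.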

Before starting the web poll, it is recommended that you test the query and API to make sure the results are correct. Hit the corresponding play icons to do so.

The results of the API test and of the query will be displayed in the lower left-hand part of your Raw Data dashboard.

The result of the query test looks like this:

The result of the API test is displayed as follows:

Click on Apply to confirm your changes.

The loaded data package will be displayed immediately on your Raw Tables dashboard in Raw Data.

Hover over your loaded data package to reveal the hover menu:

Click on Content from the package hover menu to display the contents of the loaded data package.

Now you can inspect your packages more closely. You can view the content of the package in the preview window at the lower left-hand corner of your screen.

Clicking on Remove and confirming will irreversibly remove the loaded data package from your table.