Code in Workbooks
A workbook is where you write code and create documents.
To access the workbooks in your data pod, start your data pod and go to the Analysis panel.
Workbooks are part of the Results layer of your data pod.
The Results layer is where you code in data science workbooks and create visualisations in infographics. It gives you an overview of all result entities created within your data pod, grouped by type.
The centerpiece of the Analysis section is the Results environment, which shows a masonry view of all workbooks and infographics within your data pod.
Hit the edit icon to open a workbook and inspect the results.
Workbooks allow you to view code results and documentation immediately within the code flow.
By hiding the actual code, you can change the view of the workbook to a document that can be read by non-experts.
By hiding results and documentation, you can view the pure code of the workbook.
To use the hardware resources efficiently, we provide a workbook editor directly within the platform.
Like most things in your data pod, the workbook editor is geared towards full automation.
Other workbook environments such as Jupyter or Zeppelin can also be used as an alternative to the data pod workbooks. This can be done via our Direct Access interface. Using these workbooks, however, means that you may have to provide additional compute and storage resources. Also, you may have to take network bandwidth into account when shipping data from your data pod to your workbook environment.
Go to the Analysis panel to create new workbooks or edit existing ones.
To create a workbook, hit the Workbook button:
To edit a workbook, click on one of the existing workbook cards within the Results area.
Either way, you are redirected to the workbook editor of your data pod.
The workbook editor consists of several panels and control elements. The workbook is assembled by stacking workbook cards. By using the insert card buttons as shown below, you can insert new cards into the editor.
As you see on the screenshot, you have a markdown card, a Python card, and a PostgreSQL card:
Click on the button in the upper right-hand corner to display the general workbook settings:
From here, you can hide or unhide the code, restart or stop the Python kernels, execute the card, rename and/or delete your workbooks. You also get an overview of some useful keyboard shortcuts.
Every change you make anywhere in the workbook is automatically saved.
The following sections describe the different types of cards within the workbook in more detail.
A PostgreSQL code card enables you to write native PostgreSQL SELECT statements.
Click on the PostgreSQL button to create a card:
Using the toolbar on the left-hand side, you can manage the aspects of this card.
The Run button allows you to execute your SELECT statement. During execution, the button turns into a red Stop button, which you can click to cancel the query.
The result of the query is stored in a table in the database. The name of this table is displayed right in the result section in this card. If you want to do further processing, you can query this result in another workbook card.
If you automate a card, the card is executed whenever one of the tables that your SELECT statement depends on receives refreshed data. This keeps the card results up to date.
Since the results of a code card are stored in a regular table, you can use this data in your infographics and access it through our web API or our direct data pod access.
Note: All code cards can be executed in parallel.
The markdown card enables you to provide documentation snippets for your workbook using a simple markdown format. A quick intro to markdown can be found here.
Click on the Markdown button to create a card:
To edit the text in a markdown card, click on the card to switch to the markdown source. Once you click outside of the card, it is rendered again:
A Python card enables you to write native Python code.
Click on the Python button to create a card:
Your card will look like this:
Here you can also use the built-in functions to create a table from a DataFrame, remove a table, retrieve a DataFrame from a table, and display a table in the grid output or visual data in the graph output.
Use the Run button at the top or the keyboard shortcut cmd+E to execute the Python code.
During execution, you can always click on the Stop button to abort the code execution.
The Python card (v1) comes with pre-installed Python libraries. The Python card (v2) will support the installation of additional libraries using pip.
putTable(data_frame, table_name)
Creates a PostgreSQL table from a DataFrame (if a table already exists, it is replaced).
Parameters
Returns: table_name
Note that the table_name is always prefixed with wkb_. So if the table_name is emp_data, the actual table will be saved as wkb_emp_data.
Example
Loads a PostgreSQL table into a DataFrame.
Parameters
Returns: DataFrame
Example
Drops the table created by the putTable function.
Parameters
Returns: String Message
Example
Displays the table in the grid output.
Parameters
Returns: table_name
Example
showDataFrame(data_frame, table_name)
Internally calls the putTable and the getTable functions.
Parameters
Returns: DataFrame
Example
For data visualization, you can use libraries like matplotlib or seaborn. As usual, calling plt.show() renders the plot in the output area below the code.
Example Matplotlib
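A minimal sketch of a line plot that could be pasted into a Python card. The month labels and revenue figures are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Made-up illustration values.
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [12.0, 15.5, 14.2, 18.9]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(months, revenue, marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (k)")
ax.set_title("Monthly revenue (illustrative data)")
fig.tight_layout()

# Renders the figure in the card's output.
plt.show()
```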
Example Seaborn
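A minimal sketch of a seaborn scatter plot drawn from a pandas DataFrame. The column names and values are invented for illustration; seaborn draws onto a matplotlib Axes, so plt.show() is still what renders the result.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up illustration values.
people = pd.DataFrame({
    "age": [23, 31, 38, 45, 52, 60],
    "salary": [41000, 52000, 58000, 66000, 71000, 69000],
    "department": ["sales", "sales", "it", "it", "it", "sales"],
})

# One point per row, coloured by department.
ax = sns.scatterplot(data=people, x="age", y="salary", hue="department")
ax.set_title("Salary by age (illustrative data)")

# Renders the figure in the card's output.
plt.show()
```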
Stats and Analysis: NumPy, pandas, SciPy
Machine Learning: TensorFlow, lifelines, scikit-learn, XGBoost, LightGBM
Data Visualization: seaborn, Matplotlib, NetworkX
Once you have explored the chapter on workbooks, you can check out the other type of Results entity: infographics.
If you have any questions at this point or if you encounter any issues, do not hesitate to get in touch with our support team.