Zeppelin Basic Tutorial
- This tutorial follows the official Zeppelin Notebook tutorial step by step.
1. Create a new Notebook
Click 'Create new note', enter a name, and click 'Create Note'. A new blank note opens.
Next, click the gear icon at the top right to unfold the interpreter binding settings. The default interpreters are enough in most cases, but you can add or remove interpreters in the 'interpreter' menu if you want to. Click 'Save' once you have completed your configuration.
2. Basic usage
You can click the gear icon at the right side of a paragraph to see its commands. If you click 'Show title', you can give each paragraph a title of your choice. Try the other commands as well.
As in other notebooks such as Jupyter, you can put text in a paragraph by using the md command with Markdown syntax:
%md <some text using markdown syntax>
After entering the text, click the play button or press Shift+Enter to run the paragraph. The formatted Markdown text is then displayed.
You can also show or hide the editor for a cleaner view or for publishing.
If the default interpreters are bound, you can use Scala code and the Spark API directly in a paragraph.
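As a minimal sketch of such a paragraph, the following plain Scala should run without extra setup in a paragraph bound to the Spark interpreter. The word list and the names words and counts are made up for illustration. With Spark, the same logic could be applied to an RDD created from the SparkContext that Zeppelin provides as sc.

```scala
// Hypothetical sample data; any Scala collection works the same way.
val words = Seq("zeppelin", "spark", "scala", "spark")

// Count how often each word occurs.
val counts: Map[String, Int] =
  words.groupBy(identity).map { case (word, group) => word -> group.size }

// Print the counts in alphabetical order so the output is stable.
counts.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word: $n") }
```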
Again, do not forget to click the play button or press Shift+Enter to actually run the paragraph.
If you get an HDFS-related error, check whether you have created the HDFS user folder for 'zeppelin' as described in Zeppelin-Intro.
3. Load Data Into Table
We can use SQL query statements for easier visualization in Zeppelin. Later, you can use Angular or D3 in Zeppelin for more sophisticated visualization.
Let's get 'Bank' data from the official Zeppelin tutorial.
Next, define a case class for easy transformation into a DataFrame, and map the downloaded text data into a DataFrame, skipping the header row. Finally, register this DataFrame as a table so that it can be queried with SQL statements.
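The parsing step can be sketched in plain Scala as follows. The Bank fields, the column positions, and the sample lines are assumptions for illustration: they mimic the semicolon-separated, quoted layout of the tutorial's bank data, but the rows themselves are made up, and the helper name unquote is hypothetical.

```scala
// Assumed subset of columns; adjust the fields to the data you actually load.
case class Bank(age: Int, job: String, marital: String, education: String, balance: Int)

// Made-up sample lines: a header row followed by two data rows.
val lines = Seq(
  "\"age\";\"job\";\"marital\";\"education\";\"default\";\"balance\"",
  "30;\"unemployed\";\"married\";\"primary\";\"no\";1787",
  "25;\"services\";\"single\";\"secondary\";\"no\";4789"
)

// Remove the double quotes that surround text fields.
def unquote(field: String): String = field.replaceAll("\"", "")

val banks: Seq[Bank] = lines
  .map(_.split(";"))
  .filter(cols => cols(0) != "\"age\"") // drop the header row
  .map { cols =>
    Bank(
      cols(0).toInt,
      unquote(cols(1)),
      unquote(cols(2)),
      unquote(cols(3)),
      unquote(cols(5)).toInt // column 4 ("default") is skipped
    )
  }

banks.foreach(println)
```

In a Zeppelin Spark paragraph, the same filter and map would typically be applied to an RDD of lines, followed by toDF() and a call that registers the result as a table named, for example, bank (registerTempTable in older Spark versions, createOrReplaceTempView in newer ones).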
4. Visualization of Data via SQL query statement
Once the data is loaded into a table, you can use a SQL query to visualize the data you want to see:
%sql <valid SQL statement>
Let's show the age distribution of people younger than 30.
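As a sketch of what such a query computes, the block below holds a candidate statement as a string and reproduces the same aggregation on a few made-up ages with plain Scala collections. The table name bank, the column alias value, and the sample ages are assumptions. In the notebook, the statement text would go after %sql in its own paragraph.

```scala
// Candidate statement for a %sql paragraph; the table name `bank` is an assumption.
val query =
  "select age, count(1) value from bank where age < 30 group by age order by age"

// Made-up ages standing in for the age column of the table.
val ages = Seq(25, 31, 25, 29, 42, 29, 29)

// Same logic as the query: keep ages under 30, count rows per age, sort by age.
val distribution: Seq[(Int, Int)] = ages
  .filter(_ < 30)
  .groupBy(identity)
  .map { case (age, group) => age -> group.size }
  .toSeq
  .sortBy(_._1)

distribution.foreach { case (age, n) => println(s"$age\t$n") }
```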
As you can see, the visualization tool is loaded automatically once you run a paragraph with a SQL statement. The default view is the result table of the query, but you can choose other types of visualization, such as a bar chart, pie chart, or line chart, just by clicking the icons.
You can also change the configuration of each chart as you like.
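You can create an input form by using the ${formName=defaultValue} template inside a paragraph.
Also, you can create a select form by using the ${formName=defaultValue,option1|option2...} template.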
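To make the substitution behind these forms concrete, here is a small sketch that mimics it in plain Scala. It is not Zeppelin's implementation: the regular expression is simplified, the helper name render is hypothetical, and the form names maxAge and marital and the table name bank are assumptions.

```scala
import scala.util.matching.Regex

// Simplified pattern for ${name=default} and ${name=default,opt1|opt2}.
val form: Regex = """\$\{(\w+)=([^,}]*)(?:,[^}]*)?\}""".r

// Replace each form with the supplied value, or with its default if none is given.
def render(template: String, values: Map[String, String]): String =
  form.replaceAllIn(
    template,
    m => Regex.quoteReplacement(values.getOrElse(m.group(1), m.group(2)))
  )

// Text input form: one free-text value with default 30.
val inputQuery =
  "select age, count(1) value from bank where age < ${maxAge=30} group by age order by age"

// Select form: default 'single', options separated by '|'.
val selectQuery =
  "select age, count(1) value from bank where marital=\"${marital=single,single|divorced|married}\" group by age order by age"

println(render(inputQuery, Map.empty))                    // falls back to the default
println(render(selectQuery, Map("marital" -> "married"))) // uses the chosen option
```

In the notebook itself no helper is needed: the template text goes directly into the paragraph, and Zeppelin shows the form above the result.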
For more on dynamic forms, please refer to zeppelin-dynamicform.
5. Export/Import Notebook
Once you finish your work, you can export the notebook as a JSON file for later use.
You can also import a notebook from an exported JSON file or from a URL.
You can download the JSON file for this tutorial here or see the official 'Zeppelin Tutorial' on the front page of Zeppelin.