Getting started with Jupyter Notebook

Jupyter Notebook is a development environment that runs in your web browser and can be used with several languages, including R and Python. In this blog post, we’ll look at some of the benefits of using Jupyter Notebook and how to start using it with Python.

Benefits of Jupyter Notebook

Chunking code into cells

Instead of having to write code in large flat files, developers can use Jupyter Notebook to chunk documents up into cells that can contain either code or formatted text. Running these cells separately allows for quicker iterations and more targeted edits, without the need for complex dependencies between separate files.

Referencing variables between cells

Being able to reference variables between cells means that developers can run a computationally heavy cell only once. They then have that result to use in any other cell, and can quickly make edits and test different approaches in those cheaper cells.
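
As a rough sketch of this pattern, the snippet below imitates three cells using `# Cell` comments. The function name `slow_sum_of_squares` and the numbers are made up for illustration; in a real notebook each commented section would sit in its own cell.

```python
import time

# Cell 1: the expensive step, run once.
def slow_sum_of_squares(n):
    """Hypothetical stand-in for a computationally heavy job."""
    time.sleep(0.1)  # pretend this takes a long time
    return sum(i * i for i in range(n))

heavy_result = slow_sum_of_squares(1000)

# Cell 2: a cheap cell that reuses the stored result.
average = heavy_result / 1000
print(average)

# Cell 3: another cheap experiment on the same result, with no recomputation.
last_digit = heavy_result % 10
print(last_digit)
```

Editing and re-running Cell 2 or Cell 3 leaves `heavy_result` untouched, so only the cheap work is repeated.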

Making results easy to share and understand

The ability to intersperse nicely formatted text cells between code cells allows developers to use Jupyter Notebook to make good-looking documents for sharing. It also allows them to put forward clearer and more readable explanations than inline comments alone would support. This makes Jupyter Notebook great for tutorials; in fact, a university tutorial is where I first encountered it.

Stepping through script one segment at a time

Jupyter Notebook’s cell structure naturally segments code and the output is displayed directly beneath each cell, allowing developers to display intermediate results easily. It also makes cause-and-effect clearer to see than it would be in one large output from a long chunk of code.

Setting up Jupyter Notebook

You can try out Jupyter Notebook from your web browser without any setup process required. However, I would say that the best experience comes with setting up Jupyter Notebook on your machine. As long as you have both Python and a package manager installed, it’s quick to do and requires only a few commands.

To follow these instructions, you will need to be working on a Linux or Mac machine, and have both Python and the ‘Poetry’ package manager installed. Find out more about using Poetry to manage Python dependencies across multiple projects.

In your terminal, navigate to the folder that you want to contain your Python project.
Run poetry new <PROJECT NAME>. This will set up the project folder. Then navigate to it using cd <PROJECT NAME>.
Run poetry add -D jupyter to download Jupyter and everything required to run it. On Poetry 1.2 or later, -D is deprecated in favour of poetry add --group dev jupyter.
If you want to use outside packages in the environment you have created, run poetry add <PACKAGE NAME>. I recommend numpy, pandas, matplotlib and scikit-learn (which is imported in code as sklearn).
Run poetry run jupyter notebook. This will open Jupyter in your preferred browser automatically and may take a few moments to complete.
Jupyter should now be up and running. To get a notebook up, navigate to ‘New -> Python 3 (ipykernel)’.
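
Once the notebook is open, a first cell along these lines can confirm that the packages you added are visible to the kernel. This is only a sketch: the package list mirrors the recommendations above, and `find_spec` reports whether an import would be found, not whether it would succeed.

```python
from importlib.util import find_spec

# Import names to look for; note that scikit-learn is imported as "sklearn".
packages = ["numpy", "pandas", "matplotlib", "sklearn"]

# Map each name to True if the current environment can locate it.
availability = {name: find_spec(name) is not None for name in packages}

for name, found in availability.items():
    print(f"{name}: {'available' if found else 'missing'}")
```

Anything reported as missing can be added from the terminal with poetry add <PACKAGE NAME>.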

Reading & using a Jupyter Notebook

How to read a notebook

A Jupyter notebook is split into cells and each cell can contain text, code or images. The ‘Example Heading’ cell in the example below, for instance, is a Markdown cell that employs some basic text formatting.

Beneath the ‘Example Heading’ cell is an example code cell. It contains some standard Python, which performs some arithmetic and prints the result. When you run the cell, the output from the code (7 in this example) appears directly beneath it. The blue outline in the example above signifies that a cell is currently selected. If a cell is currently being edited, a green outline will be visible instead.

Below this is another code cell. Notice that I can reference a variable defined in an earlier cell. However, for this to work the cell that defines the variable must be run at least once before the cell that references it. The number next to the cell shows the order in which it was last run, counted from when the kernel started. While a cell is still running it shows [*], which changes to a number once the run has completed.
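
As a plain-text sketch of the two code cells described above, the snippet below uses `# Cell` comments to mark where each cell would begin. The variable names and numbers are assumptions chosen so that the first cell prints 7, matching the description; they are not copied from the original example.

```python
# Cell 1: some arithmetic whose result is printed beneath the cell.
total = 3 + 4
print(total)

# Cell 2: refers to `total`, so Cell 1 must have been run first.
doubled = total * 2
print(doubled)
```

Running Cell 2 in a fresh kernel before Cell 1 raises a NameError, because `total` does not exist yet.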

How to use a notebook

Click on the grey part of a code cell to enter ‘edit’ mode. Write the code you want to run and press ‘ctrl + ENTER’ to run that cell. I typically use ‘shift + ENTER’ as it not only runs the cell, but also selects the next cell down or creates a new one if there is not one already beneath it.

Markdown cells are run the same way as code cells. To change a code cell to a Markdown cell, select it in the command mode (signified by the blue outline) by clicking on the cell – but outside the grey area – then press ‘M’.

To delete a cell, select it in command mode and double tap ‘D’.

To force-halt a cell you can press the button signified by the ‘Stop’ icon at the top of the page.

It’s a good idea to take a look at the other options in the notebook. You’ll find examples of both interesting and typical commands under the headers at the top of the page. For example, saving is found under ‘File’.

Tips for using Jupyter Notebook

Over the years, I’ve learned a lot about using Jupyter Notebook, including how to avoid some common pitfalls. Here are my top tips based on everything I’ve learned so far:

Running <PACKAGE NAME>? in a code cell brings up a description of the package. This also works with class names.
Try to keep your cells in the order in which you want them to be run. This way, when you reopen the notebook you can just fire through the cells using ‘shift + ENTER’.
If something is behaving weirdly, try resetting your notebook by restarting its kernel from the ‘Kernel’ menu, which clears every variable. You can then run each cell in order and see the result. It can be easy to get lost in the variable environment you’ve built up, as it may contain strange assignments and accidental reassignments. In these cases, nothing beats switching it off and on again.
You can also reset a notebook by stopping the Jupyter server running in the terminal with ‘ctrl + C’. However, you will then have to re-run poetry run jupyter notebook.
Be aware that closing the tab does not stop the notebook running. To stop the notebook you can navigate to the initial browser tab opened by poetry run jupyter notebook, select the notebook that’s running and then press ‘shutdown’.
Be careful about which variable names you use, as you might accidentally rewrite a value assigned in an earlier cell. If you go back and run cells that depend on that value, you might get unexpected results.
This is a specific matplotlib consideration, but when you are setting axis titles make sure to run plt.xlabel('xlabel') and not plt.xlabel = 'xlabel'. The assignment replaces the plt.xlabel function with a string, and you’ll have to reset the notebook to get it back to normal.
When you want to restart a notebook, make sure you’re shutting it down properly. Remember that closing and reopening the tab does not reset the notebook.
When you make a change to a function or class definition, make sure that you re-run the cell and that it completes without any errors. Otherwise, the other cells will be unaware of the change and will still use the old definition.
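
The matplotlib tip above comes down to the difference between calling a function and assigning over it. As a sketch that runs without matplotlib installed, the example below reproduces the same mistake on the standard library's math module, used here purely as a stand-in for plt. The `saved_sqrt` variable is an addition so that the sketch can undo the damage; in a notebook you would normally restart the kernel instead.

```python
import math

# Cell 1: the correct form calls the function.
root = math.sqrt(16.0)
print(root)

# Cell 2: the mistaken form assigns to the attribute instead of calling it.
saved_sqrt = math.sqrt  # kept only so this sketch can restore the function
math.sqrt = "sqrt"      # same shape of mistake as plt.xlabel = 'xlabel'

# Cell 3: every later call now fails, in any cell that uses the module.
try:
    math.sqrt(16.0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Put the real function back so the rest of the session keeps working.
math.sqrt = saved_sqrt
```

The failure only shows up when the function is next called, which may be several cells after the faulty assignment.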

That’s it for this introduction to getting started with Jupyter Notebook. I hope I’ve interested you in trying it out and that you can have fun using it!
