Jupyter Notebooks: Work with data, code, and comments all under one roof
Get started with a great tool for creating versatile computational notebooks
Hey Grokking Python readers!
Today we're going to cover one of the most widely used online platforms for sharing code, documentation, and multimedia visualizations all in one interactive notebook. If you've ever wanted to streamline your collaborative process, house code and accompanying documentation in one space, or outsource the computational lifting to a second machine, keep reading.
Jupyter Notebook is an open-source web application used to create and share documents that have live code, equations, visualizations, and text. It’s maintained by the Project Jupyter community. In 2014, Project Jupyter and Jupyter Notebook became spin-off projects from IPython and the IPython Notebook.
Jupyter Notebooks are document-centric and feature a simple and clear user interface. For this reason, they are very popular; there are over 9 million notebooks available on GitHub. These notebooks are used for a variety of things including:
This edition of Grokking Python will walk you through:
Why you may want to use a web-based interface like Jupyter Notebook
What the setup process entails
How to get the most out of it
Why bother with Jupyter Notebooks?
There are other solutions for collaborative coding, like GitHub Codespaces, but Jupyter Notebooks are the standard computational notebook for data management. If you've never used an online service for sharing code or documentation, you may wonder why you would need something like this.
While we typically spotlight Python-specific technologies on Grokking Python, it is worth noting that Jupyter Notebooks support over 40 languages, including all those most commonly used.
Here are just a few other reasons that Jupyter Notebooks are great:
Live coding environments: Code can be changed and run in real-time with feedback provided directly in the browser
Interactive output: Code can produce rich output like HTML, LaTeX, images, and videos
Sharing capabilities: You can share your notebooks with others using email, Dropbox, GitHub, or Jupyter Notebook Viewer
Documentation: Notebooks support Markdown in text cells and feature inline output
Satellite computing: Notebooks are ideal for those who want to perform computationally heavy tasks remotely from a lightweight machine
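To make the "interactive output" point concrete, here is a minimal sketch of how a plain Python object can ask the notebook for rich rendering. Jupyter looks for a method named _repr_html_ on the value a cell returns and, if present, displays the HTML it produces. The ScoreTable class and its sample rows are hypothetical, made up purely for illustration.

```python
from html import escape


class ScoreTable:
    """Hypothetical helper that shows rows as an HTML table in a notebook."""

    def __init__(self, rows):
        self.rows = list(rows)

    def _repr_html_(self):
        # Jupyter calls this hook (when it exists) to render rich HTML output.
        body = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{escape(str(score))}</td></tr>"
            for name, score in self.rows
        )
        return f"<table><tr><th>Name</th><th>Score</th></tr>{body}</table>"

    def __repr__(self):
        # Plain-text fallback used outside a notebook (for example, in a terminal).
        return f"ScoreTable({self.rows!r})"


table = ScoreTable([("Ada", 95), ("Grace", 98)])
html = table._repr_html_()
```

In a notebook, leaving table as the last expression of a code cell should show a formatted table rather than the plain-text repr.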
Getting started with Jupyter Notebook
Unless you've specifically included it when setting up your current IDE or code editor, you'll need to install Jupyter.
This can be done a couple of ways, but the easiest is via pip, Python's own package installer.
$ pip install jupyter
It’s worth noting: If you already have the Python data science platform Anaconda installed, it comes with Jupyter Notebook, so you're ready to go.
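If you aren't sure whether Jupyter is already available in your current Python environment, a quick check like the sketch below can help. It only uses the standard library and looks for two importable modules: notebook (the Notebook application) and jupyter_core (which provides the jupyter command). The is_installed helper is our own name, not part of Jupyter.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


# Check the packages that a working Jupyter Notebook install provides.
status = {name: is_installed(name) for name in ("notebook", "jupyter_core")}

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If either one is reported as missing, the pip command above should install it.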
If you just want to try using a notebook without going through the installation process, you can visit www.jupyter.org/try. Just select the "Jupyter Notebook" button and you'll be treated to a tutorial of Jupyter Notebooks and the IPython kernel classic interface.
Setting up the server
The Jupyter Notebook web application is designed for a single user, who can also run it as a public server to reach their own machine remotely. Allowing multiple users to access the same notebook server may result in commands overwriting one another. That said, you can create a multi-user server using JupyterHub. (More on this later!)
All we need to do to set up the server is create a new folder and then go to that folder location in our terminal. Then, we can run this command to start Jupyter:
$ jupyter notebook
This command will open your default browser to the Jupyter Notebook server. Now we can create our very first notebook. More detailed information on the configuration process for Jupyter can be found in the Jupyter Notebook documentation.
Creating a new notebook
Creating a new notebook is as simple as selecting "New" from the "File" drop-down menu.
A fresh notebook should look something like this.
You'll notice that the first few drop-down menus, "File", "Edit", "View", and "Insert", are fairly standard and do just what you'd expect.
File: the place to create new notebooks and load old ones
Edit: allows you to manipulate cells
View: toggles headers, toolbars, and line numbers
The next few are slightly different.
Run: runs selected cells
Kernel: controls the execution of the code
Settings: changes the theme or language
Help: offers tutorials and reference material
Jupyter Notebooks in action
Below is an image that showcases the simplicity of combining code with documentation in a Jupyter Notebook.
Jupyter Notebook enables a rudimentary form of version control by offering "checkpoints." You can save and load checkpoints that revert your notebook to earlier iterations.
The two main cell types in a notebook are code and Markdown, but there is an option for raw text if you prefer.
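Under the hood, a .ipynb file is a JSON document, and each cell records its type. The sketch below builds a tiny notebook structure by hand with one cell of each type, just to show what that looks like. The make_cell helper and the sample cell contents are hypothetical, and we assume notebook format version 4.4 here; in practice you would normally let Jupyter create this file for you.

```python
import json


def make_cell(cell_type, source):
    """Build one notebook cell as a plain dict (illustrative helper)."""
    if cell_type not in {"code", "markdown", "raw"}:
        raise ValueError(f"unknown cell type: {cell_type}")
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells also track how many times they ran and what they printed.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "cells": [
        make_cell("markdown", "# Sales report\nA short *Markdown* introduction."),
        make_cell("code", "total = 2 + 3\ntotal"),
        make_cell("raw", "Plain text that is passed through unchanged."),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# The text that would be saved in a file ending in .ipynb.
notebook_json = json.dumps(notebook, indent=1)
```

Writing notebook_json to a file such as example.ipynb (a made-up name) should give you something Jupyter can open, with the three cells in order.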
Adding new cells is as simple as navigating to an existing cell and pressing the leftmost icon of the group.
The arrow icons allow you to move cells up and down in the notebook
The two icons to the right insert a cell above and below the current, respectively
And the trash can icon deletes the current cell
Sharing your notebook
Jupyter comes with a built-in file conversion tool called nbconvert. This tool allows you to convert a .ipynb notebook into different file formats. File formats available for conversion include:
Markdown
PDF
WebPDF
LaTeX
HTML
Reveal.js HTML slideshow
reStructuredText
Python script
AsciiDoc
Executable script
Other notebook formats
To use nbconvert:
Open up the terminal
Go to the folder with the notebook to be converted
Run the command
The command to convert looks like this:
$ jupyter nbconvert --to <output format> <input notebook>
Once converted, the notebook can be shared however you choose!
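If you convert notebooks often, you may prefer to assemble the nbconvert command from a script. The sketch below is one way to do that: it maps the formats listed above to the names nbconvert expects after --to and returns the command as a list of arguments. The build_nbconvert_command helper and the analysis.ipynb file name are hypothetical, and the sketch only builds the command rather than running it.

```python
import shlex

# Names accepted by "jupyter nbconvert --to", matching the formats listed above.
SUPPORTED_FORMATS = {
    "markdown", "pdf", "webpdf", "latex", "html", "slides",
    "rst", "python", "asciidoc", "script", "notebook",
}


def build_nbconvert_command(notebook_path, output_format):
    """Return the nbconvert command as an argument list (illustrative helper)."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    return ["jupyter", "nbconvert", "--to", output_format, str(notebook_path)]


command = build_nbconvert_command("analysis.ipynb", "html")
print(shlex.join(command))  # jupyter nbconvert --to html analysis.ipynb
```

To actually run the conversion, you could pass the list to subprocess.run(command, check=True) from the folder that contains the notebook.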
JupyterHub
Briefly mentioned earlier, JupyterHub is another offering from Project Jupyter. JupyterHub is a multi-user version of Jupyter Notebook. It brings the convenience and power of notebooks to teams, classrooms, and labs.
JupyterHub is often used by educators as it provides an excellent interface for meshing sample code with supporting explanations. It creates a central and transparent location for notes and assignments.
In a professional setting, the hub allows you to deploy your notebooks to your organization, scale your deployment with Docker and Kubernetes, and provide uniform data management and access within your company.
What to do next with Jupyter Notebooks and Python
Hopefully you found this information helpful! Jupyter Notebooks are some of the most popular computational notebooks out there. They are commonly used in data science and machine learning, so if you're currently working in those fields or even just find yourself interested in them, Jupyter is a powerful resource to know.
If your data science or machine learning work requires something a little more capable and maybe a touch more involved, consider JupyterLab, Project Jupyter's newer interface. JupyterLab is a web-based IDE for Jupyter Notebooks. It can run terminals and text editors, and it supports the creation of custom plugins.
Data science and machine learning are both highly in-demand fields, and Python is an extremely popular language for both. If you want to learn how to get some experience under your belt, we have compiled a list of the most popular Python libraries for data science and machine learning.
As always, happy learning!