An Iterative Workflow for Jupyter Notebook

Inspecting a data set is usually not a linear process; it involves iterative refinement of the initial assumptions and questions. One might start with a question but, after taking a look at the data set, realize that it requires multiple preprocessing steps before it can be used to address that question.

Jupyter Notebook may be very nice for presenting the results of a data analysis, but using it for an iterative workflow feels not nearly as productive as working on the command line with the Python interpreter, which raises the question of whether there is room for improvement.

One thing I noticed that eats up a lot of time while working with Jupyter Notebook is scrolling through the notebook to locate something in a cell. This can be a snippet of code, some documentation or an error message.

More keyboard, less mouse

One obvious way to increase productivity is to work with the keyboard as much as possible, not touching the mouse unless absolutely necessary. Reminiscent of Vim's modes, Jupyter Notebook allows switching between edit mode and command mode. Command mode makes it possible to move up and down between cells, insert and remove cells, etc., and there are keyboard shortcuts for at least some menu entries.

Less scrolling in error messages

Another thing I noticed is that error messages feel very verbose compared to the command-line interface. To see the actual exception one often needs to repeatedly scroll downwards. When comparing the form of presentation to the Python command-line interface I noticed that the error messages not only include the stack trace but also the context of each function call. This may certainly be useful if one is debugging or developing an API, but otherwise it is often merely a form of distraction. Having less verbose error messages would mean spending less time on scrolling through the notebook.

Luckily, the IPython kernel in Jupyter Notebook provides several levels of verbosity for exceptions: Plain, Context, Verbose and Minimal. The default is Context, which is even more verbose than the default in the Python interpreter. Minimal omits the stack trace, showing just the actual exception.

To set the exception verbosity for an individual notebook, one can use the %xmode magic command, for example %xmode Minimal. There is also a configuration setting, c.InteractiveShell.xmode, that can be set in ipython_kernel_config.py.
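As a rough sketch, the same switch can be made from ordinary Python code, which is handy in a shared setup cell. The helper name set_exception_mode is made up for this example; it only calls IPython's get_ipython() and run_line_magic(), and does nothing when no IPython shell is active.

```python
def set_exception_mode(mode="Minimal"):
    """Switch IPython's traceback style; return False outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False  # plain Python process, no active IPython shell
    # Equivalent to typing `%xmode <mode>` in a notebook cell.
    shell.run_line_magic("xmode", mode)
    return True


applied = set_exception_mode("Minimal")
```

For a permanent default, the corresponding line in ipython_kernel_config.py would be c.InteractiveShell.xmode = "Minimal".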

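In comparison, the plain Python interpreter offers less control. The variable sys.tracebacklimit limits the number of stack frames printed for an unhandled exception; setting it to 0 suppresses the traceback and prints only the exception type and value. The command-line option -v and the environment variable PYTHONVERBOSE, despite their names, trace module imports and do not affect how exceptions are displayed.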
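To illustrate the effect of sys.tracebacklimit, here is a small sketch that renders the same exception twice through the interpreter's default exception hook, once with the usual traceback and once with the limit set to 0. The functions inner, outer and render_uncaught are hypothetical names used only for this demonstration.

```python
import contextlib
import io
import sys

_MISSING = object()  # marker for "sys.tracebacklimit was not set"


def inner():
    raise ValueError("bad value")


def outer():
    inner()


def render_uncaught(limit=None):
    """Return the text the default excepthook prints for the error in outer()."""
    previous = getattr(sys, "tracebacklimit", _MISSING)
    if limit is not None:
        sys.tracebacklimit = limit
    buffer = io.StringIO()
    try:
        try:
            outer()
        except ValueError as exc:
            # Capture what would normally be written to stderr.
            with contextlib.redirect_stderr(buffer):
                sys.__excepthook__(type(exc), exc, exc.__traceback__)
    finally:
        # Restore the interpreter state so the demo has no side effects.
        if previous is _MISSING:
            if hasattr(sys, "tracebacklimit"):
                del sys.tracebacklimit
        else:
            sys.tracebacklimit = previous
    return buffer.getvalue()


full_report = render_uncaught()          # stack frames plus the exception line
short_report = render_uncaught(limit=0)  # exception line only
print(full_report)
print(short_report)
```

The second report should be much shorter than the first, which is roughly what the Minimal mode does in IPython.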

Connect the terminal with the browser

Minimizing the number of mouse interactions by using keyboard shortcuts and reducing the scrolling through error messages by making them less verbose certainly have some potential to make an iterative workflow in Jupyter Notebook more productive.

A different approach would be to integrate the Python interpreter in the terminal with the notebook in the browser, making it possible to switch seamlessly back and forth. Any variable created in the notebook should be available in the Python interpreter in the terminal and vice versa. While Jupyter does allow opening a console within the browser, it does not provide the same user experience as the Python interpreter in a terminal window.

Connecting the original Python interpreter to the kernel of an already running notebook is certainly technically possible, but there seems to be no convenient built-in method that wraps things up nicely.

It is, however, possible to connect an IPython interpreter in the terminal to the kernel of an already running notebook, using either jupyter console --existing to connect to the most recently started kernel or jupyter console --existing kernel.json to connect to a specific kernel. To get the name of the connection file from an already running notebook, one can use %connect_info.
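When several notebooks are running, it can help to check from the terminal which connection file is the newest before passing it to --existing. The following is a minimal sketch under two assumptions: connection files are named kernel-*.json, and they live in the directory printed by jupyter --runtime-dir. The function name newest_connection_file is invented for this example.

```python
from pathlib import Path


def newest_connection_file(runtime_dir):
    """Return the most recently modified kernel-*.json file, or None."""
    files = Path(runtime_dir).glob("kernel-*.json")
    # max() with default=None handles an empty directory gracefully.
    return max(files, key=lambda path: path.stat().st_mtime, default=None)
```

Calling newest_connection_file() with the runtime directory yields a path that can then be handed to jupyter console --existing.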

While the IPython interpreter is supposed to provide an improved user experience, I prefer the original Python interpreter. A workaround might be to adjust the behaviour of the IPython interpreter to align more closely with the original Python interpreter, but I have not investigated whether that is possible. Nonetheless, being able to connect from the terminal to an already running notebook in Jupyter is a promising productivity gain.
