Some thoughts about current design problems
While this is a nice example of how to get a Jupyter notebook working on Euler, it also exposes some problems:
- the notebook server itself should not be part of a bsub command, simply because it is not the notebook server that is doing the hard work
- it is the kernel which actually executes the code and does the hard work
- I am not an expert in bsub, but any bsub job should terminate at some point rather than run forever doing nothing
- on the other hand, a notebook server should stay up almost forever (hence the name, «server») and not be dropped suddenly just because the Euler job ends
- but if we background the server inside the bsub command with `jupyter notebook &`, the job runs forever and possibly blocks other jobs from being executed (am I wrong?). Already today, just starting a notebook server can take a very long time before the cluster finally picks up my request
- users will gladly wait for an actual job to finish, but they should not have to wait just for a notebook to open!
All this being said: we need to design the whole setup differently. We should write a Jupyter kernel that wraps each cell into a bsub command and executes it on the cluster. To keep the user sessions alive, they should be hosted with JupyterHub on a dedicated machine outside of Euler.
To put it in different words: a Jupyter notebook relies on the so-called REPL (Read-Eval-Print-Loop) principle. This is fundamentally different from the batch job system we have on Euler. A REPL is idle most of the time, waiting for input (ideal for interactive programming!), whereas a batch job system like bsub tries to maximize throughput for all users by scheduling their jobs over time and putting them in a queue.
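The REPL principle described above can be sketched in a few lines; the function names here are illustrative only. The point is that the loop spends most of its life blocked in the Read step, which is exactly the behaviour a batch scheduler is not designed for:

```python
def repl(read_fn, write_fn):
    """A minimal Read-Eval-Print-Loop. read_fn blocks until input
    arrives (the loop is idle meanwhile); None ends the session."""
    env = {}
    while True:
        line = read_fn()                 # Read: block, mostly idle
        if line is None:
            break
        try:
            result = eval(line, env)     # Eval the expression
        except Exception as exc:
            result = exc
        write_fn(repr(result))           # Print the result, then Loop
```

A batch job, by contrast, is handed all of its input up front, runs to completion, and exits, which is why the two models fit together so poorly.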
Rok's fork of an existing project in this direction might be a good starting point: