JupyterHub is a multi-user web server for Jupyter notebooks. It consists of four subsystems; for details, see the technical overview in the JupyterHub documentation.
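Installation
Install the jupyterhub (AUR) package. In most cases you will also need to install the jupyter-notebook package (some of the more advanced spawners may not need it). You can also install the jupyterlab package to make the JupyterLab interface available.
Running
Start/enable jupyterhub.service. With the default configuration, you can access the hub by going to 127.0.0.1:8000 in your browser.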
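Configuration
The JupyterHub configuration file is located at /etc/jupyterhub/jupyterhub_config.py. It is a Python script that modifies the configuration object c. The configuration file provided by the package shows the available configuration options and their default values.
Any relative paths in the configuration are resolved from the working directory in which the hub is run. The systemd service provided by the package uses /etc/jupyterhub as its working directory. This means, for example, that the default database URL c.JupyterHub.db_url = 'sqlite:///jupyterhub.sqlite' corresponds to the file /etc/jupyterhub/jupyterhub.sqlite.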
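The path resolution described above can be sketched with the standard library. This is only an illustration: the helper name resolve_sqlite_path is hypothetical and not part of JupyterHub, and it assumes a file-based SQLite URL (three slashes followed by a relative path, or four slashes for an absolute path).

```python
from pathlib import PurePosixPath


def resolve_sqlite_path(db_url: str, workdir: str) -> str:
    """Return the file that a file-based sqlite URL points to.

    Hypothetical helper for illustration only; the hub itself simply
    relies on the working directory of its process.
    """
    prefix = "sqlite:///"
    if not db_url.startswith(prefix):
        raise ValueError("not a file-based sqlite URL")
    # What follows the prefix is the path: relative paths are resolved
    # against the working directory, absolute paths are kept as they are.
    path = PurePosixPath(db_url[len(prefix):])
    return str(path if path.is_absolute() else PurePosixPath(workdir) / path)


print(resolve_sqlite_path("sqlite:///jupyterhub.sqlite", "/etc/jupyterhub"))
```

With the packaged service, the working directory argument would be /etc/jupyterhub; a hub started by hand from another directory would resolve the same URL to a different file.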
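All configuration options can be overridden on the command line. For example, the setting c.Application.show_config = True in the configuration file can be replaced by the command-line flag --Application.show_config=True. Note that the provided systemd service uses the command line to explicitly set c.JupyterHub.pid_file and c.ConfigurableHTTPProxy.pid_file to a suitable runtime directory, so any values set for these options in the configuration file are ignored.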
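The correspondence between a configuration attribute and its command-line flag can be sketched as follows. The function to_flag is a hypothetical helper written for this illustration; it only formats strings and does not call JupyterHub.

```python
def to_flag(option: str, value) -> str:
    """Format a config attribute such as 'c.JupyterHub.db_url' as a CLI flag.

    Hypothetical helper: drops the leading 'c.' of the config object and
    prefixes the remaining Class.trait name with '--'.
    """
    name = option[2:] if option.startswith("c.") else option
    return f"--{name}={value}"


print(to_flag("c.Application.show_config", True))
```

The printed flag is the one quoted in the paragraph above.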
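Authenticators
Authenticators control access to the hub and the single-user servers. The authenticators section of the documentation contains details about how authenticators work and how to write a custom authenticator. The authenticators wiki page has a list of authenticators; some of these are packaged and are described below.
Note that user state is stored in a cookie which is encrypted with the cookie secret. If you switch to a different authenticator, or modify the settings of your chosen authenticator in a way that may change the list of allowed users, you should change the cookie secret. This logs out all current users and forces them to authenticate again with the new settings. To do this, delete the cookie secret file and restart the hub, which will automatically generate a new secret. In the default configuration, the cookie secret is stored in /etc/jupyterhub/jupyterhub_cookie_secret.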
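A minimal sketch of the first half of that procedure, assuming the default secret location given above. The function name remove_cookie_secret is hypothetical; it has to be run with enough privileges to delete the file, and the hub still has to be restarted afterwards (for example by restarting jupyterhub.service).

```python
from pathlib import Path


def remove_cookie_secret(
    path: str = "/etc/jupyterhub/jupyterhub_cookie_secret",
) -> bool:
    """Delete the cookie secret file; return True if a file was removed.

    Hypothetical helper for illustration. The hub generates a new secret
    the next time it starts.
    """
    secret = Path(path)
    if secret.is_file():
        secret.unlink()
        return True
    return False
```

Returning a boolean rather than raising makes the helper safe to call when the secret has already been removed.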
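PAM authenticator
The PAM authenticator uses PAM to allow local users to log in to the hub. It is included with JupyterHub and is the default authenticator. Using it requires the hub to have read access to /etc/shadow (which contains the hashed versions of user passwords) in order to authenticate users. By default, /etc/shadow is owned by root with file permissions -rw-------, so running the hub as root satisfies this requirement. Some sources advocate removing all permissions from /etc/shadow so that it cannot be read by a compromised daemon, and granting the DAC_OVERRIDE capability to the processes that need access. If your /etc/shadow is set up this way, create a drop-in file for the service to grant this capability to JupyterHub:
/etc/systemd/system/jupyterhub.service.d/override.conf
[Service]
CapabilityBoundingSet=CAP_DAC_OVERRIDE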
The PAM authenticator relies on the Python package pamela. For basic troubleshooting, it can be tested on the command line. To attempt authentication as user testuser, run the following command:
# python -m pamela -a testuser
(If you run JupyterHub as a non-root user, run the command as that user instead of root.) If the authentication succeeds, no output is printed. If it fails, an error message is printed.
PAM authentication as non-root user
If you run JupyterHub as a non-root user, you will need to give that user read permission on the shadow file. The method recommended by the JupyterHub documentation is to create a shadow group, make the shadow file readable by this group, and add the JupyterHub user to this group.
Warning: This gives read access to the hashed passwords in /etc/shadow to anybody running code as the JupyterHub user. Note that each single-user server is run under its user's own account, so code executed in those servers will not have access. Also note that a security exploit in JupyterHub would allow the same access to the hashed passwords if JupyterHub were being run as root.
Creating the group, modifying the shadow file permissions and adding the user jupyterhub to the group can be accomplished with the following four commands:
# groupadd shadow
# chgrp shadow /etc/shadow
# chmod g+r /etc/shadow
# usermod -aG shadow jupyterhub
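To check the result, a small script such as the following can be run as the jupyterhub user. It is only a troubleshooting sketch: the function name can_read is hypothetical, and the script tests nothing beyond read permission. Keep in mind that a new group membership only applies to processes started after the change, so restart the service before drawing conclusions.

```python
import os


def can_read(path: str = "/etc/shadow") -> bool:
    """Return True if the current process is allowed to read `path`."""
    return os.access(path, os.R_OK)


if __name__ == "__main__":
    print("readable" if can_read() else "not readable")
```

If the script reports that the file is not readable, PAM authentication from the hub running as that user would be expected to fail as well.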
Spawners
Spawners are responsible for starting and monitoring each user's notebook server. The spawners section of the documentation contains more details about how they work and how to write a custom spawner. The spawners wiki page has a list of spawners; some of these are packaged and are described below.
LocalProcessSpawner
This is the default spawner included with JupyterHub. It runs each single-user server in a separate local process under the corresponding user account (this means each JupyterHub user must correspond to a local user account). It also requires JupyterHub to be run as root so it can spawn the processes under the different user accounts. The jupyter-notebook package must be installed for this spawner to work.
SudoSpawner
The SudoSpawner uses an intermediate process created with sudo to spawn the single-user servers. This allows the JupyterHub process to be run as a non-root user. To use it, install the jupyterhub-sudospawner (AUR) package.
Next, create a system user account (the following assumes the account is named jupyterhub) and a group whose membership will define which users can access the hub (here assumed to be called jupyterhub-users). First, configure sudo to allow the jupyterhub user to spawn a server without a password. Create a drop-in sudo configuration file with visudo:
# visudo -f /etc/sudoers.d/jupyterhub-sudospawner
# The command the hub is allowed to run.
Cmnd_Alias SUDOSPAWNER_CMD = /usr/bin/sudospawner
# Allow the jupyterhub user to run this command on behalf of anybody
# in the jupyterhub-users group.
jupyterhub ALL=(%jupyterhub-users) NOPASSWD:SUDOSPAWNER_CMD
The default service file runs the hub as root. It also applies a number of hardening options to the service to restrict its capabilities. This hardening prevents sudo from working; to allow it, the NoNewPrivileges service option (plus any other options which implicitly set it, see systemd.exec(5) for a list of service options) needs to be off. Create a drop-in file to run the hub using the jupyterhub user instead:
/etc/systemd/system/jupyterhub.service.d/override.conf
[Service]
User=jupyterhub
Group=jupyterhub
# Required for sudo.
NoNewPrivileges=false
# Setting the following would implicitly set NoNewPrivileges.
PrivateDevices=false
ProtectKernelTunables=false
ProtectKernelModules=false
LockPersonality=false
RestrictRealtime=false
RestrictSUIDGID=false
SystemCallFilter=
SystemCallArchitectures=
If you have previously run the hub as the root user, you will need to change the ownership of the user database and cookie secret files:
# chown jupyterhub:jupyterhub /etc/jupyterhub/{jupyterhub_cookie_secret,jupyterhub.sqlite}
If you are using the PAMAuthenticator, you will need to configure your system to allow it to work as a non-root user.
Finally, edit the JupyterHub configuration and change the spawner class to SudoSpawner:
/etc/jupyterhub/jupyterhub_config.py
c.JupyterHub.spawner_class='sudospawner.SudoSpawner'
To give a user access to the hub, add them to the jupyterhub-users group:
# usermod -aG jupyterhub-users <username>
systemdspawner
The systemdspawner uses systemd to manage each user's notebook server, which allows configuring resource limits, better process isolation and sandboxing, and dynamically allocated users. To use it, install the jupyterhub-systemdspawner (AUR) package and set the spawner class in the configuration file:
/etc/jupyterhub/jupyterhub_config.py
c.JupyterHub.spawner_class = 'systemdspawner.SystemdSpawner'
Note that, as per systemdspawner's readme, using it currently requires JupyterHub to be run as root.
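As an illustration of the resource limits, sandboxing and dynamic users mentioned above, a configuration fragment might look like the following. It is a sketch rather than a recommended setup: the option names are given as they appear in the systemdspawner readme and should be checked against the installed version, and the values are arbitrary examples.

```python
# Fragment for /etc/jupyterhub/jupyterhub_config.py.
# The object `c` is provided by JupyterHub when it loads the file.
c.JupyterHub.spawner_class = 'systemdspawner.SystemdSpawner'

# Resource limits applied to each user's server (example values).
c.SystemdSpawner.mem_limit = '4G'
c.SystemdSpawner.cpu_limit = 2.0

# Sandboxing: give each server a private /tmp and restrict device access.
c.SystemdSpawner.isolate_tmp = True
c.SystemdSpawner.isolate_devices = True

# Run servers as dynamically allocated systemd users instead of
# requiring a local account for every JupyterHub user.
c.SystemdSpawner.dynamic_users = True
```

Each option can be dropped independently; setting only the spawner class keeps the spawner's defaults.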
Services
A JupyterHub service is defined as a process which interacts with the Hub through its API. Services can either be run by the hub or as standalone processes.
Idle culler
The idle culler service can be used to automatically shut down idle single-user servers. To use it, install the jupyterhub-idle-culler (AUR) package. To run the service through the hub, add a service description to the c.JupyterHub.services configuration variable:
/etc/jupyterhub/jupyterhub_config.py
import sys

c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'admin': True,
        'command': [
            sys.executable,
            '-m', 'jupyterhub_idle_culler',
            '--timeout=3600'
        ],
    }
]
See the service documentation or the output of python -m jupyterhub_idle_culler --help
for a description of command-line options and details of how to run the service as a standalone process.
Tips and Tricks
Running as non-root user
By default, the main hub process is run as the root user (the individual user servers are run under the corresponding local user as set by the spawner). To run as a non-root user, you need to use the SudoSpawner (the other spawners listed above require running as root). If you are using the PAM authenticator, you will also need to configure it for a non-root user.
Using a reverse proxy
A reverse proxy can be used to redirect external requests to the JupyterHub instance. This can be useful if you want to serve multiple sites from one machine, or use an existing server to handle SSL. The "Using a reverse proxy" section of the JupyterHub documentation has example configurations for using either nginx or Apache as a reverse proxy.
Proxy other web services
The Jupyter Server Proxy extension allows you to run other web services such as Code Server or RStudio alongside JupyterHub and provide authenticated web access to them. To use it, install python-jupyter-server-proxy (AUR) and configure it in the /etc/jupyter/jupyter_notebook_config.py file. For instance, to proxy code-server (AUR):
/etc/jupyter/jupyter_notebook_config.py
c.ServerProxy.servers = {
    'code-server': {
        'command': [
            'code-server',
            '--auth=none',
            '--disable-telemetry',
            '--disable-update-check',
            '--bind-addr=localhost:{port}',
            '--user-data-dir=.config/Code - OSS/',
            '--extensions-dir=.vscode-oss/extensions/'
        ],
        'timeout': 20,
        'launcher_entry': {
            'title': 'VS Code'
        }
    }
}
See the documentation for more details about configuring the Jupyter Server Proxy.
Access to GPUs
If you receive errors when accessing GPUs (for instance, if nvidia-smi reports that it cannot communicate with the NVIDIA driver), the hardening shipped with the JupyterHub systemd unit file may be the cause.
To allow access to GPUs (and other hardware) broadly, you can add this to a drop-in file:
/etc/systemd/system/jupyterhub.service.d/override.conf
[Service]
PrivateDevices=false
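With PrivateDevices enabled, a service gets a private /dev that does not contain physical devices, so the GPU device nodes are simply not visible to the notebook servers. The following sketch, meant to be run from a notebook or terminal started through the hub, lists the nodes the process can see. The function name visible_devices is hypothetical, and the default pattern /dev/nvidia* assumes the usual node names of the proprietary NVIDIA driver.

```python
import glob


def visible_devices(pattern: str = "/dev/nvidia*") -> list[str]:
    """Return the device nodes matching `pattern` that this process can see."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    nodes = visible_devices()
    print(nodes if nodes else "no NVIDIA device nodes visible")
```

If nodes are listed on the host but not from within a spawned server, the service hardening is the likely difference.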