If you don't know what Jupyter Notebooks are, I encourage you to check them out. They give you a way to write programs in a variety of programming languages and to share them with your friends and colleagues (even if they are not as tech savvy as you are). JupyterHub, in turn, gives you a way to easily manage a large number of notebooks served over the internet.

Now, wouldn't it be cool if you could display Jupyter notebooks inside Nextcloud, and have access to all your Nextcloud files from inside the Jupyter interface?

I think so.


TL;DR

Our code and k8s manifests are available here, in the jupyter directory: https://platform.sunet.se/Drive/k8s-manifests   

Prerequisites

For our use case we will deploy JupyterHub on Kubernetes using the wonderful Zero to JupyterHub with Kubernetes. To make the file syncing work as expected later, we unfortunately need to patch the Helm charts (I will see if I can upstream these changes, so stay tuned for future updates).

hub.patch
diff --git a/jupyterhub/templates/hub/deployment.yaml b/jupyterhub/templates/hub/deployment.yaml
index d6e1c63..359e981 100644
--- a/jupyterhub/templates/hub/deployment.yaml
+++ b/jupyterhub/templates/hub/deployment.yaml
@@ -33,6 +33,10 @@ spec:
         {{- . | toYaml | nindent 8 }}
         {{- end }}
     spec:
+      hostAliases:
+        - ip: "127.0.0.1"
+          hostnames:
+            - "hub"
       {{- if .Values.scheduling.podPriority.enabled }}
       priorityClassName: {{ include "jupyterhub.priority.fullname" . }}
       {{- end }}
@@ -211,6 +215,8 @@ spec:
           ports:
             - name: http
               containerPort: 8081
+            - name: refresh-token
+              containerPort: 8082
           {{- if .Values.hub.livenessProbe.enabled }}
           {{- /* NOTE:
             We don't know how long hub database upgrades could take so having a

helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update
helm fetch jupyterhub/jupyterhub --version 3.2.1 --untar --untardir .
patch -p1 < hub.patch
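Once you have created a values.yaml as described in the rest of this post, deploying is a single Helm command. A sketch, where the release name and namespace are my own assumptions, so adjust to taste:

helm upgrade --install jupyterhub ./jupyterhub \
  --namespace jupyter --create-namespace \
  --values values.yaml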


Authenticating Jupyter with Nextcloud

The first piece of the puzzle is to use Nextcloud for authentication in JupyterHub. This can be achieved out of the box with some configuration. In our case, we will implement a subclass of the GenericOAuthenticator class. The code takes care of OAuth2 authentication against Nextcloud and also refreshes the user's access_token when it expires.

You need to create an OAuth2 client in Nextcloud and get a client_id and client_secret from it: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/oauth2.html

These need to be stored as Kubernetes secrets; see the very bottom of this post for what they should be called.

The code looks like this:

      import os
      import time
      import requests
      from datetime import datetime
      from oauthenticator.generic import GenericOAuthenticator
      token_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/index.php/apps/oauth2/api/v1/token'
      debug = os.environ.get('NEXTCLOUD_DEBUG_OAUTH', 'false').lower() in ['true', '1', 'yes']

      def get_nextcloud_access_token(refresh_token):
        client_id = os.environ['NEXTCLOUD_CLIENT_ID']
        client_secret = os.environ['NEXTCLOUD_CLIENT_SECRET']

        code = refresh_token
        data = {
          'grant_type': 'refresh_token',
          'code': code,
          'refresh_token': refresh_token,
          'client_id': client_id,
          'client_secret': client_secret
        }
        response = requests.post(token_url, data=data)
        if debug:
          print(response.text)
        return response.json()

      def post_auth_hook(authenticator, handler, authentication):
        user = authentication['auth_state']['oauth_user']['ocs']['data']['id']
        auth_state = authentication['auth_state']
        auth_state['token_expires'] =  time.time() + auth_state['token_response']['expires_in']
        authentication['auth_state'] = auth_state
        return authentication

      class NextcloudOAuthenticator(GenericOAuthenticator):
        def __init__(self, *args, **kwargs):
          super().__init__(*args, **kwargs)
          self.user_dict = {}

        async def pre_spawn_start(self, user, spawner):
          super().pre_spawn_start(user, spawner)
          auth_state = await user.get_auth_state()
          if not auth_state:
            return
          access_token = auth_state['access_token']
          spawner.environment['NEXTCLOUD_ACCESS_TOKEN'] = access_token

        async def refresh_user(self, user, handler=None):
          auth_state = await user.get_auth_state()
          if not auth_state:
            if debug:
              print(f'auth_state missing for {user}')
            return False
          access_token = auth_state['access_token']
          refresh_token = auth_state['refresh_token']
          token_response = auth_state['token_response']
          now = time.time()
          now_hr = datetime.fromtimestamp(now)
          expires = auth_state['token_expires']
          expires_hr = datetime.fromtimestamp(expires)
          if debug:
            print(f'auth_state for {user}: {auth_state}')
          if now >= expires:
            if debug:
              print(f'Time is: {now_hr}, token expired: {expires_hr}')
              print(f'Refreshing token for {user}')
            try:
              token_response = get_nextcloud_access_token(refresh_token)
              auth_state['access_token'] = token_response['access_token']
              auth_state['refresh_token'] = token_response['refresh_token']
              auth_state['token_expires'] = now + token_response['expires_in']
              auth_state['token_response'] = token_response
              if debug:
                print(f'Successfully refreshed token for {user.name}')
                print(f'auth_state for {user.name}: {auth_state}')
              return {'name': user.name, 'auth_state': auth_state}
            except Exception as e:
              if debug:
                print(f'Failed to refresh token for {user}: {e}')
              return False
          if debug:
            print(f'Time is: {now_hr}, token expires: {expires_hr}')
          return True

      c.JupyterHub.authenticator_class = NextcloudOAuthenticator
      c.NextcloudOAuthenticator.client_id = os.environ['NEXTCLOUD_CLIENT_ID']
      c.NextcloudOAuthenticator.client_secret = os.environ['NEXTCLOUD_CLIENT_SECRET']
      c.NextcloudOAuthenticator.login_service = 'Sunet Drive'
      c.NextcloudOAuthenticator.username_claim = lambda r: r.get('ocs', {}).get('data', {}).get('id')
      c.NextcloudOAuthenticator.userdata_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/ocs/v2.php/cloud/user?format=json'
      c.NextcloudOAuthenticator.authorize_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/index.php/apps/oauth2/authorize'
      c.NextcloudOAuthenticator.token_url = token_url
      c.NextcloudOAuthenticator.oauth_callback_url = 'https://' + os.environ['JUPYTER_HOST'] + '/hub/oauth_callback'
      c.NextcloudOAuthenticator.allow_all = True
      c.NextcloudOAuthenticator.refresh_pre_spawn = True
      c.NextcloudOAuthenticator.enable_auth_state = True
      c.NextcloudOAuthenticator.auth_refresh_age = 3600
      c.NextcloudOAuthenticator.post_auth_hook = post_auth_hook      


Syncing files from Nextcloud to JupyterHub

To sync files from Nextcloud to JupyterHub, we will create a custom JupyterLab/Notebook image that has a sync client in it.

Dockerfile
FROM quay.io/jupyter/scipy-notebook:lab-4.0.10  
USER root
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
  curl \
  jq \
  nextcloud-desktop-cmd \
  python3-pip
COPY ./nc-sync /usr/local/bin/
USER jovyan
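Build the image and push it to a registry that your cluster can pull from. The tag below is the one referenced in the values.yaml later in this post; swap in your own registry if you are not using ours:

docker build -t docker.sunet.se/drive/jupyter-custom:lab-4.0.10-sunet2 .
docker push docker.sunet.se/drive/jupyter-custom:lab-4.0.10-sunet2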

The nc-sync script looks like this:

#!/bin/bash
rmdir --ignore-fail-on-non-empty /home/jovyan/work
workdir='/home/jovyan'
mkdir -p "${workdir}"
# Adjust the server URL to point at your own Nextcloud instance
server="https://sunet.drive.test.sunet.se"
json_file="${workdir}/.access_token.json"

function refresh_token {
  json="$(curl --header "Authorization: token ${JUPYTERHUB_API_TOKEN}" https://${JUPYTER_HOST}/services/refresh-token/tokens)"
  # If json is empty here we are early in the process and should have a fresh token from the environment
  if [[ -z "${json}" ]]; then
    token="${NEXTCLOUD_ACCESS_TOKEN}"
    json="{ \"access_token\": \"${token}\", \"token_expires\": $(date -d "10 min" +%s).0000000 }" 
  fi
  echo "${json}" > "${json_file}"
  token=$(jq -r '.access_token' "${json_file}")
  echo "${token}"
}

function get_token {
  # First we try to use our cache
  if [[ -f "${json_file}" ]]; then
    now=$(date +%s)
    token=$(jq -r '.access_token' "${json_file}")
    expires_at=$(jq -r '.token_expires' "${json_file}"| sed 's/\..*//')
    # If the token is expired, we fetch a new one
    if [[ "${expires_at}" -lt ${now} ]]; then
      token=$(refresh_token)
    fi
  else
    token=$(refresh_token)
  fi
  echo "${token}"
}

function ncsync {
  while true; do
    nextcloudcmd -s --user "${JUPYTERHUB_USER}" --password "$(get_token)" --path / "${workdir}" "${server}"
    sleep 5s
  done
}
ncsync &

For this sync script to work we also need a JupyterHub service:

      import os
      import sys
      c.JupyterHub.load_roles = [
          {
              "name": "refresh-token",
              "services": [
                "refresh-token"
              ],
              "scopes": [
                "read:users",
                "admin:auth_state"
              ]
          },
          {
              "name": "user",
              "scopes": [
                "access:services!service=refresh-token",
                "read:services!service=refresh-token",
                "self",
              ],
          },
          {
              "name": "server",
              "scopes": [
                "access:services!service=refresh-token",
                "read:services!service=refresh-token",
                "inherit",
              ],
          }
      ]
      c.JupyterHub.services = [
          {
              'name': 'refresh-token',
              'url': 'http://' + os.environ.get('HUB_SERVICE_HOST', 'hub') + ':' + os.environ.get('HUB_SERVICE_PORT_REFRESH_TOKEN', '8082'),
              'display': False,
              'oauth_no_confirm': True,
              'api_token': os.environ['JUPYTERHUB_API_KEY'],
              'command': [sys.executable, '/usr/local/etc/jupyterhub/refresh-token.py']
          }
      ]
      c.JupyterHub.admin_users = {"refresh-token"}
      c.JupyterHub.api_tokens = {
          os.environ['JUPYTERHUB_API_KEY']: "refresh-token",
      }      

In theory it should be enough to have the load_roles config along with the services config, and the admin_users and api_tokens should not be needed, but for me it does not seem to work without them. YMMV. The executable refresh-token.py looks like this:

refresh-token.py
        """A token refresh service authenticating with the Hub.

        This service serves `/services/refresh-token/`,
        authenticated with the Hub,
        showing the user their own info.
        """
        import json
        import os
        import requests
        import socket
        from jupyterhub.services.auth import HubAuthenticated
        from jupyterhub.utils import url_path_join
        from tornado.httpserver import HTTPServer
        from tornado.ioloop import IOLoop
        from tornado.web import Application, HTTPError, RequestHandler, authenticated
        from urllib.parse import urlparse
        debug = os.environ.get('NEXTCLOUD_DEBUG_OAUTH', 'false').lower() in ['true', '1', 'yes']
        def my_debug(s):
          if debug:
            with open("/proc/1/fd/1", "a") as stdout:
              print(s, file=stdout)


        class RefreshHandler(HubAuthenticated, RequestHandler):
            def api_request(self, method, url, **kwargs):
                my_debug(f'{self.hub_auth}')
                url = url_path_join(self.hub_auth.api_url, url)
                allow_404 = kwargs.pop('allow_404', False)
                headers = kwargs.setdefault('headers', {})
                headers.setdefault('Authorization', f'token {self.hub_auth.api_token}')
                try:
                    r = requests.request(method, url, **kwargs)
                except requests.ConnectionError as e:
                    my_debug(f'Error connecting to {url}: {e}')
                    msg = f'Failed to connect to Hub API at {url}.'
                    msg += f'  Is the Hub accessible at this URL (from host: {socket.gethostname()})?'

                    if '127.0.0.1' in url:
                        msg += '  Make sure to set c.JupyterHub.hub_ip to an IP accessible to' + \
                               ' single-user servers if the servers are not on the same host as the Hub.'
                    raise HTTPError(500, msg)

                data = None
                if r.status_code == 404 and allow_404:
                    pass
                elif r.status_code == 403:
                    my_debug(
                        'Lacking permission to check authorization with JupyterHub,' + 
                        f' my auth token may have expired: [{r.status_code}] {r.reason}'
                    )
                    my_debug(r.text)
                    raise HTTPError(
                        500,
                        'Permission failure checking authorization, I may need a new token'
                    )
                elif r.status_code >= 500:
                    my_debug(f'Upstream failure verifying auth token: [{r.status_code}] {r.reason}')
                    my_debug(r.text)
                    raise HTTPError(
                        502, 'Failed to check authorization (upstream problem)')
                elif r.status_code >= 400:
                    my_debug(f'Failed to check authorization: [{r.status_code}] {r.reason}')
                    my_debug(r.text)
                    raise HTTPError(500, 'Failed to check authorization')
                else:
                    data = r.json()
                return data

            @authenticated
            def get(self):
                user_model = self.get_current_user()
                # Fetch current auth state
                user_data = self.api_request('GET', url_path_join('users', user_model['name']))
                auth_state = user_data['auth_state']
                access_token = auth_state['access_token']
                token_expires = auth_state['token_expires']

                self.set_header('content-type', 'application/json')
                self.write(json.dumps({'access_token': access_token, 'token_expires': token_expires}, indent=1, sort_keys=True))

        class PingHandler(RequestHandler):

            def get(self):
                my_debug(f"DEBUG: In ping get")
                self.set_header('content-type', 'application/json')
                self.write(json.dumps({'ping': 1}))


        def main():
            app = Application([
                (os.environ['JUPYTERHUB_SERVICE_PREFIX'] + 'tokens', RefreshHandler),
                (os.environ['JUPYTERHUB_SERVICE_PREFIX'] + '/?', PingHandler),
            ])

            http_server = HTTPServer(app)
            url = urlparse(os.environ['JUPYTERHUB_SERVICE_URL'])

            http_server.listen(url.port)

            IOLoop.current().start()

        if __name__ == '__main__':
            main()        

This creates a small custom API from which the user's sync client can always fetch an up-to-date access token, while the refresh token stays with JupyterHub. I don't know if this is strictly necessary, but it feels better not to expose a long-lived token that could, in theory, be used to cause problems for the user if it were leaked, whereas the access token is only valid for an hour at a time.
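For illustration, a call to the service from inside a user pod looks roughly like this (the response fields come from refresh-token.py above; the token and timestamp are of course made up):

curl -H "Authorization: token ${JUPYTERHUB_API_TOKEN}" \
  "https://${JUPYTER_HOST}/services/refresh-token/tokens"
{
 "access_token": "<an opaque oauth2 access token>",
 "token_expires": 1704067200.0
}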

Displaying JupyterHub inside Nextcloud

I wrote a simple app for Nextcloud that displays the JupyterHub interface to the user inside an iframe. While this is simple as far as integrations go (because Jupyter can already use Nextcloud for authentication), it really makes the two feel well integrated, and it allows for future improvements to the app, like adding context menus in the Nextcloud files app to open notebooks directly in Jupyter (not implemented yet!).

Check the link above for how to install the app in your Nextcloud instance; I will be making it available in the Nextcloud app store shortly. Here is a video of how it all looks.
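Until the app is in the app store, installing it by hand follows the usual pattern for Nextcloud apps; a sketch, assuming the app id is jupyter and that Nextcloud lives in /var/www/nextcloud (check the app's own README for the authoritative steps):

# cd /var/www/nextcloud
# git clone <the app repository linked above> apps/jupyter
# sudo -u www-data php occ app:enable jupyter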

jupyter.mp4

Putting it all together

We can deploy everything from inside the Helm values.yaml, provided you have patched the charts as described above. Ok, here it is in its full glory:

debug:
  enabled: true
hub:
  config:
    Authenticator:
      auto_login: true
      enable_auth_state: true
    JupyterHub:
      tornado_settings:
        headers: { 'Content-Security-Policy': "frame-ancestors *;" }
  db:
    pvc:
      storageClassName: <a suitable pvc storage class here>
  extraConfig:
    oauthCode: |
      import os
      import time
      import requests
      from datetime import datetime
      from oauthenticator.generic import GenericOAuthenticator
      token_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/index.php/apps/oauth2/api/v1/token'
      debug = os.environ.get('NEXTCLOUD_DEBUG_OAUTH', 'false').lower() in ['true', '1', 'yes']

      def get_nextcloud_access_token(refresh_token):
        client_id = os.environ['NEXTCLOUD_CLIENT_ID']
        client_secret = os.environ['NEXTCLOUD_CLIENT_SECRET']

        code = refresh_token
        data = {
          'grant_type': 'refresh_token',
          'code': code,
          'refresh_token': refresh_token,
          'client_id': client_id,
          'client_secret': client_secret
        }
        response = requests.post(token_url, data=data)
        if debug:
          print(response.text)
        return response.json()

      def post_auth_hook(authenticator, handler, authentication):
        user = authentication['auth_state']['oauth_user']['ocs']['data']['id']
        auth_state = authentication['auth_state']
        auth_state['token_expires'] =  time.time() + auth_state['token_response']['expires_in']
        authentication['auth_state'] = auth_state
        return authentication

      class NextcloudOAuthenticator(GenericOAuthenticator):
        def __init__(self, *args, **kwargs):
          super().__init__(*args, **kwargs)
          self.user_dict = {}

        async def pre_spawn_start(self, user, spawner):
          super().pre_spawn_start(user, spawner)
          auth_state = await user.get_auth_state()
          if not auth_state:
            return
          access_token = auth_state['access_token']
          spawner.environment['NEXTCLOUD_ACCESS_TOKEN'] = access_token

        async def refresh_user(self, user, handler=None):
          auth_state = await user.get_auth_state()
          if not auth_state:
            if debug:
              print(f'auth_state missing for {user}')
            return False
          access_token = auth_state['access_token']
          refresh_token = auth_state['refresh_token']
          token_response = auth_state['token_response']
          now = time.time()
          now_hr = datetime.fromtimestamp(now)
          expires = auth_state['token_expires']
          expires_hr = datetime.fromtimestamp(expires)
          if debug:
            print(f'auth_state for {user}: {auth_state}')
          if now >= expires:
            if debug:
              print(f'Time is: {now_hr}, token expired: {expires_hr}')
              print(f'Refreshing token for {user}')
            try:
              token_response = get_nextcloud_access_token(refresh_token)
              auth_state['access_token'] = token_response['access_token']
              auth_state['refresh_token'] = token_response['refresh_token']
              auth_state['token_expires'] = now + token_response['expires_in']
              auth_state['token_response'] = token_response
              if debug:
                print(f'Successfully refreshed token for {user.name}')
                print(f'auth_state for {user.name}: {auth_state}')
              return {'name': user.name, 'auth_state': auth_state}
            except Exception as e:
              if debug:
                print(f'Failed to refresh token for {user}: {e}')
              return False
          if debug:
            print(f'Time is: {now_hr}, token expires: {expires_hr}')
          return True

      c.JupyterHub.authenticator_class = NextcloudOAuthenticator
      c.NextcloudOAuthenticator.client_id = os.environ['NEXTCLOUD_CLIENT_ID']
      c.NextcloudOAuthenticator.client_secret = os.environ['NEXTCLOUD_CLIENT_SECRET']
      c.NextcloudOAuthenticator.login_service = 'Sunet Drive'
      c.NextcloudOAuthenticator.username_claim = lambda r: r.get('ocs', {}).get('data', {}).get('id')
      c.NextcloudOAuthenticator.userdata_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/ocs/v2.php/cloud/user?format=json'
      c.NextcloudOAuthenticator.authorize_url = 'https://' + os.environ['NEXTCLOUD_HOST'] + '/index.php/apps/oauth2/authorize'
      c.NextcloudOAuthenticator.token_url = token_url
      c.NextcloudOAuthenticator.oauth_callback_url = 'https://' + os.environ['JUPYTER_HOST'] + '/hub/oauth_callback'
      c.NextcloudOAuthenticator.allow_all = True
      c.NextcloudOAuthenticator.refresh_pre_spawn = True
      c.NextcloudOAuthenticator.enable_auth_state = True
      c.NextcloudOAuthenticator.auth_refresh_age = 3600
      c.NextcloudOAuthenticator.post_auth_hook = post_auth_hook

    serviceCode: |
      import os
      import sys
      c.JupyterHub.load_roles = [
          {
              "name": "refresh-token",
              "services": [
                "refresh-token"
              ],
              "scopes": [
                "read:users",
                "admin:auth_state"
              ]
          },
          {
              "name": "user",
              "scopes": [
                "access:services!service=refresh-token",
                "read:services!service=refresh-token",
                "self",
              ],
          },
          {
              "name": "server",
              "scopes": [
                "access:services!service=refresh-token",
                "read:services!service=refresh-token",
                "inherit",
              ],
          }
      ]
      c.JupyterHub.services = [
          {
              'name': 'refresh-token',
              'url': 'http://' + os.environ.get('HUB_SERVICE_HOST', 'hub') + ':' + os.environ.get('HUB_SERVICE_PORT_REFRESH_TOKEN', '8082'),
              'display': False,
              'oauth_no_confirm': True,
              'api_token': os.environ['JUPYTERHUB_API_KEY'],
              'command': [sys.executable, '/usr/local/etc/jupyterhub/refresh-token.py']
          }
      ]
      c.JupyterHub.admin_users = {"refresh-token"}
      c.JupyterHub.api_tokens = {
          os.environ['JUPYTERHUB_API_KEY']: "refresh-token",
      }
  extraFiles:
    refresh-token.py: 
      mountPath: /usr/local/etc/jupyterhub/refresh-token.py
      stringData: |
        """A token refresh service authenticating with the Hub.

        This service serves `/services/refresh-token/`,
        authenticated with the Hub,
        showing the user their own info.
        """
        import json
        import os
        import requests
        import socket
        from jupyterhub.services.auth import HubAuthenticated
        from jupyterhub.utils import url_path_join
        from tornado.httpserver import HTTPServer
        from tornado.ioloop import IOLoop
        from tornado.web import Application, HTTPError, RequestHandler, authenticated
        from urllib.parse import urlparse
        debug = os.environ.get('NEXTCLOUD_DEBUG_OAUTH', 'false').lower() in ['true', '1', 'yes']
        def my_debug(s):
          if debug:
            with open("/proc/1/fd/1", "a") as stdout:
              print(s, file=stdout)


        class RefreshHandler(HubAuthenticated, RequestHandler):
            def api_request(self, method, url, **kwargs):
                my_debug(f'{self.hub_auth}')
                url = url_path_join(self.hub_auth.api_url, url)
                allow_404 = kwargs.pop('allow_404', False)
                headers = kwargs.setdefault('headers', {})
                headers.setdefault('Authorization', f'token {self.hub_auth.api_token}')
                try:
                    r = requests.request(method, url, **kwargs)
                except requests.ConnectionError as e:
                    my_debug(f'Error connecting to {url}: {e}')
                    msg = f'Failed to connect to Hub API at {url}.'
                    msg += f'  Is the Hub accessible at this URL (from host: {socket.gethostname()})?'

                    if '127.0.0.1' in url:
                        msg += '  Make sure to set c.JupyterHub.hub_ip to an IP accessible to' + \
                               ' single-user servers if the servers are not on the same host as the Hub.'
                    raise HTTPError(500, msg)

                data = None
                if r.status_code == 404 and allow_404:
                    pass
                elif r.status_code == 403:
                    my_debug(
                        'Lacking permission to check authorization with JupyterHub,' + 
                        f' my auth token may have expired: [{r.status_code}] {r.reason}'
                    )
                    my_debug(r.text)
                    raise HTTPError(
                        500,
                        'Permission failure checking authorization, I may need a new token'
                    )
                elif r.status_code >= 500:
                    my_debug(f'Upstream failure verifying auth token: [{r.status_code}] {r.reason}')
                    my_debug(r.text)
                    raise HTTPError(
                        502, 'Failed to check authorization (upstream problem)')
                elif r.status_code >= 400:
                    my_debug(f'Failed to check authorization: [{r.status_code}] {r.reason}')
                    my_debug(r.text)
                    raise HTTPError(500, 'Failed to check authorization')
                else:
                    data = r.json()
                return data

            @authenticated
            def get(self):
                user_model = self.get_current_user()
                # Fetch current auth state
                user_data = self.api_request('GET', url_path_join('users', user_model['name']))
                auth_state = user_data['auth_state']
                access_token = auth_state['access_token']
                token_expires = auth_state['token_expires']

                self.set_header('content-type', 'application/json')
                self.write(json.dumps({'access_token': access_token, 'token_expires': token_expires}, indent=1, sort_keys=True))

        class PingHandler(RequestHandler):

            def get(self):
                my_debug(f"DEBUG: In ping get")
                self.set_header('content-type', 'application/json')
                self.write(json.dumps({'ping': 1}))


        def main():
            app = Application([
                (os.environ['JUPYTERHUB_SERVICE_PREFIX'] + 'tokens', RefreshHandler),
                (os.environ['JUPYTERHUB_SERVICE_PREFIX'] + '/?', PingHandler),
            ])

            http_server = HTTPServer(app)
            url = urlparse(os.environ['JUPYTERHUB_SERVICE_URL'])

            http_server.listen(url.port)

            IOLoop.current().start()

        if __name__ == '__main__':
            main()
  networkPolicy:
    ingress:
      - ports:
          - port: 8082
        from:
          - podSelector:
              matchLabels:
                hub.jupyter.org/network-access-hub: "true"
  service:
    extraPorts:
      - port: 8082
        targetPort: 8082
        name: refresh-token
  extraEnv:
    NEXTCLOUD_DEBUG_OAUTH: "no"
    NEXTCLOUD_HOST: <public dns name of your Nextcloud instance here>
    JUPYTER_HOST: <public dns name of your JupyterHub instance here>
    JUPYTERHUB_API_KEY:
      valueFrom:
        secretKeyRef:
          name: jupyterhub-secrets
          key: api-key
    JUPYTERHUB_CRYPT_KEY:
      valueFrom:
        secretKeyRef:
          name: jupyterhub-secrets
          key: crypt-key
    NEXTCLOUD_CLIENT_ID:
      valueFrom:
        secretKeyRef:
          name: nextcloud-oauth-secrets
          key: client-id
    NEXTCLOUD_CLIENT_SECRET:
      valueFrom:
        secretKeyRef:
          name: nextcloud-oauth-secrets
          key: client-secret
proxy:
  chp:
    networkPolicy:
      egress:
        - to:
            - podSelector:
                matchLabels:
                  app: jupyterhub
                  component: hub
          ports:
            - port: 8082
singleuser:
  image:
    name: docker.sunet.se/drive/jupyter-custom
    tag: lab-4.0.10-sunet2
  storage:
    dynamic:
      storageClass: <again, a suitable pvc storage class here>
  extraEnv:
    JUPYTER_ENABLE_LAB: "yes"
  extraFiles:
    jupyter_notebook_config:
      mountPath: /home/jovyan/.jupyter/jupyter_server_config.py
      stringData: |
        import os
        c = get_config()
        # Lab 4 images run Jupyter Server, which reads ServerApp settings;
        # the NotebookApp block is kept for compatibility with older images
        c.ServerApp.allow_origin = '*'
        c.ServerApp.tornado_settings = {
            'headers': { 'Content-Security-Policy': "frame-ancestors *;" }
        }
        c.NotebookApp.allow_origin = '*'
        c.NotebookApp.tornado_settings = {
            'headers': { 'Content-Security-Policy': "frame-ancestors *;" }
        }
        os.system('/usr/local/bin/nc-sync')
      mode: 0644

Once you adjust the few storageClass parameters and set the environment variables to match your installation, you should be good to go. One final thing you need to do is create the following secrets:


  • name: jupyterhub-secrets, key: api-key
  • name: jupyterhub-secrets, key: crypt-key
  • name: nextcloud-oauth-secrets, key: client-id
  • name: nextcloud-oauth-secrets, key: client-secret


The client id and client secret come from Nextcloud: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/oauth2.html while api-key and crypt-key can both be generated with this command:

openssl rand -hex 32
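Putting that together, creating the secrets with kubectl could look like this; the namespace is my assumption, and the client id and secret are the values you got from Nextcloud:

kubectl -n jupyter create secret generic jupyterhub-secrets \
  --from-literal=api-key=$(openssl rand -hex 32) \
  --from-literal=crypt-key=$(openssl rand -hex 32)
kubectl -n jupyter create secret generic nextcloud-oauth-secrets \
  --from-literal=client-id=<client id from Nextcloud> \
  --from-literal=client-secret=<client secret from Nextcloud>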



Nextcloud has a fantastic mail app that can be used as a web interface for most e-mail providers. And of course, if you are running Nextcloud, you might want to host your own IMAP and SMTP servers as well. In this blog post I will describe how you can set up and configure Dovecot and Postfix as redundant IMAP and SMTP servers, and how you can integrate them with Nextcloud's app-password database, so you get all user management for free and your users get self-service password management.


Architectural overview

To get this working you will need a minimum of two servers where you can run Docker containers. You will also need the ability to configure two (sub)domains for use with your e-mail services. In this example we will set up imap.example.com and smtp.example.com on four servers, which means you should replace any instance of 'example.com' below with your own domain:

  1. imap1.example.com
  2. imap2.example.com
  3. smtp1.example.com
  4. smtp2.example.com

If you are so inclined, you could of course consolidate this to only two servers, e.g. by running one IMAP and one SMTP container on each host.

We will use docker-compose to manage the containers.

You will also need some kind of load balancing that can balance TCP ports, to get high availability for this setup. One option is round robin DNS; you could even have one of the servers be a hot standby that you switch to by updating the DNS record (make sure you have a short TTL set on your DNS records in that case). I will not go into much detail on how to set up load balancing, but you should be able to use your favourite solution for this.
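As an example of the round robin DNS approach, the zone would simply contain one record per server with a short TTL (the addresses below are documentation placeholders):

imap.example.com. 60 IN A 192.0.2.10
imap.example.com. 60 IN A 192.0.2.11
smtp.example.com. 60 IN A 192.0.2.20
smtp.example.com. 60 IN A 192.0.2.21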

The same is assumed for certificates: you need a certificate called /opt/dovecot/certs/imap.example.com/fullchain.pem and a key called /opt/dovecot/certs/imap.example.com/privkey.pem on the imap servers, and similarly a certificate called /opt/postfix/certs/smtp.example.com/fullchain.pem and a key called /opt/postfix/certs/smtp.example.com/privkey.pem on the smtp servers. I will not cover how to obtain them and distribute them to your servers here, but I encourage you to look at certbot if you don't know where to start.
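If you go the certbot route, obtaining and placing a certificate on an imap server could look something like this (assuming port 80 on the server is free for certbot's standalone authenticator):

# certbot certonly --standalone -d imap.example.com
# mkdir -p /opt/dovecot/certs/imap.example.com
# cp /etc/letsencrypt/live/imap.example.com/fullchain.pem /opt/dovecot/certs/imap.example.com/
# cp /etc/letsencrypt/live/imap.example.com/privkey.pem /opt/dovecot/certs/imap.example.com/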

Companion docker containers

SUNET builds two container images that can be used with this guide. They can be pulled from docker.sunet.se/mail/dovecot and docker.sunet.se/mail/postfix; the exact tags are referenced in the docker-compose files below.

Common configuration

This guide assumes you are running Debian 12 on your machines, but you should be able to adjust the commands to any modern Linux distro. At SUNET we use a rather fancy Puppet setup to manage our servers, but below I will describe installation using plain apt command line syntax. In general you will need to be root for these commands, so either prefix them with sudo or make sure you are running as the root user. The first thing we need is docker and docker-compose:

# apt update && apt install docker.io docker-compose

This needs to be done on all servers.

Dovecot

This needs to be done on imap1.example.com and imap2.example.com.


Next we will set up Dovecot. Let's first create a directory structure on our host:

# mkdir -p /opt/dovecot/{certs,config,mail,ssmtp}

Next create the file /opt/dovecot/docker-compose.yml with the following content:

/opt/dovecot/docker-compose.yml
version: "3.7"
services:
  dovecot:
    image: docker.sunet.se/mail/dovecot:SUNET-1
    volumes:
      - /opt/dovecot/ssmtp/ssmtp.conf:/etc/ssmtp/ssmtp.conf
      - /opt/dovecot/config:/etc/dovecot/
      - /opt/dovecot/mail:/var/mail/
      - /opt/dovecot/certs/:/certs
    command:
      - /usr/sbin/dovecot
      - -F
    ports:
      - "24:24"
      - "143:143"
      - "993:993"
      - "4190:4190"
      - "12345:12345"
      - "12346:12346"
    restart: always
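Once the configuration files below are in place, you can start the container:

# cd /opt/dovecot && docker-compose up -d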


Next you will need some dovecot configuration:

/opt/dovecot/config/dovecot.conf
mail_home=/srv/mail/%Lu
mail_location=maildir:/var/mail/vhosts/%d/%n/
mail_privileged_group = mail
log_path=/dev/stdout
first_valid_uid=8
postmaster_address = postmaster@example.com
sendmail_path = /usr/sbin/ssmtp

namespace inbox {
  inbox = yes
  separator = /
  mailbox Drafts {
    special_use = \Drafts
    auto = subscribe
  }
  mailbox Junk {
    special_use = \Junk
    auto = subscribe
  }
  mailbox Trash {
    special_use = \Trash
    auto = subscribe
  }

  mailbox Sent {
    special_use = \Sent
    auto = subscribe
  }
  mailbox "Sent Messages" {
    special_use = \Sent
    auto = subscribe
  }
}

protocols = imap lmtp sieve

service lmtp {
   inet_listener lmtp {
      address = 0.0.0.0 ::
      port = 24
   }
}

mail_plugins = $mail_plugins notify replication
protocol lmtp {
  mail_plugins = $mail_plugins sieve
}
protocol sieve {
  mail_debug = yes
  managesieve_max_line_length = 65536
}

service managesieve-login {
  inet_listener sieve {
    port = 4190
    ssl = yes
  }  
  service_count = 1
}
service managesieve {
  process_limit = 256
}

plugin {
   sieve = ~/.dovecot.sieve
   sieve_global_path = /var/lib/dovecot/sieve/default.sieve
   sieve_dir = ~/sieve
   sieve_global_dir = /var/lib/dovecot/sieve/
   sieve_extensions = +vacation-seconds
   sieve_global_extensions = +vnd.dovecot.pipe
   sieve_pipe_bin_dir = /etc/dovecot/sieve
   sieve_plugins = sieve_imapsieve sieve_extprograms
   sieve_vacation_default_period = 7d
   sieve_vacation_max_period = 30d
   sieve_vacation_min_period = 1d
}


auth_mechanisms = plain login
auth_username_format = %n
passdb {
  args = password=<a secret you need for nextcloud goes here> allow_nets=<a comma separated list of your nextcloud servers goes here>
  driver = static
}

passdb {
  driver = lua
  args = file=/etc/dovecot/nextcloud-auth.lua
}

userdb {
  driver = sql
  args = /etc/dovecot/dovecot-sql.conf
}

service auth {
 inet_listener {
   port = 12346
 }
}

auth_debug = yes
auth_verbose = yes

ssl=yes
ssl_cert=</certs/imap.example.com/fullchain.pem
ssl_key=</certs/imap.example.com/privkey.pem

doveadm_password = <a shared secret for replicating between the two servers goes here>
service replicator {
  process_min_avail = 1
}
service aggregator {
  fifo_listener replication-notify-fifo {
    user = mail
  }
  unix_listener replication-notify {
    user = mail
  }
}
service doveadm {
  inet_listener {
    port = 12345
  }
}
plugin {
  mail_replica = tcp:<host name of the partner server goes here>:12345
}


/opt/dovecot/config/dovecot-sql.conf
driver = mysql
connect = host=<database hostname goes here> dbname=<nextcloud database name goes here> user=<nextcloud db user> password=<nextcloud db password>
user_query = SELECT '%n' as username, 'mail' as uid, 'mail' as gid, '/var/mail/vhosts/example.com/%n' as home, 'maildir:/var/mail/vhosts/example.com/%n/' as mail;
iterate_query = SELECT DISTINCT REPLACE(value, '@example.com', '') AS username, 'example.com' AS domain FROM oc_accounts_data WHERE name = 'email' AND value LIKE '%%example.com';

The Lua script below does the actual validation of app passwords, but instead of implementing sha512 in Lua, we just call out to PHP and do exactly what Nextcloud does when validating app passwords.

This is the one place where you cannot have multiple database servers configured, so if you want high availability for this part, you will need to look into ProxySQL. You could, for instance, run ProxySQL locally in Docker on the imap server, and then connect to that container from the Lua script to gain high availability.

/opt/dovecot/config/nextcloud-auth.lua
function auth_passdb_lookup(req)
  -- Get the hash out using php
  -- NOTE: req.password is interpolated into a shell command here (and
  -- req.user into an SQL query below); app passwords generated by Nextcloud
  -- are plain alphanumeric strings, but be careful if you ever feed other
  -- credentials through this code path
  local salt = "<the salt from config.php>"
  local command = "php -r " .. "'print(hash(" .. '"sha512","' .. req.password .. salt .. '"' .. "));'"
  local handle = assert(io.popen(command))
  local hash = handle:read("*a")
  handle:close()

  -- Get the stored app passwords from Nextcloud
  local db        = '<nextcloud database name goes here>'
  local user      = '<nextcloud db user>'
  local password  = '<nextcloud db password>'
  local db_server = '<database hostname goes here>'
  local mysql     = require "luasql.mysql"
  -- Adjust the query below so that you can find your users if they don't have usernames like user@example.com
  local query     = "SELECT token FROM oc_authtoken WHERE uid = '" .. req.user .. "@example.com'"
  local env       = assert(mysql.mysql())
  local conn      = assert(env:connect(db, user, password, db_server))
  local cur       = assert(conn:execute(query))
  local row       = cur:fetch({}, "a")
  while row do
    local token = row.token
    if token == hash then
      return dovecot.auth.PASSDB_RESULT_OK, "password=" .. req.password
    end
    row = cur:fetch(row, "a")
  end
  return dovecot.auth.PASSDB_RESULT_USER_UNKNOWN, "no such user"
end
/opt/dovecot/ssmtp/ssmtp.conf
root=postmaster
mailhub=smtp.example.com:25
rewriteDomain=example.com
hostname=<host name of current imap server>


The configuration in <brackets> needs to be replaced with actual values:


  • <a secret you need for nextcloud goes here> This is the master password you will configure in Nextcloud; generate a strong and complicated password for this
  • <a comma separated list of your nextcloud servers goes here> Any IP address listed here can access any email using the master password above
  • <a shared secret for replicating between the two servers goes here> This is the authentication token used for replication between the two imap servers; generate a strong and complicated password for this
  • <database hostname goes here> The host name of the Nextcloud database server; the host= stanza can be repeated in the dovecot-sql.conf file (but not in the Lua script) if you have multiple database hosts in a cluster
  • <nextcloud database name goes here> The name of the Nextcloud database, usually nextcloud
  • <nextcloud db user> The name of your Nextcloud database user
  • <nextcloud db password> The password of your Nextcloud database user
  • <host name of the partner server goes here> The other server's host name, i.e. imap2.example.com or imap1.example.com
  • <the salt from config.php> This is a secret generated by Nextcloud and stored in config.php that you need to share with your imap servers
  • <host name of current imap server> This server's host name, i.e. imap1.example.com or imap2.example.com
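With the values filled in and the container running, you can sanity check the whole authentication chain with doveadm from inside the container; the service name comes from the docker-compose.yml above, and the user and app password are of course examples:

# cd /opt/dovecot
# docker-compose exec dovecot doveadm auth test <a nextcloud username> <an app password for that user>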

Postfix

This needs to be done on smtp1.example.com and smtp2.example.com.


First create the required directory structure:

# mkdir -p /opt/postfix/{certs,config}


Now create /opt/postfix/docker-compose.yml

/opt/postfix/docker-compose.yml
version: "3.7"
services:
  postfix:
    image: docker.sunet.se/mail/postfix:SUNET-1
    volumes:
      - /opt/postfix/config:/config
      - /opt/postfix/certs/:/certs
    command:
      - /start.sh
    ports:
      - "25:25"
      - "587:587"
    restart: always


Then create the /opt/postfix/config/main.cf file:

maillog_file = /dev/stdout
smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
biff = no
append_dot_mydomain = no
readme_directory = no
compatibility_level = 3.6
smtpd_tls_cert_file=/certs/smtp.example.com/fullchain.pem
smtpd_tls_key_file=/certs/smtp.example.com/privkey.pem
smtpd_tls_security_level=may
smtp_tls_CApath=/etc/ssl/certs
smtp_tls_security_level=may
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
smtpd_client_restrictions = permit_mynetworks
myhostname = <current smtp hostname goes here>
alias_maps = hash:/etc/aliases
alias_database = hash:/etc/aliases
virtual_mailbox_domains = mysql:/config/mysql-virtual-mailbox-domains.cf
virtual_mailbox_maps = mysql:/config/mysql-virtual-mailbox-maps.cf
mydestination = $myhostname, localhost.localdomain, localhost
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 <ip address of imap servers goes here>
mailbox_size_limit = 0
recipient_delimiter = +
inet_interfaces = all
inet_protocols = all
relayhost = <a comma separated list of relay servers>
smtpd_sasl_type = dovecot
smtpd_sasl_path = inet:imap.example.com:12346
smtpd_sasl_auth_enable = yes
virtual_transport = lmtp:imap.example.com:24

Add the /opt/postfix/config/master.cf file

/opt/postfix/config/master.cf
smtp      inet  n       -       n       -       -       smtpd
submission inet n       -       n       -       -       smtpd
  -o syslog_name=postfix/submission
  -o smtpd_tls_security_level=encrypt
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_tls_auth_only=yes
  -o smtpd_reject_unlisted_recipient=no
pickup    unix  n       -       n       60      1       pickup
cleanup   unix  n       -       n       -       0       cleanup
qmgr      unix  n       -       n       300     1       qmgr
tlsmgr    unix  -       -       n       1000?   1       tlsmgr
rewrite   unix  -       -       n       -       -       trivial-rewrite
bounce    unix  -       -       n       -       0       bounce
defer     unix  -       -       n       -       0       bounce
trace     unix  -       -       n       -       0       bounce
verify    unix  -       -       n       -       1       verify
flush     unix  n       -       n       1000?   0       flush
proxymap  unix  -       -       n       -       -       proxymap
proxywrite unix -       -       n       -       1       proxymap
smtp      unix  -       -       n       -       -       smtp
relay     unix  -       -       n       -       -       smtp
        -o syslog_name=postfix/$service_name
showq     unix  n       -       n       -       -       showq
error     unix  -       -       n       -       -       error
retry     unix  -       -       n       -       -       error
discard   unix  -       -       n       -       -       discard
local     unix  -       n       n       -       -       local
virtual   unix  -       n       n       -       -       virtual
lmtp      unix  -       -       n       -       -       lmtp
anvil     unix  -       -       n       -       1       anvil
scache    unix  -       -       n       -       1       scache
postlog   unix-dgram n  -       n       -       1       postlogd
maildrop  unix  -       n       n       -       -       pipe
  flags=DRXhu user=vmail argv=/usr/bin/maildrop -d ${recipient}
uucp      unix  -       n       n       -       -       pipe
  flags=Fqhu user=uucp argv=uux -r -n -z -a$sender - $nexthop!rmail ($recipient)
ifmail    unix  -       n       n       -       -       pipe
  flags=F user=ftn argv=/usr/lib/ifmail/ifmail -r $nexthop ($recipient)
bsmtp     unix  -       n       n       -       -       pipe
  flags=Fq. user=bsmtp argv=/usr/lib/bsmtp/bsmtp -t$nexthop -f$sender $recipient
scalemail-backend unix -       n       n       -       2       pipe
  flags=R user=scalemail argv=/usr/lib/scalemail/bin/scalemail-store ${nexthop} ${user} ${extension}
mailman   unix  -       n       n       -       -       pipe
  flags=FRX user=list argv=/usr/lib/mailman/bin/postfix-to-mailman.py ${nexthop} ${user}

Add the /opt/postfix/config/mysql-virtual-mailbox-domains.cf file

/opt/postfix/config/mysql-virtual-mailbox-domains.cf
user = <nextcloud db user>
password = <nextcloud db password>
hosts = <nextcloud db hosts>
dbname = <nextcloud db name> 
query = SELECT 1 FROM DUAL WHERE 'example.com' = '%s'
/opt/postfix/config/mysql-virtual-mailbox-maps.cf
user = <nextcloud db user>
password = <nextcloud db password>
hosts = <nextcloud db hosts>
dbname = <nextcloud db name>  
query = SELECT DISTINCT 1 FROM oc_accounts_data WHERE value = '%s' AND name = 'email'


The configuration in <brackets> needs to be replaced with actual values:

  • <ip address of imap servers goes here> Add the imap servers' IP addresses, space separated, to allow them to send vacation responders and such
  • <a comma separated list of relay servers> If you are using any relay servers to send your emails, add them here
  • <nextcloud db user> User name of the Nextcloud db user
  • <nextcloud db password> Password of the Nextcloud db user
  • <nextcloud db hosts> A space separated list of database servers for Nextcloud (if you have more than one; otherwise just the one)
  • <nextcloud db name> The name of the Nextcloud database
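After starting the container, you can verify that STARTTLS works on the submission port, for example with openssl:

# cd /opt/postfix && docker-compose up -d
# openssl s_client -starttls smtp -connect smtp.example.com:587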


Using Nextcloud mail with SSO-accounts

It is currently not possible to use the Nextcloud mail app with accounts that log in using Single Sign-On. That is because Nextcloud expects the Nextcloud password to be used for authenticating against the email server (unless you have a Gmail or Outlook account with OAuth set up). However, that is about to change. I have written a PR for the Nextcloud mail app that allows you to set a "master password" for the provisioned accounts; when that PR is merged it will be possible to use the Nextcloud mail app with the configuration in this guide.

Configuring Nextcloud

Make sure you enable the mail app in Nextcloud. As an administrator you should then be able to create a "provisioning" in Admin settings → Groupware. Note that in this screenshot we are running the patched version that can use master passwords.