Some tips to deploy Django in kubernetes
Posted on 2021-03-29 in Trucs et astuces. Last modified on 2022-03-09.
I am not going to go into details in this article about how you can deploy Django in kubernetes. I am just going to highlight the main points you should pay attention to when deploying a Django app. I expect you to have prior knowledge about how to deploy an application in kubernetes using Helm. I hope you will still find useful pieces of information in this article.
Deploying the application
- Always disable the DEBUG mode with DEBUG=False in your settings. That's the case for all Django deployments, no matter how you do it.
- Don't use the Django dev server to launch your application (that's the python manage.py runserver command); rely on gunicorn or something equivalent instead (like you normally would). See the container sketch after this list.
- Rely on environment variables to inject configuration into your settings files. You can use django-environ to help you read, validate and parse them.
- Store secrets in kubernetes Secrets. That includes: the SECRET_KEY configuration value, your database connection details, API keys… A sample Secret is sketched after this list.
- Store everything else into a ConfigMap managed by Helm.
- Configure a livenessProbe to detect issues with your application and allow kubernetes to restart the pod if needed.
- You may want to add a nginx sidecar container to buffer some requests like file uploads. By default, when you deploy Django in kubernetes, requests hit gunicorn directly. For long file uploads, it means the gunicorn worker handling the request cannot do anything else until the upload is done. This can be a problem and may result in container restarts (because kubernetes cannot check the liveness probe) or request timeouts. A good way to avoid that is to put a nginx server in front of gunicorn, like you would do if you weren't on kubernetes. The sidecar pattern is a common way to do that. Just make sure your service routes traffic to nginx and not to gunicorn: normally, this is done by pointing the service's target port at the nginx port (80) instead of the gunicorn port.
- If you use async Django, you should already be fine without nginx. Sadly, at the time of writing, the ORM doesn't support async yet, which limits where you can apply this pattern, so you will probably still need nginx.
- You could also use gevent workers, but this involves monkey-patching the standard library, so I'm not a fan and don't advise it.
- You may be able to configure a nginx ingress at cluster level. However, after some tests, I didn't manage to configure it correctly, so I decided to use a nginx sidecar, which is a much easier pattern to deal with.
- Don't run gunicorn as root in the container, to limit the attack surface.
- Use an initContainer to run your migrations (see the sketch after this list).
- Give your containers resource quotas to avoid any of them using too many resources.
- Put the static files into a bucket or let nginx serve them. See this article.
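To make the gunicorn point concrete, here is a minimal sketch of how the Django container can be started. The project module name myproject, the worker count and the collectstatic step (which fills the shared static files volume) are illustrative assumptions, not taken from the chart shown below.

# Hypothetical excerpt of the Django container spec; "myproject" is a placeholder
# for your actual project module.
- name: backend-api
  image: "{{ .Values.container.image.repository }}:{{ .Values.container.image.tag }}"
  command: ["/bin/sh", "-c"]
  args:
    # Copy the static files to the volume shared with nginx, then start gunicorn.
    - >
      python manage.py collectstatic --noinput &&
      gunicorn myproject.wsgi:application
      --bind 0.0.0.0:{{ .Values.container.port }}
      --workers 3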
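For the secrets, a plain kubernetes Secret combined with the envFrom/secretRef blocks of the deployment shown below is enough: each key becomes an environment variable in the container. The name and keys here are illustrative assumptions.

apiVersion: v1
kind: Secret
metadata:
  # Must match the name used by secretRef in the deployment.
  name: backend-api
type: Opaque
stringData:
  # Each key ends up as an environment variable thanks to envFrom.
  SECRET_KEY: "change-me"
  DATABASE_URL: "postgres://user:password@db-host:5432/backend"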
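For the migrations, the initContainer can reuse the application image and simply run the migrate command before the main containers start. The args below are an assumption; the deployment sample further down doesn't set them and relies on the image's entrypoint instead.

# Hypothetical initContainer running the migrations with the application image.
initContainers:
  - name: migrations
    image: "{{ .Values.container.image.repository }}:{{ .Values.container.image.tag }}"
    args:
      - python
      - manage.py
      - migrate
      - --noinput
    envFrom:
      - secretRef:
          name: {{ .Chart.Name }}
          optional: true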
Configurations
To help you put this into practice, here are some configuration samples.
Nginx sidecar configuration
It's a very standard reverse proxy configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-api-nginx
data:
  api.conf: |
    upstream app_server {
      # All containers in the same pod are reachable with 127.0.0.1
      server 127.0.0.1:{{ .Values.container.port }} fail_timeout=0;
    }

    server {
      listen 80;
      root /var/www/api/;
      client_max_body_size 1G;

      # Log to the container's standard output and error.
      access_log /dev/stdout;
      error_log stderr;

      location / {
        location /static {
          add_header Access-Control-Allow-Origin *;
          add_header Access-Control-Max-Age 3600;
          add_header Access-Control-Expose-Headers Content-Length;
          add_header Access-Control-Allow-Headers Range;

          if ($request_method = OPTIONS) {
            return 204;
          }

          try_files /$uri @django;
        }

        # Dedicated route for nginx health to better understand where problems come from if needed.
        location /nghealth {
          return 200;
        }

        try_files $uri @django;
      }

      location @django {
        proxy_connect_timeout 30;
        proxy_send_timeout 30;
        proxy_read_timeout 30;
        send_timeout 30;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # We have another proxy in front of this one. It will capture traffic
        # as HTTPS, so we must not set X-Forwarded-Proto here since it's already
        # set with the proper value.
        # proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://app_server;
      }
    }
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "chart.fullname" . }}
  labels:
{{ include "chart.labels" . | indent 4 }}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "chart.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "chart.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.container.image.repository }}:{{ .Values.container.image.tag }}"
          imagePullPolicy: {{ .Values.container.image.pullPolicy }}
          securityContext:
            privileged: false
            runAsUser: 1001
            runAsGroup: 1001
            # Required to prevent escalations to root.
            allowPrivilegeEscalation: false
            runAsNonRoot: true
          envFrom:
            - configMapRef:
                name: {{ .Chart.Name }}
                optional: true
            - secretRef:
                name: {{ .Chart.Name }}
                optional: true
          ports:
            - name: http
              containerPort: {{ .Values.container.port }}
              protocol: TCP
          resources:
            limits:
              memory: {{ .Values.container.resources.limits.memory }}
              cpu: {{ .Values.container.resources.limits.cpu }}
            requests:
              memory: {{ .Values.container.resources.requests.memory }}
              cpu: {{ .Values.container.resources.requests.cpu }}
          {{ if .Values.container.probe.enabled -}}
          # As soon as this container is alive, it can serve traffic, so no need for a readinessProbe.
          # We still need to wait a bit before considering it alive: gunicorn must start its workers
          # and open connections to the database.
          livenessProbe:
            httpGet:
              path: {{ .Values.container.probe.path }}
              port: {{ .Values.container.port }}
            timeoutSeconds: {{ .Values.container.probe.livenessTimeOut }}
            initialDelaySeconds: {{ .Values.container.probe.initialDelaySeconds }}
          {{- end }}
          volumeMounts:
            - name: backend-credentials
              mountPath: /secrets/backend
              readOnly: true
            - name: staticfiles
              mountPath: /var/www/api/
              # The API must be able to copy the files to the volume.
              readOnly: false
        - name: nginx-sidecar
          image: nginx:stable
          imagePullPolicy: Always
          securityContext:
            privileged: false
            # Nginx must start as root to bind the proper port in the container.
            allowPrivilegeEscalation: true
            runAsNonRoot: false
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx/conf.d
              readOnly: true
            - name: staticfiles
              mountPath: /var/www/api/
              readOnly: true
          {{ if .Values.sidecar.nginx.probe.enabled -}}
          livenessProbe:
            httpGet:
              # When we can access this route, nginx is alive, but it is not ready (ie cannot serve
              # traffic yet).
              path: {{ .Values.sidecar.nginx.probe.path }}
              port: {{ .Values.service.port }}
            timeoutSeconds: {{ .Values.sidecar.nginx.probe.livenessTimeOut }}
          readinessProbe:
            httpGet:
              # The pod cannot be ready (that is, accepting traffic) until nginx can talk to the
              # Django container. So we pass through nginx (with the port) to the Django container
              # (with the path) to check this.
              # Since it can take a few seconds, we have an initialDelaySeconds.
              path: {{ .Values.container.probe.path }}
              port: {{ .Values.service.port }}
            initialDelaySeconds: {{ .Values.sidecar.nginx.probe.initialDelaySeconds }}
            timeoutSeconds: {{ .Values.sidecar.nginx.probe.livenessTimeOut }}
          {{- end }}
          resources:
            limits:
              memory: {{ .Values.container.resources.limits.memory }}
              cpu: {{ .Values.container.resources.limits.cpu }}
            requests:
              memory: {{ .Values.initContainer.resources.requests.memory }}
              cpu: {{ .Values.initContainer.resources.requests.cpu }}
      {{ if .Values.initContainer.enabled -}}
      initContainers:
        - name: {{ .Values.initContainer.name }}
          image: "{{ .Values.container.image.repository }}:{{ .Values.container.image.tag }}"
          imagePullPolicy: {{ .Values.container.image.pullPolicy }}
          envFrom:
            - configMapRef:
                name: {{ .Chart.Name }}
                optional: true
            - secretRef:
                name: {{ .Chart.Name }}
                optional: true
          resources:
            limits:
              memory: {{ .Values.initContainer.resources.limits.memory }}
              cpu: {{ .Values.initContainer.resources.limits.cpu }}
            requests:
              memory: {{ .Values.initContainer.resources.requests.memory }}
              cpu: {{ .Values.initContainer.resources.requests.cpu }}
      {{- end }}
      volumes:
        - name: nginx-conf
          configMap:
            name: backend-api-nginx
        - name: staticfiles
          emptyDir: {}
        - name: backend-credentials
          secret:
            secretName: {{ .Values.gcp.backend.credentials.secret }}
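For reference, here is a sketch of the values.yaml structure this template expects. The keys are taken straight from the template above; the concrete values are only illustrative assumptions to adapt to your setup.

# Hypothetical values.yaml matching the deployment template above.
container:
  image:
    repository: registry.example.com/backend-api
    tag: "1.0.0"
    pullPolicy: IfNotPresent
  port: 8000
  probe:
    enabled: true
    path: /health
    livenessTimeOut: 5
    initialDelaySeconds: 10
  resources:
    limits:
      memory: 512Mi
      cpu: 500m
    requests:
      memory: 256Mi
      cpu: 100m
sidecar:
  nginx:
    probe:
      enabled: true
      path: /nghealth
      livenessTimeOut: 5
      initialDelaySeconds: 10
initContainer:
  enabled: true
  name: migrations
  resources:
    limits:
      memory: 256Mi
      cpu: 250m
    requests:
      memory: 128Mi
      cpu: 100m
service:
  port: 80
gcp:
  backend:
    credentials:
      secret: backend-credentials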
Handling commands
You can run commands at regular intervals with a CronJob. To avoid creating one file per CronJob, you can loop over values as described here. In a nutshell, you can combine this cronjobs.yaml Helm template:
{{- range $job, $val := .Values.cronjobs }}
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: "{{ .name }}"
spec:
  schedule: "{{ .schedule }}"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: "{{ .name }}"
              image: "{{ $.Values.container.image.repository }}:{{ $.Values.container.image.tag }}"
              imagePullPolicy: "{{ $.Values.container.image.pullPolicy }}"
              args:
                - python
                - manage.py
                - "{{ .djangoCommand }}"
              envFrom:
                - configMapRef:
                    name: {{ $.Chart.Name }}
                    optional: true
                - secretRef:
                    name: {{ $.Chart.Name }}
                    optional: true
          restartPolicy: "{{ .restartPolicy }}"
---
{{- end }}
With this configuration:
# We currently assume we run the API Python/Django image for all jobs.
cronjobs:
  "0":
    name: backend-api-clearsessions
    # This must be in the standard Unix crontab format
    schedule: "0 23 * * *"
    djangoCommand: clearsessions
    restartPolicy: Never
  "1":
    name: backend-api-clean-pending-loan-applications
    schedule: "0 23 1 * *"
    djangoCommand: remove_stale_contenttypes
    restartPolicy: Never
This creates two CronJobs in kubernetes: one launching python manage.py clearsessions every day at 23:00 and one launching python manage.py remove_stale_contenttypes on the first day of each month at 23:00.
History
- 9th of March 2022: corrected static files volume mounts and copy of static files to volumes. Thanks to RomoSapiens for the catch.