Some tips for Django
Posted on 2018-05-21 in Trucs et astuces
View last executed query
Run this (it only works when DEBUG is True; the query will be shown as executed by the database, with values correctly substituted and escaped):
```python
from django.db import connection

print(connection.queries[-1])
```
Logging
Queries
You can configure your logger to log every query sent to the database with:
```python
LOGGING = {
    # ...
    'loggers': {
        # ...
        'django.db': {
            'handlers': ['console'],
            'level': 'DEBUG',
        },
    },
}
```
Source: Surviving Django (if you care about databases) under Another random bit of advice.
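Note that the snippet assumes a console handler already exists in your LOGGING dict. A minimal self-contained version could look like this (the handler class and options are just an example, adapt to your project):

```python
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # A plain stderr handler; replace with whatever your project uses.
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "django.db": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}

# Validate the dict the same way Django will at startup.
logging.config.dictConfig(LOGGING)
```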
Migrations
Checks
You can use this check to verify your migrations during CI (it will fail if there are dependency issues or if some migrations were never created):
```shell
python manage.py makemigrations
if [[ $(git status --porcelain | grep migrations | wc -l) -gt 0 ]]; then
    echo 'New migrations were created in the project. Please fix that.' >&2
    exit 1
fi
```
If you want to use it in a git hook, you should only consider untracked migrations:
```shell
python manage.py migrate
migs_exit_code=$?
if [[ "${migs_exit_code}" -ne 0 ]]; then
    exit 1
fi
if [[ $(git status --porcelain | grep migrations | grep .py | grep '??' | wc -l) -gt 0 ]]; then
    echo 'New migrations were created in the project. Please fix that.' >&2
    exit 1
fi
```
Or you can just use python manage.py makemigrations --dry-run --check, which I discovered recently.
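As a sketch of how that check could be wired into CI (the job definition below is a hypothetical GitLab CI example; the job name and image are mine, not from the original setup):

```yaml
# Hypothetical CI job: fails when model changes are not covered by a migration.
check-migrations:
  image: python:3
  script:
    - pip install -r requirements.txt
    - python manage.py makemigrations --dry-run --check
```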
Fake all pending migrations
```shell
#!/usr/bin/env bash
set -eu
set -o pipefail

# This will contain a list like this:
# wagtailusers
#  [ ] 0001_initial
#  [ ] 0002_add_verbose_name_on_userprofile
#  [ ] 0003_add_verbose_names
#  [ ] 0004_capitalizeverbose
#  [ ] 0005_make_related_name_wagtail_specific
#  [ ] 0006_userprofile_prefered_language
#  [ ] 0007_userprofile_current_time_zone
#  [ ] 0008_userprofile_avatar
#  [ ] 0009_userprofile_verbose_name_plural
pending_migrations="$(python manage.py showmigrations | grep --color '\[ \]\|^[a-z]' | grep --color '[ ]' -B 1)"

declare -A app_to_last_migration

# Make sure we loop over lines, not words.
IFS=$'\n'
# Array must not be quoted to correctly loop over each line.
# shellcheck disable=SC2068
for line in ${pending_migrations[@]}; do
    if [[ "$line" =~ ^[a-z]+ ]]; then
        app_name="${line}"
    elif [[ "$line" =~ \[ ]]; then
        # Capture the last migration of the app to pass it and all migrations before it.
        migration_name=$(echo "${line}" | awk '{print $3}')
        app_to_last_migration["${app_name}"]="${migration_name}"
    fi
done

for app_name in "${!app_to_last_migration[@]}"; do
    python manage.py migrate ${app_name} ${app_to_last_migration[${app_name}]} --fake
done
```
Model proxies
They are useful to change the behavior of a model or to work on a subset of a table (based on a type column, for instance). See the documentation for more details.
```python
from django.db import models


class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)


class MyPerson(Person):
    class Meta:
        proxy = True

    def do_something(self):
        # ...
        pass


class OrderedPerson(Person):
    class Meta:
        ordering = ["last_name"]
        proxy = True
```
Easily increment migration numbers
Here is a small bash script to increment the numbers of your migrations. Put it into your ~/.profile or ~/.bashrc. Use it like: inc-migrations PROJECT/apps/APP_NAME/migrations 0012
It will rename all migrations starting from 0012 (ie transform 0012_mig.py into 0013_mig.py and so on) and replace all occurrences of each previous name (eg 0012_mig.py) with the new one (eg 0013_mig.py).
```shell
function inc-migrations() {
    local folder="$1"
    local number="$2"
    local start_file
    local new_number
    local next_file

    if ! [[ "$(pwd)" =~ "${folder}$" ]]; then
        cd "${folder}"
    fi

    if [[ -f "${number}" ]]; then
        start_file="${number}"
        number=$(echo "${number}" | cut -d _ -f 1)
    else
        start_file=$(ls ${number}* 2> /dev/null)
    fi

    if [[ ! -f "${start_file}" ]]; then
        echo "${start_file} doesn't exist" >&2
        return 1
    fi

    let "new_number = ${number} + 1"
    new_number=$(printf "%04d\n" "${new_number}")
    next_file=$(ls ${new_number}* 2> /dev/null)
    new_file_name="${start_file/${number}/${new_number}}"

    git mv "${start_file}" "${new_file_name}"
    sed -i "s/${start_file/.py/}/${new_file_name/.py/}/g" *.py

    if [[ -f "${next_file:-}" ]]; then
        inc-migrations "${folder}" "${next_file}"
    fi
}
```
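The renumbering rule at the heart of the script (increment the numeric prefix while keeping the zero padding, mirroring printf "%04d") can be sketched in Python. The helper below is mine, purely for illustration:

```python
def bump_migration_name(filename: str) -> str:
    """Hypothetical helper: compute the renamed migration file name.

    The numeric prefix is incremented and kept zero-padded to four
    digits, eg 0012_mig.py becomes 0013_mig.py.
    """
    number, rest = filename.split("_", 1)
    return f"{int(number) + 1:04d}_{rest}"


print(bump_migration_name("0012_mig.py"))  # 0013_mig.py
```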
Create a widget with custom display
```python
from functools import partial

from django.contrib.admin.widgets import AdminIntegerFieldWidget
from django.utils.safestring import mark_safe
from django.utils.translation import ungettext_lazy


class BonusTimeWidget(AdminIntegerFieldWidget):
    UNITS_TO_TRANSLATIONS = {
        'month': partial(ungettext_lazy, '%(count)d month', '%(count)d free months'),
    }

    def __init__(self, *args, unit='month', **kwargs):
        super().__init__(*args, **kwargs)
        if unit not in self.UNITS_TO_TRANSLATIONS:
            supported_units = ','.join(self.UNITS_TO_TRANSLATIONS.keys())
            raise ValueError(
                f'{unit} is not a supported unit. Supported units are: {supported_units}',
            )
        self.bonus_translation = self.UNITS_TO_TRANSLATIONS[unit]

    def render(self, name, value, attrs=None):
        """Render the value in a custom span."""
        if value == 0:
            return ''
        text = self.bonus_translation(value) % {'count': value}
        return mark_safe(f'<span>+ {text}</span>')
```
Use an nginx reverse proxy in dev
The Django development server works well but can be slow when it has to handle many requests (to load many images, for instance). One way to solve this is to use a production web server (nginx in this case) to handle most of the work (ie everything but the dynamically generated pages).
Prerequisites:
- Install nginx
- Add the domain you want to use in your /etc/hosts file. For instance 127.0.0.1 myproject.localhost.
- Make sure nginx has access to the files of your project. Most of the time a chmod 755 /PATH/TO/PROJECT will do (repeat on each subdirectory nginx needs to traverse to reach your files). If you are using a shared computer, you may want a more secure way to grant nginx access to the files (ACLs may help you).
- If you are using SELinux, don't forget to add the proper context to the files. For instance, do something like:
- Add these files to the proper SELinux context by copying the one from the default web folder: semanage fcontext --add --equal /var/www/html /PATH/TO/PROJECT
- Restore the context of the files: restorecon -R /PATH/TO/PROJECT
- Check that the context is correct: ls -Z The output should contain something like system_u:object_r:httpd_sys_content_t:s0.
Here is the nginx configuration to put in /etc/nginx/conf.d (or /etc/nginx/sites-enabled):
```nginx
server {
    listen 80;
    # Make sure this host is in the ALLOWED_HOSTS variable in the settings.
    server_name PROJECT.localhost;

    root /PATH/TO/PROJECT;

    # Prevent access to pyc and py files.
    location ~ .*\.pyc? {
        return 404;
    }

    # Search for files in the media folder. Change this if you configured
    # Django to store your uploaded files elsewhere.
    location ~ ^/files/(.*) {
        try_files /media/$1 =404;
    }

    # Look for static files in the static folder or at the root of the project.
    location ~ ^/static {
        # Look both in the production static folder at the root of your project and in PROJECT
        # (where you have the apps directory and the static directory you use in dev).
        try_files /PROJECT/$uri /$uri $uri;
    }

    # Relay everything else to the django web server.
    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://127.0.0.1:8000;
    }
}
```
To add a connect timeout (eg to mimic Heroku's timeout) to the Django dev server, add the lines below in the location / block that proxies to the Django web server:
```nginx
proxy_connect_timeout 30;
proxy_send_timeout 30;
proxy_read_timeout 30;
send_timeout 30;
```
Checking PO files
```shell
#!/usr/bin/env bash
set -eu

for file in "$@"; do
    msgfmt -v --check "${file}"
done
```
Testing
Inserting pytest fixtures in a test case
```python
import logging
import unittest

import pytest

import spam  # The module under test.


class SpamTest(unittest.TestCase):
    @pytest.fixture(autouse=True)
    def inject_fixtures(self, caplog):
        self._caplog = caplog

    def test_eggs(self):
        with self._caplog.at_level(logging.INFO):
            spam.eggs()
        assert self._caplog.records[0].message == 'bacon'
```
Reset PK sequence
```python
from django.core.management.color import no_style
from django.db import connection


def reset_database_sequences(*models_or_factories):
    models = [
        model_or_factory._meta.model
        for model_or_factory in models_or_factories
    ]
    sequence_sql = connection.ops.sequence_reset_sql(no_style(), models)
    with connection.cursor() as cursor:
        for sql in sequence_sql:
            cursor.execute(sql)
```
Reset factoryboy sequence
```python
def reset_sequences(*factories):
    reset_database_sequences(*factories)
    for factory_cls in factories:
        factory_cls.reset_sequence()
```
View inter-app dependencies
Here is a small script to help you picture the dependencies between your Django apps. It does three things:
- It creates a graph of your models thanks to Django extensions.
- It creates a TSV file with all your URLs.
- It creates a text file with the dependencies between each app. This file is very bare-bones: it parses the imports of your Python files with a regexp. I didn't manage to get more evolved tools like pydeps to print just a graph between apps; I always got too much information. You can make CI fail when new dependencies are introduced so you can review them. For that, pass the ci argument to the script.
```shell
#!/usr/bin/env bash

set -eu
set -o pipefail

# This script requires extra deps (either pygraphviz or pyparsing and pydot).
# See: https://django-extensions.readthedocs.io/en/latest/graph_models.html

readonly my_project="my_project"

# We ignore the history app as it mirrors other apps. It doesn't have dependencies by itself.
readonly our_apps=$(ls ${my_project}/apps/ | grep -vE '__(\.py)?$' | grep -v history)

function does_app_depends_on_app() {
    local app="$1"
    local inner_app="$2"

    # We allow constants imports across apps so they can be defined where relevant.
    # Since constants don't import stuff from the project, these imports will never be an issue anyway.
    # To do that, we impose the imports to match the import of the app AND that we don't import constants.
    # See https://stackoverflow.com/a/6361362 for symbols details.
    # We can add # ignore-deps at the end of a line to ignore a manually validated dep.
    grep -RP --files-with-matches "(?=^(from|import).*${inner_app}( |\.|\n))(?=(?!((from|import).*${inner_app}.constants( |\.|\n)|from.*${inner_app} import constants)))(?=(?!.*# ignore-deps\$))" "${my_project}/apps/${app}" |
        grep -v __pycache__ |
        grep -v __test__ |
        grep -v __tests__ |
        grep -v "^${my_project}/apps/${app}/migrations/" |
        grep -v "^${my_project}/apps/${app}/admin/" |
        grep -v --quiet "^${my_project}/apps/${app}/factories/"
}

function compute_model_graph() {
    echo "Computing model graph"
    python manage.py graph_models \
        --no-inheritance \
        --exclude-models Group,Permission,AbstractUser \
        --group-models \
        --disable-fields \
        --output ${my_project}_models.svg \
        --output ${my_project}_models.png \
        ${our_apps}
    echo -e "\n"
}

function compute_urls_list() {
    echo "Computing URLs list"
    echo -e "URL\tview\treverse_name" > urls.tsv
    python manage.py show_urls --force-color |
        grep -v '^/__debug__' | # Django Debug Toolbar.
        grep -v 'admin:[a-z_]+$' | # Admin views generated by Django.
        grep -v 'django.views.static.serve$' | # View generated by Django to serve static content.
        grep -v '^/ckeditor' | # View for CKEditor (external app).
        grep -v '^/bk-team' >> urls.tsv # Extra admin views generated by Django (mostly redirections).
    echo -e "\n"
}

function compute_app_dependencies() {
    echo "Computing app dependencies"
    for app in ${our_apps}; do
        echo "${app} depends on these apps"
        for inner_app in ${our_apps}; do
            if [[ "${inner_app}" != "${app}" ]]; then
                if does_app_depends_on_app "${app}" "${inner_app}"; then
                    echo -e "\t${inner_app}"
                fi
            fi
        done
        echo -e "\n"
    done
}

function main() {
    case $1 in
        ci)
            compute_app_dependencies > ./scripts/app_deps.txt
            git diff --quiet --exit-code
            ;;
        *)
            compute_model_graph
            compute_urls_list
            compute_app_dependencies
            ;;
    esac
}

main "$@"
```
Use the true client IP
Sometimes, you will need to know the IP of the client: to track where the user connected from as a security feature, or to block an IP when it fails to connect too many times. If you don't have any reverse proxy, you can simply use request.META["REMOTE_ADDR"] to get the information. If you do, request.META["REMOTE_ADDR"] will contain the address of the proxy, not the client.
Luckily, if your proxy is correctly configured, it will append the address of its client to the X-Forwarded-For header. For nginx, this is done with proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;. You can then read request.META["HTTP_X_FORWARDED_FOR"] to get the IP, or set request.META["REMOTE_ADDR"] to the proper value (some libraries like Django user sessions can only read the IP from there). A naive and insecure way would be to do it in a middleware like this:
```python
class IpAddressMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Beware, this IP will generally NOT be trustworthy. It can be
        # tampered with by the user by setting the header manually on a
        # request made directly to the backend. In that case, we will get
        # the user supplied address.
        if request.META.get("HTTP_X_FORWARDED_FOR"):
            request.META["REMOTE_ADDR"] = (
                request.META["HTTP_X_FORWARDED_FOR"].split(",")[0].strip()
            )
        return self.get_response(request)
```
The problem is that the IP cannot be trusted. If all goes well and the user is trustworthy, you will get the correct IP. But the user can spoof the header with something like this:
```shell
curl -H 'X-Forwarded-For: SPOOFED_IP' https://example.com
```
After passing through the proxy, the header will look like this:
X-Forwarded-For: SPOOFED_IP, CLIENT_IP
So you would read the spoofed address. Since each proxy appends the address of the client it saw, you may think that instead of reading the first address, all you need to do is read the last one. If you have exactly one proxy, this works. If you have more, the header will look like CLIENT_IP, PROXY1_IP: the last address belongs to a proxy, not to the client. So it doesn't work.
Instead, what we need to do is add a configuration setting like REVERSE_PROXY_COUNT, set it to the proper number of proxies and use it to find the proper address. If you take into account health probes that may not go through the whole proxy stack, you can end up with something like this:
```python
import logging

from django.conf import settings

logger = logging.getLogger(__name__)


class IpAddressMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request.META.get("HTTP_X_FORWARDED_FOR"):
            x_forwarded_for = [
                ip.strip()
                for ip in request.META["HTTP_X_FORWARDED_FOR"].split(",")
            ]
            # Only probes can genuinely have this. Their requests are the
            # only ones that won't go through load balancing, only through
            # nginx. So they are the only ones with one proxy and thus with
            # a single address in x_forwarded_for. And they have a dedicated
            # user agent we can check for extra safety.
            is_probe = len(x_forwarded_for) == 1 and (
                request.headers.get("User-Agent", "").startswith("kube-probe/")
                or request.headers.get("User-Agent", "").startswith("GoogleHC/")
            )
            if len(x_forwarded_for) != settings.REVERSE_PROXY_COUNT and not is_probe:
                logger.error(
                    f"Expected {settings.REVERSE_PROXY_COUNT} addresses in "
                    f"X-Forwarded-For, got {len(x_forwarded_for)}. It can "
                    f"either be a configuration issue or an attack. Please check."
                )
            if len(x_forwarded_for) <= settings.REVERSE_PROXY_COUNT:
                remote_addr = x_forwarded_for[0]
            else:
                # We have some user supplied data in the header. Let's strip it.
                client_ip_index = len(x_forwarded_for) - settings.REVERSE_PROXY_COUNT
                remote_addr = x_forwarded_for[client_ip_index]
            logger.debug(
                f"Setting REMOTE_ADDR based on value from X-Forwarded-For header. "
                f"Changing from {request.META['REMOTE_ADDR']} to {remote_addr} "
                f"based on {x_forwarded_for}"
            )
            request.META["REMOTE_ADDR"] = remote_addr
        return self.get_response(request)
```
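Distilled from the middleware, the address-picking logic alone can be tested without Django. The function name and signature below are mine, purely for illustration:

```python
def get_client_ip(x_forwarded_for: str, proxy_count: int) -> str:
    """Pick the real client address out of an X-Forwarded-For value.

    Each trusted proxy appends exactly one address, so with proxy_count
    proxies the client address sits proxy_count entries from the end.
    Anything before it is user supplied and must be ignored.
    """
    addresses = [ip.strip() for ip in x_forwarded_for.split(",")]
    if len(addresses) <= proxy_count:
        # No user supplied data (or a probe): the first entry is the client.
        return addresses[0]
    return addresses[len(addresses) - proxy_count]


print(get_client_ip("SPOOFED_IP, CLIENT_IP, PROXY1_IP", 2))  # CLIENT_IP
```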
Note
This only works if you have more than one proxy. If you have only one, you will need to rely on IPs to identify the probes if needed, or make sure they don't go through the proxy so X-Forwarded-For won't be set.