Django select_related and prefetch_related

In Django, select_related and prefetch_related are designed to stop the deluge of database queries that are caused by accessing related objects.

Example Models:

from django.db import models

class Publisher(models.Model):
    name = models.CharField(max_length=300)

    def __str__(self):
        return self.name


class Book(models.Model):
    name = models.CharField(max_length=300)
    price = models.IntegerField(default=0)
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)

    class Meta:
        default_related_name = 'books'

    def __str__(self):
        return self.name


class Store(models.Model):
    name = models.CharField(max_length=300)
    books = models.ManyToManyField(Book)

    class Meta:
        default_related_name = 'stores'

    def __str__(self):
        return self.name

Now, we need to insert some data into our models. so that we can test.

To load data lets write managements command, so that we have to run just one command to insert the data.

python manage.py load_items
import random
from django.core.management.base import BaseCommand
from apps.bookstore.models import Publisher, Store, Book

class Command(BaseCommand):
    """
    This command is for inserting Publisher, Book, Store into database.
    Insert 5 Publishers, 100 Books, 10 Stores.
    """

    def handle(self, *args, **options):
        Publisher.objects.all().delete()
        Book.objects.all().delete()
        Store.objects.all().delete()

        # create 5 publishers
        publishers = [Publisher(name=f"Publisher{index}") for index in range(1, 6)]
        Publisher.objects.bulk_create(publishers)

        # create 20 books for every publishers
        counter = 0
        books = []
        for publisher in Publisher.objects.all():
            for i in range(20):
                counter = counter + 1
                books.append(Book(name=f"Book{counter}", price=random.randint(50, 300), publisher=publisher))

        Book.objects.bulk_create(books)

        # create 10 stores and insert 10 books in every store
        books = list(Book.objects.all())
        for i in range(10):
            temp_books = [books.pop(0) for i in range(10)]
            store = Store.objects.create(name=f"Store{i+1}")
            store.books.set(temp_books)
            store.save()

To measure the execution time and the number of queries executed in a function. We are going to use decorator.

from django.db import connection, reset_queries
import time
import functools

def query_debugger(func):

    @functools.wraps(func)
    def inner_func(*args, **kwargs):

        reset_queries()

        start_queries = len(connection.queries)

        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()

        end_queries = len(connection.queries)

        print(f"Function : {func.__name__}")
        print(f"Number of Queries : {end_queries - start_queries}")
        print(f"Finished in : {(end - start):.2f}s")
        return result

    return inner_func

Select Related

We use select_related when the object that you’re going to select is a single object, which means forward ForeignKey, OneToOne and backward OneToOne.

select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query.

Example:

@query_debugger
def book_list():

    queryset = Book.objects.all()

    books = []
    for book in queryset:
        books.append({'id': book.id, 'name': book.name, 'publisher': book.publisher.name})

    return books

After running this function, the output shows:

Function :  book_list
Number of Queries : 101
Finished in : 0.08s

One query for populating all the books and, while iterating each time, we access the foreign key publisher that another separate query executes.

Let’s modify the query with select_related as follows and see what happens.

After running this function, the output shows:

Function :  book_list_select_related
Number of Queries : 1
Finished in : 0.02s

Isn’t it amazing? This query reduced 101 to 1. This is what select_related does.

Prefetch Related

We use prefetch_related when we’re going to get a set of things.

That means forward ManyToMany and backward ManyToMany, ForeignKey. prefetch_related does a separate lookup for each relationship, and performs the “joining” in Python.

It is different from select_related, the prefetch_related made the JOIN using Python rather than in the database.

Let’s dig into it with an example.

@query_debugger
def store_list():

    queryset = Store.objects.all()

    stores = []

    for store in queryset:
        books = [book.name for book in store.books.all()]
        stores.append({'id': store.id, 'name': store.name, 'books': books})

    return stores

After running this function, the output shows:

Function :  store_list
Number of Queries : 11
Finished in : 0.02s

We have 10 stores in the database and each store has 10 books. What’s happening here is one query for fetching all the stores and, while iterating through each store, another query is executing when we access the ManyToMany field books.

Let’s reduce the number of queries using prefetch_related.

@query_debugger
def store_list_prefetch_related():

    queryset = Store.objects.prefetch_related('books')

    stores = []

    for store in queryset:
        books = [book.name for book in store.books.all()]
        stores.append({'id': store.id, 'name': store.name, 'books': books})

    return stores

After running this function, the output shows:

Function : store_list_prefetch_related
Number of Queries : 2
Finished in : 0.01s

Query performance improved here, 11 to 2 queries. I want you to understand what prefetch_related is doing here.

Let’s take another example for prefetch_related.

In the management command code, I set the book price randomly from 50 to 300. Now, we will find expensive books (price 250 to 300) in every store.

@query_debugger
def store_list_expensive_books_prefetch_related():

    queryset = Store.objects.prefetch_related('books')

    stores = []
    for store in queryset:
        books = [book.name for book in store.books.filter(price__range=(250, 300))]
        stores.append({'id': store.id, 'name': store.name, 'books': books})

    return stores

After running this function, the output shows:

Function :  store_list_expensive_books_prefetch_related
Number of Queries : 12
Finished in : 0.05s

Despite the fact that we are using prefetch_related, our queries increased rather than decreased. But why?

Using prefetch related, we are telling Django to give all the results to be JOINED, but when we use the filter(price__range=(250, 300)), we are changing the primary query and then Django doesn’t JOIN the right results for us.

This is the reason why we have 12 queries, 11 queries iterating over the stores and one query to get all the results in prefetch.

Let’s solve the problem with Prefetch.

@query_debugger
def store_list_expensive_books_prefetch_related_efficient():

    queryset = Store.objects.prefetch_related(
        Prefetch('books', queryset=Book.objects.filter(price__range=(250, 300))))

    stores = []
    for store in queryset:
        books = [book.name for book in store.books.all()]
        stores.append({'id': store.id, 'name': store.name, 'books': books})

    return stores

After running this function, the output shows:

Function :  store_list_expensive_books_prefetch_related_efficient
Number of Queries : 2
Finished in : 0.03s

Explore More Django Posts

Efficient Django Project Settings with Split Settings Library

Learn how to efficiently manage your Django project settings with the Split Settings library. Use environment variables, keep sensitive information i…

Read More
Integrating Flake8 with Django: Best Practices

Learn how to integrate Flake8 with Django projects and enforce code quality. Follow our step-by-step guide and optimize your Django workflow with Fla…

Read More
Django Authentication and Authorization with JWT

Learn how to implement JSON Web Token (JWT) based authentication and authorization in Django web applications with step-by-step guide and code exampl…

Read More
Best Practices for Django Development: Tips and Tricks

Learn the best practices for Django development, including project structure, code organization, testing, and deployment. Build high-quality web apps.

Read More
Django Middleware: Tips, Tricks and Examples

Learn how to use Django Middleware to improve your app's performance and security. Includes examples and best practices.

Read More
Django Production Deployment: Best Practices & Checklist

Learn the best practices and checklist for deploying a Django application to production. Includes tips on web servers, databases, caching, security, …

Read More