Django's select_related and prefetch_related

I was working on a personal website for my kids where I can post videos for them and I noticed that for each video link that was displayed on the page there were 2 additional SQL queries. One to get the username of the user that added the video link, and another to get the list of categories for the link.

The model looks like this:

class Link(models.Model):
    ...
    # There was an extra query per link to get the categories for a link
    category = models.ManyToManyField(Category)
    # There was an extra query per link to get the user.username
    user = models.ForeignKey(User)
    ...

The reason that an extra query was required to get the username is because the query to get all the links does not look at the user table at all. It doesn't do any joins. So it has to look up the username by doing something like "select username from users where users.pk == link.user_id". You get the idea. And it has to do this for every link on the page. This is easily fixed by doing

Link.objects.select_related().all()

or something similar. Django will then do a JOIN on the User table so it will now have all the user-related fields in the result set.

I was still getting a query for the category names on each link. To resolve this, I did:

Link.objects.select_related().all().prefetch_related('category')

This causes Django to do a fetch on the Category_Link table (the table that describes the many-to-many relationship) to get all the categories that match the primary keys for the links displayed on the current page. Then a join is done in Python, essentially mapping a list of category objects with each link object.

The number of queries for my home page was reduced from 44 down to 5. Page load time has been decreased by 1-2 orders of magnitude.

So, the naive rule-of-thumb might be to always use select_related() if you have a ForeignKey and you are going to reference fields in the ForeignKey in your template, and to always use prefetch_related if you have a ManyToManyField and you need information in the related table. For now I am going to NOT use these commands unless I first use the django-debug-toolbar and figure out what is going on. Then if it's necessary to use select_related or prefetch_related I will.

Tags:

Add new comment