Django: Index Your Generic Foreign Keys
In Django, models may relate to another, generic model, as opposed to a concrete model, via a generic foreign key, which becomes self explanatory when looking at a sample definition:
class TaggedItem(models.Model):
tag = models.SlugField()
content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
object_id = models.PositiveBigIntegerField()
content_object = GenericForeignKey("content_type", "object_id")
A TaggedItem instance may be related to any other type of model, as opposed to being restricted to a specific type of model, say a Post. This makes the TaggedItem model quite flexible, allowing reuse across different purposes. The flexibility comes at the drawback of performance and implementation, typically.
In reviewing query performance at work, I noticed that a few of the most time consuming queries (where total time = num calls * mean time) involved generic foreign key filters. Running the queries through the postgres query explainer, I was surprised to discover that generic foreign keys are not indexed by default. That is, a query that searches for TaggedItems by their related content object like:
TaggedItem.objects.filter(content_type__pk=bookmark_type.id, object_id=bookmark_id)
will do a full table scan to find the relevant TaggedItem rows. If you're lucky, the query will leverage an index on content_type to filter down the rows to scan, which may or may not make an impact depending on the characteristics of the table (e.g. if there's only one content type in the table, there's nothing to be gained via that index).
Adding an index on content_type, object_id will avoid the scanning, reducing performance in my local test by ~300x.
class Meta:
indexes = [
models.Index(fields=["content_type", "object_id"]),
]
To Django's credit, this behavior is called out in the docs:
Unlike for the ForeignKey, a database index is not automatically created on the GenericForeignKey, so it’s recommended that you use Meta.indexes to add your own multiple column index. This behavior may change in the future.
though only in recent versions. If your project is older, like mine, or you weren't aware of this behavior and haven't been conscientiously adding in the indexes manually, then you may want to take a look at your queries for a potentially simple optimization.
May your generic foreign key searches be fast and efficient.