Performance of ORDER BY RANDOM to select random rows?

From: Victor Hooi <victorhooi(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Performance of ORDER BY RANDOM to select random rows?
Date: 2013-08-08 02:01:17
Message-ID: CAMnnoU+K5dh_atGRAXPgYGhzxT5Spfs1ZMPRz8AmZ5CPeFaixQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I have a Django application where we need to pull random rows out of a
table.

According to the Django documentation:

https://docs.djangoproject.com/en/dev/ref/models/querysets/#order-by

Note: order_by('?') queries may be expensive and slow, depending on the
> database backend you’re using.

My understanding is that order_by('?') in the Django ORM will generate a
query with ORDER BY RANDOM.

This blog post:

http://www.peterbe.com/plog/getting-random-rows-postgresql-django

also seems to suggest that using ORDER BY RANDOM() will perform poorly on
Postgres.

I'm just wondering if this is still the case?

I just ran those benchmarks on my system (Postgres 9.2.4), and using ORDERY
BY RANDOM did not seem substantially to generating random integers in
Python and picking those out (and handling non-existent rows).

Has Postgres's behaviour for ORDER BY RANDOM change sometime recently?

Cheers,
Victor

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sergey Konoplev 2013-08-08 02:11:33 Re: Self referencing composite datatype
Previous Message Sergey Konoplev 2013-08-08 01:54:24 Re: Pl/Python runtime overhead