Re: selecting random row values in postgres

From: Geoff Tolley <geoff(at)polimetrix(dot)com>
To: Tommy Gildseth <tommy(at)gildseth(dot)com>
Cc: Sumeet <asumeet(at)gmail(dot)com>, Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: selecting random row values in postgres
Date: 2007-02-24 03:33:28
Message-ID: 45DFB208.1010201@polimetrix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Tommy Gildseth wrote:
> Sumeet wrote:
>> Thanks Buddy, really appreciate ur help on this
>>
>> ....problem solved...
>>
>> Is there any way this query can be optimized...i'm running it on a
>> huge table with joins
>
> ORDER BY rand() is rather slow on large datasets, since the db has to
> actually generate a random value for each row in the table, before being
> able use it to sort by. Preferable ways to do this include f.ex:
> SELECT max(id) FROM table;
> SELECT * FROM table WHERE id > [rand value between 0 and max(id)] ORDER
> BY id LIMIT 1;

I'd have thought that in most cases the slowness would be due more to the
sort than the cost of generating the random numbers. Your pair of queries
neatly avoid doing the sort, but it does raise the question, does it matter
if not all rows have the same chance of being picked? As an extreme example
in the above, if there are id's 1-100 and id = 100000 in the table, the
last one will almost always be picked.

I wonder if the random aggregate that Josh Berkus wrote could be adapted to
return a record type rather than a random value of a single column? Its big
advantage is that it requires no sorting and all entries are equiprobable.
Unfortunately I'm a bit short of time at this end of a Friday to do much
but give you the URL:

http://www.powerpostgresql.com/Random_Aggregate

HTH,
Geoff

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message chrisj 2007-02-24 23:04:51 Re: can someone explain confusing array indexing nomenclature
Previous Message Tommy Gildseth 2007-02-23 20:45:27 Re: selecting random row values in postgres