Selecting a random row

From: Kari Lavikka <tuner(at)bdb(dot)fi>
To: pgsql-general(at)postgresql(dot)org
Subject: Selecting a random row
Date: 2004-11-04 11:36:14
Message-ID: Pine.HPX.4.51.0411041329040.3138@purple.bdb.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Hi!

I have to select a random row from a table where primary key isn't
continuous (some rows have been deleted). Postgres just seems to do
something strange with my method.

--
-- Use the order by desc limit 1 -trick to get maximum value
--
CREATE OR REPLACE FUNCTION max_uid() RETURNS int4 AS
'SELECT uid FROM users WHERE status = ''a'' ORDER BY uid DESC LIMIT 1'
LANGUAGE 'sql';

--
-- Choose a random point between 0 and max_uid and select the first
-- value from the bigger part
--
CREATE OR REPLACE FUNCTION random_uid() RETURNS int4 AS
'SELECT uid FROM users u WHERE u.status = ''a'' AND uid >=
cast(cast(max_uid() - 1 AS FLOAT) * random() AS INTEGER) ORDER BY uid
ASC LIMIT 1'
LANGUAGE 'sql';

--
-- testing and looks good
--
galleria=> SELECT max_uid();
max_uid
---------
126263

--
-- testing...
--
galleria=> SELECT random_uid(), random_uid(), random_uid(), random_uid(), random_uid();
random_uid | random_uid | random_uid | random_uid | random_uid
------------+------------+------------+------------+------------
322 | 601 | 266 | 427 | 369

... but what is this? Values seem to vary from 0 to ~1000.
Not from 0 to 126263!!

How about doing some manual work...

--
-- Testing split point selection
--
galleria=> SELECT cast(cast(max_uid() - 1 AS FLOAT) * random() AS INTEGER);
int4
-------
43279

--
-- And inserting split point manually
--
galleria=> SELECT uid FROM users u WHERE u.status = 'a' AND uid >= 43279
ORDER BY uid ASC LIMIT 1;
uid
-------
43284

Works just fine!

Is there any explanation for this strange behavior or are there better
ways to select a random row?

I'm using PG 8.0 b2. Plan for the query is:

Limit (cost=0.00..5.19 rows=1 width=4)
-> Index Scan using users_pkey on users u (cost=0.00..69145.26 rows=13329 width=4)
Filter: ((status = 'a'::bpchar) AND (uid >= ((((max_uid() - 1))::double precision * random()))::integer))

|\__/|
( oo ) Kari Lavikka - tuner(at)bdb(dot)fi
__ooO( )Ooo_______ _____ ___ _ _ _ _ _ _ _
""

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Simon Riggs 2004-11-04 12:19:00 Re: Restricting Postgres
Previous Message Markus Schaber 2004-11-04 10:30:56 Re: Avoiding explicit addDataType calls for PostGIS