Re: Selecting random rows efficiently

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Russell Garrett <rg(at)tcslon(dot)com>
Cc: PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Selecting random rows efficiently
Date: 2003-09-04 00:09:14
Message-ID: Pine.LNX.4.33.0309031806090.25524-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Can you just create an extra serial column and make sure that one is
always in order and no holes in it? (i.e. a nightly process, etc...)???

If so, then something like this truly flies:

select * from accounts where aid = (select cast(floor(random()*100000)+1 as int));

My times on it on a 100,000 row table are < 1 millisecond.

Note that you have to have a hole free sequence AND know how many rows
there are, but if you can meet those needs, this is screamingly fast.

On Sat, 30 Aug 2003, Russell Garrett wrote:

> Considering that we'd have to index the random field too, it'd be neater in
> the long term to re-number the primary key. Although, being a primary key,
> that's foreign-keyed from absolutely everywhere, so that'd probably take an
> amusingly long time.
>
> ...and no we're not from Micronesia, we're from ever so slightly less exotic
> London. Though Micronesia might be nice...
>
> Russ (also from last.fm but without the fancy address)
>
> pgsql-performance-owner(at)postgresql(dot)org wrote:
> > On Sat, 2003-08-30 at 09:01, Rod Taylor wrote:
> >>> i was hoping there was some trickery with sequences that would
> >>> allow me to easily pick a random valid sequence number..?
> >>
> >> I would suggest renumbering the data.
> >>
> >> ALTER SEQUENCE ... RESTART WITH 1;
> >> UPDATE table SET pkey = DEFAULT;
> >>
> >> Of course, PostgreSQL may have trouble with that update due to
> >> evaluation of the unique constraint immediately -- so drop the
> >> primary key first, and add it back after.
> >
> > And if there are child tables, they'd all have to be updated, too.
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashley Cambrell 2003-09-04 00:28:01 Re: [HACKERS] Using oids
Previous Message Larry Rosenman 2003-09-03 23:59:17 Re: Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

Browse pgsql-performance by date

  From Date Subject
Next Message Relaxin 2003-09-04 00:28:26 SELECT's take a long time compared to other DBMS
Previous Message Marc G. Fournier 2003-09-04 00:04:58 Re: FreeBSD page size