Skip site navigation (1) Skip section navigation (2)

Re: Result set granularity..

From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Result set granularity..
Date: 2003-09-29 15:12:52
Message-ID: 87u16veih7.fsf@stark.dyndns.tv (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-sql
Short answer, no there's no trick to doing this in postgres like rownum. You
would have to either add a column with sequential or random keys and then
select on it, or else pull down all the rows and only process the ones you
want.

Rownum is handy though, I wonder how easy it would be to add it to postgres.


"Rasmus Aveskogh" <rasmus(at)defero(dot)se> writes:

> SELECT <column> FROM <table> WHERE mod(rownum, 5) = 0;

Uhm. Does that actually work? I thought rownum only incremented for every row
actually returned. So that this would return one row and then stop returning
rows.

I would have thought you would have to use a subquery to get this to work
like:

SELECT * FROM (SELECT column, rownum AS n FROM table) WHERE mod(n,5)=0

> The query above would give me every fifth row of the original result set
> and would save me from sending the data over my database connection and do
> the lowering of the granularity in the application.

Also, in Oracle there's actually a SAMPLE keyword that you can put on a select
to tell oracle that you only need a sample. It's way more efficient than using
rownum because it skips whole blocks. Ie, the equivalent of above would be
'SAMPLE 20 PERCENT' or something like that, and it would read a whole block,
then skip 4 whole blocks.

However, for statistical purposes both of these techniques have downsides.
Consider the case where you're looking for the standard deviation of some
timing information and there's a spike every five minutes. The correlation
between the sampling and the event could create spurious results. You could
completely fail to see the events, or thing they are much worse than they are.
Or even that they're worse at some times of day when in fact they're
consistent.

Really what you want is to assign statistically random numbers, probably
floating point numbers, to each record, then read all records where those
numbers are in some range. Even that might not really be kosher for serious
statistics.

-- 
greg


In response to

pgsql-sql by date

Next:From: Christoph HallerDate: 2003-09-29 15:14:39
Subject: Re: now() in loop statement
Previous:From: TimoDate: 2003-09-29 14:27:47
Subject: Need to overcome UNION / ORDER BY restriction

pgsql-general by date

Next:From: E. Zorn (blue2)Date: 2003-09-29 15:19:37
Subject: Access - can't close Form
Previous:From: Tom LaneDate: 2003-09-29 15:08:33
Subject: Re: PostgreSQL SSL communication with SecureTcpClient (Ssl

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group