Re: BUG #6483: Rows being evaluated, although being outside the LIMIT / OFFSET boundaries

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: kouber(at)saparev(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6483: Rows being evaluated, although being outside the LIMIT / OFFSET boundaries
Date: 2012-02-22 21:13:03
Message-ID: CABRT9RA6BWuTV1PESSQ-x_5+3qUJMgz0NnVZyzbFyjwY1CQgmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Feb 22, 2012 at 20:53, <kouber(at)saparev(dot)com> wrote:
> -- shows a notice for 1, 2, 3 and 4
> SELECT x.id, f(x) FROM xxx as x LIMIT 2 OFFSET 2;

Currently the way to fix this is to use a subquery that acts as an
optimization barrier in the presence of OFFSET:
SELECT x.id, f(x) FROM (SELECT * FROM xxx as x LIMIT 2 OFFSET 2) as xxx;

> The rows of a SELECT statement are being evaluated, even when not shown in
> the final result, when using an OFFSET > 0. Although I know that LIMIT is
> imposed just before flushing the result set to the client, this behaviour
> seems quite confusing, especially when using DML statements in the field
> list of the SELECT itself.

Interesting, the model for evaluating queries is documented here:
http://www.postgresql.org/docs/9.1/static/sql-select.html

According to this model, evaluating SELECT clause fields for *all*
found rows is done in step 5, whereas LIMIT/OFFSET are only applied
later at step 9. So we're already bending the rules here (in general
we don't do such optimizations around volatile functions). The worst
thing is that it's inconsistent -- the LIMIT gets applied when
computing the SELECT list, but OFFSET doesn't.

In theory we could bend the model even more -- to push SELECT list
fields below the "Limit" node if they aren't referenced by ORDER and
there are no set operations. However, adapting the model to back to
this behavior seems rather impossible -- ORDER BY must be strictly
evaluated after SELECT list (it can refer to SELECT fields), and LIMIT
must be evaluated after ORDER BY, otherwise it makes no sense.

Or going the other way -- we could make it evaluate all rows if the
SELECT list if it contains volatile functions, and then apply the
LIMIT afterwards. That would go even more against "common sense", but
at least it would be "correct" :)

Regards,
Marti

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-02-22 21:40:32 Re: BUG #6483: Rows being evaluated, although being outside the LIMIT / OFFSET boundaries
Previous Message Tom Lane 2012-02-22 20:16:40 Re: BUG #6482: Service terminates executing "With ... as ... UPDATE" query