From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: Optimizer questions |
Date: | 2016-03-10 17:17:47 |
Message-ID: | 14904.1457630267@sss.pgh.pa.us |
Lists: | pgsql-hackers |
konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:
> But right now the rule for cost estimation makes it impossible to apply this optimization to simple expressions like this: ...
> I wonder whether there is any advantage to evaluating such simple expressions early if they are not needed for the sort?
Well, as I said, my patch was intentionally written to be conservative
about when to change the semantics from what it's been for the last
twenty years. I think there's a good argument for changing it for
volatile functions, but I'm less convinced that we should whack around
what happens in cases where there is not a clear benefit. It's fairly
likely that there are users out there who have (perhaps without even
knowing it) optimized their queries to work well with eval-before-sort
behavior, perhaps by applying data-narrowing functions. They won't
thank us for changing the behavior "just because".
If we had a more reliable idea of whether we are widening or narrowing
the sort data by postponing eval, I'd be willing to be more aggressive
about doing it. But without that, I think it's best to be conservative
and only change when there's a pretty clear potential benefit.
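To make the widening-versus-narrowing question concrete, here is a rough Python sketch (a toy model, not PostgreSQL code). It counts the non-key bytes a sort would have to carry when a target-list expression is evaluated before the sort versus after it. The functions `narrowing` and `widening`, the row shape, and the byte-count model are all illustrative assumptions.

```python
def sort_payload_bytes(rows, expr, eval_before_sort):
    """Total non-key bytes carried through the sort (toy model)."""
    total = 0
    for _key, payload in rows:
        # Early evaluation sorts the expression result; late evaluation
        # sorts the raw input and applies the expression afterwards.
        carried = expr(payload) if eval_before_sort else payload
        total += len(carried)
    return total

def narrowing(payload):
    # Hypothetical data-narrowing function, e.g. taking a short prefix.
    return payload[:8]

def widening(payload):
    # Hypothetical data-widening function: output is larger than input.
    return payload * 4

rows = [(k, b"x" * 100) for k in range(1000)]

for name, expr in [("narrowing", narrowing), ("widening", widening)]:
    early = sort_payload_bytes(rows, expr, eval_before_sort=True)
    late = sort_payload_bytes(rows, expr, eval_before_sort=False)
    print(f"{name}: early={early} bytes, late={late} bytes")
```

In this model postponing evaluation helps for the widening function and hurts for the narrowing one, which is why a reliable width estimate would matter before changing the default.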
> Also I do not completely understand your concern about window functions.
The point is just that we have to not disassemble WindowFunc nodes when
building the sort-input tlist, in the same way that we don't disassemble
Aggref nodes. If we are sorting the output of a WindowAgg, the WindowFunc
expressions have to appear as expressions in the WindowAgg's output tlist.
>> I think it's probably also broken for SRFs in the tlist; we need to work
>> out what semantics we want for those. If we postpone any SRF to after
>> the Sort, we can no longer assume that a query LIMIT enables use of
>> bounded sort (because the SRF might repeatedly return zero rows).
>> I don't have a huge problem with that, but I think now would be a good
>> time to nail down some semantics.
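A toy Python model of why that assumption breaks (not PostgreSQL code; `srf`, the function names, and the row values are hypothetical). A bounded sort keeps only the first LIMIT input rows by sort key. If a set-returning function is expanded after the sort and yields zero rows for those inputs, the result falls short of LIMIT even though later input rows would have produced output.

```python
import heapq

def srf(q1):
    """Hypothetical set-returning function: yields nothing for small q1."""
    return [] if q1 < 3 else [q1 * 10, q1 * 10 + 1]

def srf_then_sort(table, limit):
    # SRF evaluated before the sort: all output rows are sorted, then LIMIT.
    expanded = [(q1, v) for q1 in table for v in srf(q1)]
    return sorted(expanded, key=lambda r: r[0])[:limit]

def bounded_sort_then_srf(table, limit):
    # Invalid shortcut: keep only `limit` input rows, then expand the SRF.
    top = heapq.nsmallest(limit, table)
    return [(q1, v) for q1 in top for v in srf(q1)][:limit]

def full_sort_then_srf(table, limit):
    # Valid postponement: sort every input row, expand, stop at `limit`.
    out = []
    for q1 in sorted(table):
        for v in srf(q1):
            out.append((q1, v))
            if len(out) == limit:
                return out
    return out

table = [5, 1, 4, 2, 3]
print(srf_then_sort(table, 2))          # [(3, 30), (3, 31)]
print(bounded_sort_then_srf(table, 2))  # []
print(full_sort_then_srf(table, 2))     # [(3, 30), (3, 31)]
```

Only the full sort gives the same answer as evaluating the function first, so postponing it means giving up the bounded-sort optimization.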
As far as that goes, it seems to me after thinking about it that
non-sort-column tlist items containing SRFs should always be postponed,
too. Performing a SRF before sorting bloats the sort data vertically,
rather than horizontally, but it's still bloat. (Although as against
that, when you have ORDER BY + LIMIT, postponing SRFs loses the ability
to use a bounded sort.) The killer point though is that unless the sort
is stable, it might cause surprising changes in the order of the SRF
output values. Our sorts aren't stable; here's an example in HEAD:
# select q1, generate_series(1,9) from int8_tbl order by q1 limit 7;
 q1  | generate_series
-----+-----------------
 123 |               2
 123 |               3
 123 |               4
 123 |               5
 123 |               6
 123 |               7
 123 |               1
(7 rows)
I think that's pretty surprising, and if we have an opportunity to
provide more intuitive semantics here, we should do it.
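The effect can be mimicked outside the database. The following Python sketch is a toy model, not PostgreSQL's sort: `unstable_sort` is a hypothetical stand-in that returns equal keys in reversed input order, which is one of the orderings a non-stable sort is permitted to produce. The two-row `table` is also a simplification of int8_tbl. The sketch contrasts expanding a `generate_series`-like function before the sort with expanding it afterwards.

```python
def generate_series(lo, hi):
    # Stand-in for the SQL function: integers lo..hi inclusive.
    return range(lo, hi + 1)

def unstable_sort(rows, key):
    # Python's sorted() is stable, so sorting the reversed input returns
    # ties in reverse order: a legal outcome for a non-stable sort.
    return sorted(reversed(rows), key=key)

def srf_before_sort(table, limit):
    # Expand first, so the sort sees many rows with equal keys.
    expanded = [(q1, g) for q1 in table for g in generate_series(1, 9)]
    return unstable_sort(expanded, key=lambda r: r[0])[:limit]

def srf_after_sort(table, limit):
    # Sort the base rows, then expand; the series order is untouched.
    ordered = unstable_sort([(q1,) for q1 in table], key=lambda r: r[0])
    expanded = [(q1, g) for (q1,) in ordered for g in generate_series(1, 9)]
    return expanded[:limit]

table = [123, 4567890123456789]
print([g for _, g in srf_before_sort(table, 7)])  # [9, 8, 7, 6, 5, 4, 3]
print([g for _, g in srf_after_sort(table, 7)])   # [1, 2, 3, 4, 5, 6, 7]
```

With the function postponed, the sort never sees the series values, so it cannot reorder them.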
regards, tom lane