Re: Partial aggregates pushdown

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: Partial aggregates pushdown
Date: 2023-11-22 06:32:58
Message-ID: 8175ddeb6d417d8a1f91e667fef77abf@postgrespro.ru
Lists: pgsql-hackers

Robert Haas wrote on 2023-11-21 20:16:

>> > I don't think the patch does a good job explaining why HAVING,
>> > DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
>> > shouldn't really be a problem, because HAVING is basically a WHERE
>> > clause that occurs after aggregation is complete, and whether or not
>> > the aggregation is safe shouldn't depend on what we're going to do
>> > with the value afterward. The HAVING clause can't necessarily be
>> > pushed to the remote side, but I don't see how or why it could make
>> > the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
>> > little trickier: if we pushed down DISTINCT, we'd still have to
>> > re-DISTINCT-ify when combining locally, and if we pushed down ORDER
>> > BY, we'd have to do a merge pass to combine the returned values unless
>> > we could prove that the partitions were non-overlapping ranges that
>> > would be visited in the correct order. Although that all sounds
>> > doable, I think it's probably a good thing that the current patch
>> > doesn't try to handle it -- this is complicated already. But it should
>> > explain why it's not handling it and maybe even a bit about how it
>> > could be handled in the future, rather than just saying "well, this
>> > kind of thing is not safe." The trouble with that explanation is that
>> > it does nothing to help the reader understand whether the thing in
>> > question is *fundamentally* unsafe or whether we just don't have the
>> > right code to make it work.
>>
>> Makes sense.
>
> Actually, I think I was wrong about this. We can't handle ORDER BY or
> DISTINCT because we can't distinct-ify or order after we've already
> partially aggregated. At least not in general, and not without
> additional aggregate support functions. So what I said above was wrong
> with respect to those. Or so I believe, anyway. But I still don't see
> why HAVING should be a problem.

Hi. HAVING is also a problem. Consider the following query:

SELECT count(a) FROM t HAVING count(a) > 10

We can't push it down to the foreign server, because HAVING needs the
full aggregate result, and the foreign server, which computes only a
partial aggregate, doesn't know it.
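
To illustrate the point, here is a minimal sketch in Python rather than SQL. It assumes that t is split across two foreign servers and that the partial aggregate for count(a) is simply a per-server count; the shard contents and all function names are hypothetical and are not taken from the patch.

```python
# Hypothetical data: values of column "a" stored on two foreign servers.
shards = {
    "server1": [1, 2, 3, 4, 5, 6],
    "server2": [7, 8, 9, 10, 11, 12],
}

def partial_count(rows):
    """Partial aggregate for count(a): count non-NULL values on one server."""
    return sum(1 for a in rows if a is not None)

def combine(partials):
    """Combine step for count: add up the partial counts."""
    return sum(partials)

def having(total):
    """The HAVING predicate from the query: count(a) > 10."""
    return total > 10

# Correct plan: combine the partial counts locally, then apply HAVING.
partials = [partial_count(rows) for rows in shards.values()]
correct = [c for c in [combine(partials)] if having(c)]

# Incorrect plan: each server applies HAVING to its own partial count,
# so both partial results are discarded before they can be combined.
filtered = [p for p in partials if having(p)]
wrong = [c for c in [combine(filtered)] if having(c)]

print(correct)  # one row: the full count passes the predicate
print(wrong)    # no rows: the predicate saw only partial counts
```

Each server sees a count of 6, which fails the predicate, although the combined count of 12 passes it. So the HAVING clause has to be evaluated locally, after the partial results are combined.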

--
Best regards,
Alexander Pyhalov,
Postgres Professional
