Re: Add proper planner support for ORDER BY / DISTINCT aggregates

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date: 2021-07-22 08:42:49
Message-ID: 4493999.1py09z8qHF@aivenronan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le jeudi 22 juillet 2021, 09:38:50 CEST David Rowley a écrit :
> On Thu, 22 Jul 2021 at 02:01, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> > I tested the 0001 patch against both HEAD and my proposed bugfix for
> > postgres_fdw.
> >
> > There is a problem that the ordered aggregate is not pushed down anymore.
> > The underlying Sort node is correctly pushed down though.
> >
> > This comes from the fact that postgres_fdw grouping path never contains
> > any
> > pathkey. Since the cost is fuzzily the same between the pushed-down
> > aggregate and the locally performed one, the tie is broken against the
> > pathkeys.
> I think this might be more down to a lack of any penalty cost for
> fetching foreign tuples. Looking at create_foreignscan_path(), I don't
> see anything that adds anything to account for fetching the tuples
> from the foreign server. If there was something like that then there'd
> be more of a preference to perform the remote aggregation so that
> fewer rows must arrive from the remote server.
>
> I tested by adding: total_cost += cpu_tuple_cost * rows * 100; in
> create_foreignscan_path() and I got the plan with the remote
> aggregation. That's a fairly large penalty of 1.0 per row. Much bigger
> than parallel_tuple_cost's default value.
>
> I'm a bit undecided on how much this patch needs to get involved in
> adjusting foreign scan costs. The problem is that we've given the
> executor a new path to consider and nobody has done any proper
> costings for the foreign scan so that it properly prefers paths that
> have to pull fewer foreign tuples. This is a pretty similar problem
> to what parallel_tuple_cost aims to fix. Also similar to how we had to
> add APPEND_CPU_COST_MULTIPLIER to have partition-wise aggregates
> prefer grouping at the partition level rather than at the partitioned
> table level.
>
> > Ideally we would add the group pathkeys to the grouping path, but this
> > would add an additional ORDER BY expression matching the GROUP BY.
> > Moreover, some triaging of the pathkeys would be necessary since we now
> > mix the sort-in- aggref pathkeys with the group_pathkeys.
>
> I think you're talking about passing pathkeys into
> create_foreign_upper_path in add_foreign_grouping_paths. If so, I
> don't really see how it would be safe to add pathkeys to the foreign
> grouping path. What if the foreign server did a Hash Aggregate? The
> rows might come back in any random order.

Yes, I was suggesting to add a new path with the pathkeys factored in, which
if chosen over the non-ordered path would result in an additional ORDER BY
clause to prevent a HashAggregate. But that doesn't seem a good idea after
all.

>
> I kinda think that to fix this properly would need a new foreign
> server option such as foreign_tuple_cost. I'd feel better about
> something like that with some of the people with a vested interest in
> the FDW code were watching more closely. So far we've not managed to
> entice any of them with the bug report yet, but it's maybe early days
> yet.

We already have that in the form of fdw_tuple_cost as a server option if I'm
not mistaken ? It works as expected when the number of tuples is notably
reduced by the foreign group by.

The problem arise when the cardinality of the groups is equal to the input's
cardinality. I think even in that case we should try to use a remote aggregate
since it's a computation that will not happen on the local server. I also
think we're more likely to have up to date statistics remotely than the ones
collected locally on the foreign tables, and the estimated number of groups
would be more accurate on the remote side than the local one.

--
Ronan Dunklau

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ronan Dunklau 2021-07-22 08:49:14 Re: ORDER BY pushdowns seem broken in postgres_fdw
Previous Message Peifeng Qiu 2021-07-22 08:39:53 Re: Kerberos delegation support in libpq and postgres_fdw