Re: FDW does not push down LIMIT & ORDER BY with sharding (partitions)

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Gert van Dijk <gertvdijk(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Subject: Re: FDW does not push down LIMIT & ORDER BY with sharding (partitions)
Date: 2019-07-09 04:56:03
Message-ID: CA+HiwqEk=-NjRL+Ro=9tnSchpSqgb2oUROcVmbt_k5cFb=fiPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

(I've added Fujita-san and David Rowley to this discussion as they
will have a better clue about some things I write below.)

On Tue, Jul 9, 2019 at 9:12 AM Gert van Dijk <gertvdijk(at)gmail(dot)com> wrote:
> First of all I want to thank Etsuro Fujita for implementing the exact
> feature I was missing in
> FDW 11.4, but now available in 12: pushing down of LIMIT & ORDER BY to
> foreign tables
> (commit d50d172e51). Now that I'm using PostgreSQL 12-beta2 from the
> official Docker
> image I noticed an omission that I wanted to report here which may be
> relevant for those
> like me, using FDW in a typical sharding setup.
>
> By querying purely foreign tables, I can confirm pushing down LIMIT &
> ORDER BY is
> working as expected on my installation.
> However, when I use a typical sharding setup where the main table is
> located on the FDW
> node, with partitions of foreign tables, this seems not to activate
> the new code path. I can
> understand that pushing this down is not possible in cases where
> *multiple* foreign tables
> are to be scanned. However, it also does not work in the case where my
> WHERE clause
> condition causes to only connect to a *single* foreign table.

As far as I can tell, LIMIT cannot be pushed below an Append or
MergeAppend that's used to combine the outputs of individual
partitions, which if I read correctly, you already know. It's true
that there's no Append/MergeAppend node in the *final* plan of your
example query, because there's only partition to be scanned after
pruning, but the Append/MergeAppend node remains in the plan through
the planning stage where LIMIT is added to the plan and only removed
in the final stage of planning. The final stage that removes the
Append/MergeAppend doesn't reassess whether the LIMIT on top (if any)
should be applied to the partition directly, which means the
partition's FDW never gets to see the LIMIT.

> (If this is already being worked on or discussed elsewhere on this or
> another list, please
> excuse me, it seems a bit hard to find relevant results searching the
> mailing list archives,
> and I'm fairly new to PostgreSQL in general too.)

I don't know of any ongoing work to address this either. The ability
to remove unnecessary Append/MergeAppend was also added in PG 12
(thanks to David and Tom Lane), along with LIMIT push down to foreign
server (thanks to Fujita-san), but the way former is implemented
prevents the two from working together. Maybe, there's a plan to fix
that in the future.

Thanks,
Amit

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Rares Salcudean 2019-07-09 06:53:26 Re: PG11 - Multiple Key Range Partition
Previous Message Michael Paquier 2019-07-09 04:43:41 Re: BUG #15899: Valgrind detects errors on create gist index