RE: Speeding up creating UPDATE/DELETE generic plan for partitioned table into a lot

From: "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
To: "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, 'David Rowley' <david(dot)rowley(at)2ndquadrant(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Speeding up creating UPDATE/DELETE generic plan for partitioned table into a lot
Date: 2019-04-18 02:09:52
Message-ID: 25C1C6B2E7BE044889E4FE8643A58BA9725F5216@G01JPEXMBKW03
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, hackers.

I would like advice on how to design creating generic plans for child tables gradually.

The current generic plan is created for all child tables plan by specifying NULL in boundParams of pg_plan_queries().
Also, specifying a parameter in boundParams, custom plan is created, so pg_plan_queries() cannot create a generic plan of child tables gradually.

Therefore, before creating a plan for all child tables, I want to stop planning and resume planning using the parameters specified in EXECUTE.

At first I want opinion about the following.

- When does planner stop planning?
By comparing num_live_parts and partdesc->nparts in expand_partitioned_rtentry(), I think that it seems to be possible to judge whether planner stop planning in case of UPDATE or SELECT.

- What information does planner need to resume planning?
I think that caching PlannerInfo is good to resume plans, but I'm not confident. If there is another good way, please let me know.

- How to create generic plan for the target child table
If planner creates a custom plan and then convert Const node to Param node, planner may be create a generic plan for the target child table, but this is also less confident. If there is another good way, please let me know.

regards,
> -----Original Message-----
> From: Kato, Sho [mailto:kato-sho(at)jp(dot)fujitsu(dot)com]
> Sent: Wednesday, February 20, 2019 3:11 PM
> To: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>;
> 'David Rowley' <david(dot)rowley(at)2ndquadrant(dot)com>; Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
> Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
> Subject: RE: Speeding up creating UPDATE/DELETE generic plan for
> partitioned table into a lot
>
> Before addressing to speeding up creating generic plan of UPDATE/DELETE,
> I will begin with the speed up creating SELECT plan.
>
> I will explain the background as time has passed.
> Since generic plan creates plans of all partitions and is cached, we can
> skip planning and expect performance improvements.
> But, When a table is partitioned into thousands, it takes time to execute
> a generic plan for the first time because planner creates plans for all
> child tables including a child table that may not be accessed.
>
> Therefore, I would like to develop a method to gradually create a child
> table plan instead of creating and caching all child table plans at once
> at EXECUTE.
> I came up with a mechanism that caches the information like PlannerInfo
> -- necessary to create the plan and the plan and adds the access plan
> of the child table to the cached plan.
>
> However, I'm not sure that this can be realized and this is right, so
> I want an opinion.
> Also, I'd like advice if it would be better to create a new path for
> partitioning like "Partition Scan Path" or "Partition Index Scan Path".
>
> regards,
> Sho Kato
>
> > -----Original Message-----
> > From: Kato, Sho [mailto:kato-sho(at)jp(dot)fujitsu(dot)com]
> > Sent: Friday, February 1, 2019 5:16 PM
> > To: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>;
> 'David
> > Rowley' <david(dot)rowley(at)2ndquadrant(dot)com>; Amit Langote
> > <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
> > Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
> > Subject: Speeding up creating UPDATE/DELETE generic plan for
> > partitioned table into a lot
> >
> > Sorry, I lost previous mail[1].
> >
> > On Fri, 28 Dec 2018 at 20:36, Tsunakawa, Takayuki
> > <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> > > Although I may say the same thing as you, I think a natural idea
> > > would
> > be to create a generic plan gradually. The starting simple question
> > is "why do we have to touch all partitions at first?" That is, can
> we
> > behave like this:
> >
> > I also think creating a generic plan gradually is a better idea
> > because planner should create a plan when it is needed.
> > Any ideas?
> >
> > On 2018-12-31 08:57:04, David Rowley wrote
> > >I imagine the place to start looking would be around why planning is
> > so slow for that many partitions.
> >
> > As you may already know, FLATCOPY at range_table_mutator has a large
> > bottleneck.
> > Executing UPDATE, about npart squared RangeTblEntry is copied.
> > When I execute UPDATE to 100 partitioned table, FLATCOPY takes about
> > 100
> > * 0.067 ms while total planning time takes 12.689 ms.
> >
> > On 2018-12-31 08:57:04, David Rowley wrote
> > >Another possible interesting idea would be to, instead of creating
> > >large Append/MergeAppend plans for partition scanning, invent some
> > >"Partition Seq Scan" and "Partition Index Scan" nodes that are able
> > >to build plans more similar to scanning a normal table. Likely such
> > >nodes would need to be programmed with a list of Oids that they're
> to
> > >scan during their execution. They'd also need to take care of their
> > >own tuple mapping for when partitions had their columns in varying
> orders.
> >
> > Inventing some "Partition Seq Scan" and "Partition Index Scan" nodes
> > is interesting.
> > It seems easy to add Scan nodes to each partition gradually.
> >
> >
> [1]:CAKJS1f-y1HQK+VjG7=C==vGcLnzxjN8ysD5NmaN8Wh4=VsYipw(at)mail(dot)gmail(dot)c
> > om
> >
> > regards,
> >
> >
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-04-18 02:37:17 Pathological performance when inserting many NULLs into a unique index
Previous Message Michael Paquier 2019-04-18 01:14:30 REINDEX INDEX results in a crash for an index of pg_class since 9.6