Re: making update/delete of inheritance trees scale better

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making update/delete of inheritance trees scale better
Date: 2020-05-13 07:02:35
Message-ID: CA+HiwqH5PH7c4N5DE1GOF1DAj3H7YquAQUdXphTz20kMqn_kAw@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 13, 2020 at 8:52 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Wed, 13 May 2020 at 00:54, Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Mon, May 11, 2020 at 8:11 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > > Per-row overhead would be incurred for every row, whereas the plan-time
> > > > overhead is one-time, or in the case of a prepared statement almost free.
> > > > So we need to compare the two, especially when there are 2000 partitions
> > > > and all of them are being updated.
> > >
> > > I assume that such UPDATEs would be uncommon.
> >
> > Yes, 2000 partitions being updated would be rare. But updating many
> > rows from the same partition may not be that rare. We have to know
> > how large that per-row overhead is and how many rows have to be
> > updated before it outweighs the planning-time savings. If that
> > number of rows is very large, we are good.
>
> Rows from a non-parallel Append should arrive in order. If you were
> worried about the performance of finding the correct ResultRelInfo for
> the tuple that we just got, then we could just cache the tableoid and
> ResultRelInfo for the last row, and if that tableoid matches on this
> row, just use the same ResultRelInfo as last time. That'll save
> doing the hash table lookup in all cases apart from when the Append
> moves on to the next child subplan.

That would be the more common case, yes. Not when a join is involved
though, because tuples for different result relations can then arrive
interleaved.
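
For the record, a minimal sketch of that caching idea (the fields and
the lookup helper here are made up, assuming the per-tuple lookup
otherwise goes through a hash table keyed by table OID):

static ResultRelInfo *
GetResultRelInfoForTuple(ModifyTableState *mtstate, Oid tableoid)
{
    /*
     * Fast path: consecutive tuples from a non-parallel Append tend to
     * come from the same child relation, so reuse the previous answer.
     */
    if (mtstate->mt_lastOid == tableoid)
        return mtstate->mt_lastResultRelInfo;

    /* Slow path: do the hash table lookup and remember the result. */
    mtstate->mt_lastResultRelInfo =
        LookupResultRelInfoByOid(mtstate, tableoid);    /* hypothetical */
    mtstate->mt_lastOid = tableoid;

    return mtstate->mt_lastResultRelInfo;
}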

> Not sure exactly how that'll fit
> in with the foreign table discussion that's going on here though.

The foreign table discussion is concerned with what the single
top-level targetlist should look like, given that different result
relations may require different row-identifying junk columns, due to
possibly belonging to different FDWs. Currently that's not something
to worry about, because each result relation gets its own plan and
hence its own targetlist.
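
To illustrate with a rough sketch (not from any patch; makeVar() and
makeTargetEntry() are the existing APIs, the surrounding logic is made
up): a plain table can be identified by its ctid system column, while
a foreign table's FDW typically asks for something else, e.g. a
whole-row image, so a shared targetlist would have to carry the union
of such junk columns:

    TargetEntry *tle;
    Var        *var;

    if (relkind == RELKIND_FOREIGN_TABLE)
    {
        /*
         * Many FDWs identify the remote row with a whole-row image
         * (varattno 0); postgres_fdw happens to use the remote ctid.
         */
        var = makeVar(rti, 0 /* whole row */, RECORDOID, -1,
                      InvalidOid, 0);
        tle = makeTargetEntry((Expr *) var, list_length(tlist) + 1,
                              pstrdup("wholerow"), true /* resjunk */);
    }
    else
    {
        /* Plain tables are identified by the ctid system column. */
        var = makeVar(rti, SelfItemPointerAttributeNumber, TIDOID, -1,
                      InvalidOid, 0);
        tle = makeTargetEntry((Expr *) var, list_length(tlist) + 1,
                              pstrdup("ctid"), true /* resjunk */);
    }
    tlist = lappend(tlist, tle);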

> Another option would be to not use tableoid and instead inject an INT4
> Const (0 to nsubplans) into each subplan's targetlist that serves as
> the index into an array of ResultRelInfos.

That may be a bit fragile, considering how volatile that number (the
result relation index) can be once you factor in run-time pruning, but
it may be worth considering.
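
For what it's worth, the planner side of that idea seems simple enough
with the existing makeConst()/makeTargetEntry() APIs (a sketch; the
junk column name is made up):

    /* Attach the subplan's index as a junk column for the executor. */
    Const      *idx = makeConst(INT4OID, -1, InvalidOid, sizeof(int32),
                                Int32GetDatum(subplan_index),
                                false /* isnull */, true /* byval */);
    TargetEntry *tle = makeTargetEntry((Expr *) idx,
                                       list_length(subplan->targetlist) + 1,
                                       pstrdup("resultrelindex"), /* made up */
                                       true /* resjunk */);

    subplan->targetlist = lappend(subplan->targetlist, tle);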

> As for which ResultRelInfos to initialize, couldn't we just have the
> planner generate an OidList of all the ones that we could need.
> Basically, all the non-pruned partitions.

Why would replacing the list of RT indexes with a list of OIDs be better?

> Perhaps we could even be
> pretty lazy about building those ResultRelInfos during execution too.
> We'd need to grab the locks first, but, without staring at the code, I
> doubt there's a reason we'd need to build them all upfront. That
> would help in cases where pruning didn't prune much, but due to
> something else in the WHERE clause, the results only come from some
> of the partitions.

Late ResultRelInfo initialization is worth considering, given that it
already works for tuple-routing target relations. I don't know why we
still initialize them all in InitPlan(); the only justification for
doing so that I know of is that it prevents lock upgrades, and I think
we discussed somewhat recently that that is not really a hazard.
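
A rough sketch of what that could look like (the function and the
array it consults are hypothetical; InitResultRelInfo() and
ExecGetRangeTableRelation() are the existing helpers):

static ResultRelInfo *
GetOrInitResultRelInfo(ModifyTableState *mtstate, Index rti)
{
    ResultRelInfo *rri = mtstate->resultRelInfos[rti];  /* hypothetical */

    if (rri == NULL)
    {
        EState     *estate = mtstate->ps.state;
        /* The lock was already taken when the plan was initialized. */
        Relation    rel = ExecGetRangeTableRelation(estate, rti);

        rri = makeNode(ResultRelInfo);
        InitResultRelInfo(rri, rel, rti, NULL, estate->es_instrument);
        mtstate->resultRelInfos[rti] = rri;
    }

    return rri;
}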

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
