Re: Huge memory consumption on partitioned table with FKs

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: amitlangote09(at)gmail(dot)com
Cc: alvherre(at)alvh(dot)no-ip(dot)org, keisuke(dot)kuroda(dot)3862(at)gmail(dot)com, tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp, pgsql-hackers(at)lists(dot)postgresql(dot)org, tatsuhito(dot)kasahara(dot)rd(at)hco(dot)ntt(dot)co(dot)jp
Subject: Re: Huge memory consumption on partitioned table with FKs
Date: 2020-12-03 05:28:57
Message-ID: 20201203.142857.1823750032495295791.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote <amitlangote09(at)gmail(dot)com> wrote in
> Hello,
>
> On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote <amitlangote09(at)gmail(dot)com> wrote in
> > > Hmm, I don't see what the problem is here, because it's not
> > > RI_ConstraintInfo that is being shared among partitions of the same
> > > parent, but the RI query (and the SPIPlanPtr for it) that is issued
> > > through SPI. The query (and the plan) turns out to be the same no
> > > matter what partition's RI_ConstraintInfo is first used to generate
> > > it. What am I missing?
> >
> > I don't think you're missing something. As I (tried to..) mentioned in
> > another branch of this thread, you're right. On the otherhand
> > fk_attnums is actually used to generate the query, thus *issue*
> > potentially affects the query. Although that might be paranoic, that
> > possibility is eliminated by checking the congruency(?) of fk_attnums
> > (or other members). We might even be able to share a riinfo among
> > such children.
>
> Just to be sure I've not misunderstood you, let me mention why I think
> the queries used by different partitions all turn out to be the same
> despite attribute number differences among partitions. The places
> that uses fk_attnums when generating a query are these among others:
>
> Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> ...
> Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> ...
> quoteOneName(attname,
> RIAttName(fk_rel, riinfo->fk_attnums[i]));
>
> For the queries on the referencing side ("check" side),
> type/collation/attribute name determined using the above are going to
> be the same for all partitions in a given tree irrespective of the
> attribute number, because they're logically the same column. On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Thus that would be no longer an issue if we explicitly define that
"When conpraentid stores a valid value, each element of fk_attnums
points to logically the same attribute with the RI_ConstraintInfo for
the parent constraint." Or I'd be happy if we have such a comment
there instead.

> referenced side ("restrict", "cascade", "set" side), as you already
> mentioned, fk_attnums refers to the top parent table of the
> referencing side, so no possibility of they being different in the
> various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mention that here, though:p)A

> On the topic of how we'd be able to share even the RI_ConstraintInfos
> among partitions, that would indeed look a bit more elaborate than the
> patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

> > > BTW, one problem with Kuroda-san's patch is that it's using
> > > conparentid as the shared key, which can still lead to multiple
> > > queries being generated when using multi-level partitioning, because
> > > there would be many (intermediate) parent constraints in that case.
> > > We should really be using the "root" constraint OID as the key,
> > > because there would only be one root from which all constraints in a
> > > given partition hierarchy would've originated. Actually, I had
> > > written a patch a few months back to do exactly that, a polished
> > > version of which I've attached with this email. Please take a look.
> >
> > I don't like that the function self-recurses but
> > ri_GetParentConstrOid() actually climbs up to the root constraint, so
> > the patch covers the multilevel partitioning cases.
>
> Sorry, I got too easily distracted by the name Kuroda-san chose for
> the field in RI_ConstraintInfo and the function.

I thought the same at the first look to the function.

> > About your patch, it calculates the root constrid at the time an
> > riinfo is created, but when the root-partition is further attached to
> > another partitioned-table after the riinfo creation,
> > constraint_root_id gets stale. Of course that dones't matter
> > practically, though.
>
> Maybe we could also store the hash value of the root constraint OID as
> rootHashValue and check for that one too in
> InvalidateConstraintCacheCallBack(). That would take care of this
> unless I'm missing something.

Seems to be sound.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-12-03 05:29:34 Re: should INSERT SELECT use a BulkInsertState?
Previous Message Pavel Stehule 2020-12-03 05:19:50 Re: [HACKERS] [PATCH] Generic type subscripting