Re: Huge memory consumption on partitioned table with FKs

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, keisuke kuroda <keisuke(dot)kuroda(dot)3862(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, tatsuhito(dot)kasahara(dot)rd(at)hco(dot)ntt(dot)co(dot)jp
Subject: Re: Huge memory consumption on partitioned table with FKs
Date: 2020-12-03 07:41:45
Message-ID: CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote <amitlangote09(at)gmail(dot)com> wrote in
> > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > I don't think you're missing something. As I (tried to..) mentioned in
> > > another branch of this thread, you're right. On the otherhand
> > > fk_attnums is actually used to generate the query, thus *issue*
> > > potentially affects the query. Although that might be paranoic, that
> > > possibility is eliminated by checking the congruency(?) of fk_attnums
> > > (or other members). We might even be able to share a riinfo among
> > > such children.
> >
> > Just to be sure I've not misunderstood you, let me mention why I think
> > the queries used by different partitions all turn out to be the same
> > despite attribute number differences among partitions. The places
> > that uses fk_attnums when generating a query are these among others:
> >
> > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > quoteOneName(attname,
> > RIAttName(fk_rel, riinfo->fk_attnums[i]));
> >
> > For the queries on the referencing side ("check" side),
> > type/collation/attribute name determined using the above are going to
> > be the same for all partitions in a given tree irrespective of the
> > attribute number, because they're logically the same column. On the
>
> Yes, I know that, which is what I meant by "practically" or
> "actually", but it is not explicitly defined AFAICS.

Well, I think it's great that we don't have to worry *in this part of
the code* about partition's fk_attnums not being congruent with the
root parent's, because ensuring that is the responsibility of the
other parts of the system such as DDL. If we have any problems in
this area, they should be dealt with by ensuring that there are no
bugs in those other parts.

> Thus that would be no longer an issue if we explicitly define that
> "When conpraentid stores a valid value, each element of fk_attnums
> points to logically the same attribute with the RI_ConstraintInfo for
> the parent constraint." Or I'd be happy if we have such a comment
> there instead.

I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
address this point, but the placement needs to be reconsidered:

@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
querysep = "WHERE";
for (int i = 0; i < riinfo->nkeys; i++)
{
+
+ /*
+ * We share the same plan among all relations in a partition
+ * hierarchy. The plan is guaranteed to be compatible since all of
+ * the member relations are guaranteed to have the equivalent set
+ * of foreign keys in fk_attnums[].
+ */
+
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);

A more appropriate place for this kind of comment would be where
fk_attnums is defined or in ri_BuildQueryKey() that is shared by
different RI query issuing functions.

> > referenced side ("restrict", "cascade", "set" side), as you already
> > mentioned, fk_attnums refers to the top parent table of the
> > referencing side, so no possibility of they being different in the
> > various referenced partitions' RI_ConstraintInfos.
>
> Right. (I'm not sure I have mention that here, though:p)A

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

> > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > among partitions, that would indeed look a bit more elaborate than the
> > patch we have right now.
>
> Maybe just letting the hash entry for the child riinfo point to the
> parent riinfo if all members (other than constraint_id, of course)
> share the exactly the same values. No need to count references since
> we don't going to remove riinfos.

Ah, something maybe worth trying. Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

> > > About your patch, it calculates the root constrid at the time an
> > > riinfo is created, but when the root-partition is further attached to
> > > another partitioned-table after the riinfo creation,
> > > constraint_root_id gets stale. Of course that dones't matter
> > > practically, though.
> >
> > Maybe we could also store the hash value of the root constraint OID as
> > rootHashValue and check for that one too in
> > InvalidateConstraintCacheCallBack(). That would take care of this
> > unless I'm missing something.
>
> Seems to be sound.

Okay, thanks.

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch application/octet-stream 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-12-03 07:55:30 Re: convert elog(LOG) calls to ereport
Previous Message Peter Smith 2020-12-03 07:21:05 Re: [HACKERS] logical decoding of two-phase transactions