From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Arne Roland <A(dot)Roland(at)index(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Subject: | Re: a misbehavior of partition row movement (?) |
Date: | 2021-12-20 13:10:30 |
Message-ID: | CA+HiwqEi9S2_1qcG-v98jPEEoNhs+Goonfz+Cjb4nqwf4CkcpQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 14, 2021 at 6:00 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Mon, Sep 20, 2021 at 3:32 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > The problem was that the tuplestore
> > (afterTriggers.query_stack[query_level].tuplestore) that I decided to
> > use to store the AFTER trigger tuples of a partitioned table that is
> > the target of an cross-partition update lives only for the duration of
> > a given query. So that approach wouldn't work if the foreign key
> > pointing into that partitioned table is marked INITIALLY DEFERRED. To
> > fix, I added a List field to AfterTriggersData that stores the
> > tuplestores to store the tuples of partitioned tables that undergo
> > cross-partition updates in a transaction and are pointed to by
> > INITIALLY DEFERRED foreign key constraints. I couldn't understand one
> > comment about why using a tuplestore for such cases *might not* work,
> > which as follows:
> >
> > * Note that we need triggers on foreign tables to be fired in exactly the
> > * order they were queued, so that the tuples come out of the tuplestore in
> > * the right order. To ensure that, we forbid deferrable (constraint)
> > * triggers on foreign tables. This also ensures that such triggers do not
> > * get deferred into outer trigger query levels, meaning that it's okay to
> > * destroy the tuplestore at the end of the query level.
> >
> > I tried to break the approach using various test cases (some can be
> > seen added by the patch to foreign_key.sql), but couldn't see the
> > issue alluded to in the above comment. So I've marked the comment
> > with an XXX note as follows:
> >
> > - * Note that we need triggers on foreign tables to be fired in exactly the
> > - * order they were queued, so that the tuples come out of the tuplestore in
> > - * the right order. To ensure that, we forbid deferrable (constraint)
> > - * triggers on foreign tables. This also ensures that such triggers do not
> > - * get deferred into outer trigger query levels, meaning that it's okay to
> > - * destroy the tuplestore at the end of the query level.
> > + * Note that we need triggers on foreign and partitioned tables to be fired in
> > + * exactly the order they were queued, so that the tuples come out of the
> > + * tuplestore in the right order. To ensure that, we forbid deferrable
> > + * (constraint) triggers on foreign tables. This also ensures that such
> > + * triggers do not get deferred into outer trigger query levels, meaning that
> > + * it's okay to destroy the tuplestore at the end of the query level.
> > + * XXX - update this paragraph if the new approach, whereby tuplestores in
> > + * afterTriggers.deferred_tuplestores outlive any given query, can be proven
> > + * to not really break any assumptions mentioned here.
> >
> > If anyone reading this can think of the issue the original comment
> > seems to be talking about, please let me know.
>
> I brought this up in an off-list discussion with Robert and he asked
> why I hadn't considered not using tuplestores to remember the tuples
> in the first place, specifically pointing out that it may lead to
> unnecessarily consuming a lot of memory when such updates move a bunch
> of rows around. Like, we could simply remember the tuples to be
> passed to the trigger function using their CTIDs as is done for normal
> (single-heap-relation) updates, though in this case also remember the
> OIDs of the source and the destination partitions of a particular
> cross-partition update.
>
> I had considered that idea before but I think I had overestimated the
> complexity of that approach so didn't go with it. I tried and the
> resulting patch doesn't look all that complicated, with the added
> bonus that the bulk update case shouldn't consume so much memory with
> this approach like the previous tuplestore version would have.
>
> Updated patches attached.
Patch 0001 conflicted with some pg_dump changes that were recently
committed, so rebased.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v11-0002-Enforce-foreign-key-correctly-during-cross-parti.patch | application/octet-stream | 58.0 KB |
v11-0001-Create-foreign-key-triggers-in-partitioned-table.patch | application/octet-stream | 41.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2021-12-20 13:17:16 | Re: simplifying foreign key/RI checks |
Previous Message | Masahiko Sawada | 2021-12-20 12:59:15 | Re: parallel vacuum comments |