Re: BUG #15720: `executor could not find named tuplestore ABC` in AFTER DELETE trigger referencing OLD TABLE as ABC

From: Jason Madden <jason(dot)madden(at)nextthought(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15720: `executor could not find named tuplestore ABC` in AFTER DELETE trigger referencing OLD TABLE as ABC
Date: 2019-03-29 12:42:38
Message-ID: 27B30574-291E-4C33-B89F-B690FB6CA492@nextthought.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


> On Mar 29, 2019, at 05:12, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Fri, Mar 29, 2019 at 4:49 AM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
>> SQL failure calling: SELECT * FROM wip_upload_finish(...)
>> ...
>> Caused by: org.postgresql.util.PSQLException: ERROR: executor could not find
>> named tuplestore "updated_positions"
>> Where: SQL statement "UPDATE w_instances pi
>> SET last_modified = NOW()
>> FROM updated_positions up
>> WHERE pi.object_id = up.p_instance_id"
>> PL/pgSQL function wip_update_AC_trigger_func() line 24 at SQL statement
>> SQL statement "
>> DELETE FROM wip
>> WHERE p_instance_id = (
>> SELECT object_id
>> FROM w_instances
>> WHERE c1 = $1 and c2 = $2
>> )
>> "

Hi!

>
> I haven't had time to look into this properly yet but a couple of
> ideas occurred to me just from the description: I wonder if there is a
> rare case of row movement in your workload (ie an update that moves a
> row from one partition to another), and we have a bug in the
> transition table code in that path.

In this application, the rows that make up `wip` are immutable. Rows may be added and rows may be deleted, but no individual row is ever updated so I think row movement can be ruled out.

> Then again you said you reran the job and it worked, so unless the data that drives that jobs changed
> underneath it, that might not work.

Data changing underneath between runs is entirely possible. This was during a testing phase where data was being migrated from a previous system into this one while people were also interactively using the system, potentially editing that same data as it arrived.

> Or perhaps some combination of ON
> CONFLICT, partitions and transition tables is going awry... What does
> wip_upload_finish() do?

At the point this error was thrown, it hadn't had a chance to do very much yet :)

We have a set of rows sitting in a temporary table, `tmp_wip`. Those rows are detail rows ultimately belonging to a master row (1-many) in the `w_instances` table, which in turn are detail rows (1-many) of a master row in the `w` table (all relationships are supported with foreign key constraints; `wip.p_instance_id` -> `w_instances.object_id` (primary key) -> `w.object_id` (primary key)). (Somewhat like Vehicle -> Model -> Manufacturer tables for a car registry.)

`wip_upload_finish()` accepts a set of parameters that point it to the row in the `w` table and the row in the `w_instances` table that the details in `tmp_wip` belong to. At this point in the procedure, it has made sure the row in the `w` table exists (INSERT INTO w ON CONFLICT DO UPDATE SET last_modified = NOW() RETURNING object_id). It's just about to make sure that the row in the `w_instances` table exists, but the user has specified that the incoming rows in `tmp_wip` are to completely replace any existing `wip` detail rows (rather than append to them), so before we do that, we go ahead and remove any existing `wip` rows with the `DELETE FROM wip` statement (which failed). (Why delete first? The incoming data may not be consistent with the old data we're replacing, so this avoids any temporary situation where we'd have constraint violations, since CHECK constraints can't be deferred.) Ultimately, if the procedure had continued, the `w_instance` row would have been created/updated and the temporary data from `tmp_wip` would have been moved into `wip` (which perhaps would have necessitated making detail entries in yet more tables).

`wip_update_AC_trigger_func` is used to maintain summary data in `w_instances` based on the details in `wip`. The fact that `wip_update_AC_trigger_func` fired means that there were rows in `wip` that got deleted (all of them for this particular `p_instance_id`, which would reside in one partition of `wip`), which in turn means that the row in `w_instances` existed already, and so the `FOR rec in SELECT DISTINCT p_instance_id FROM updated_positions LOOP` (the first statement in `wip_update_AC_trigger_func`, before the UPDATE that failed) should have executed exactly exactly once.

I hope that at least begins to answer your question. Thanks for thinking about this!

Jason

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-03-29 12:55:15 BUG #15724: Can't create foreign table as partition
Previous Message Stepan Yankevych 2019-03-29 12:37:20 Can create foreign table as parition