Re: MERGE ... WHEN NOT MATCHED BY SOURCE

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: MERGE ... WHEN NOT MATCHED BY SOURCE
Date: 2024-01-29 10:07:42
Message-ID: CAEZATCUg763qn2OX_fUKB-d8FxVVmJZg6jdSwcWvLOVS4g-FWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 26 Jan 2024 at 15:57, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> Well, firstly this is clearly a feature we want to have, even though
> it's non-standard, because people use it and other implementations have
> it. (Eh, so maybe somebody should be talking to the SQL standard
> committee about it). As for code quality, I didn't do a comprehensive
> review, but I think it is quite reasonable. Therefore, my inclination
> would be to get it committed soonish, and celebrate it widely so that
> people can test it soon and complain if they see something they don't
> like.
>

Thanks. I have been going over this patch again, and for the most
part, I'm pretty happy with it.

One thing that's bothering me though is what happens if a row being
merged is concurrently updated. Specifically, if a concurrent update
causes a formerly matching row to no longer match the join condition,
and there are both NOT MATCHED BY SOURCE and NOT MATCHED BY TARGET
actions, so that it's doing in full join between the source and target
relations. In this case, when the EPQ mechanism rescans the subplan
node, there will be 2 possible output tuples (one with source null,
and one with target null), and EvalPlanQual() will just return the
first one, which is a more-or-less arbitrary choice, depending on the
type of join (hash/merge), and (for a mergejoin) the values of the
inner and outer join keys. Thus, it may execute a NOT MATCHED BY
SOURCE action, or a NOT MATCHED BY TARGET action, and it's difficult
to predict which.

Arguably it's not worth worrying too much about what happens in a
corner-case concurrent update like this, when MERGE is already
inconsistent under other concurrent update scenarios, but I don't like
having unpredictable results like this, which can depend on the plan
chosen.

I think the best (and probably simplest) solution is to always opt for
a NOT MATCHED BY TARGET action in this case, so then the result is
predictable, and we can document what is expected to happen.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2024-01-29 10:30:00 Re: Documentation to upgrade logical replication cluster
Previous Message Hayato Kuroda (Fujitsu) 2024-01-29 09:57:44 RE: speed up a logical replica setup