Re: Feature suggestion: ON CONFLICT DO NOTHING RETURNING which returns existing row data

From: Pantelis Theodosiou <ypercube(at)gmail(dot)com>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Feature suggestion: ON CONFLICT DO NOTHING RETURNING which returns existing row data
Date: 2016-10-05 09:41:23
Message-ID: CAE3TBxy9FCdvJGqRqW1AD2-nb4m3ORreQAT7b-9cP3F4Q8JDoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This can be solved by chaining modifying CTEs.

Something like this (not tested) that can work with multiple rows inserted:

WITH
vals (bk1, bk2, other_t1_columns, other_t2_columns) AS
( VALUES (bk1val, bk2val, other_t1_values, other_t2_values),
(bk1val, bk2val, other_t1_values, other_t2_values)
),
ins_t1 AS
( INSERT INTO t1 (bk1, bk2, other columns)
SELECT bk1, bk2, other_t1_columns
FROM vals
ON CONFLICT (bk1val, bk2val) DO NOTHING
RETURNING id, bk1, bk2
)
INSERT INTO t2 (t1_id, other_t2_columns)
SELECT
COALESCE(t1.id, ins_t1,id),
val.bk1, val.bk2, val.other_t2_columns
FROM vals
LEFT JOIN ins_t1 ON (vals.bk1, vals.bk2) = (ins_t1.bk1, ins_t1.bk2)
LEFT JOIN t1 ON (vals.bk1, vals.bk2) = (t1.bk1, t1.bk2)
;

On Wed, Oct 5, 2016 at 1:53 AM, Tom Dunstan <pgsql(at)tomd(dot)cc> wrote:

> Hi all
>
> We recently moved to using 9.5 and were hoping to use the new upsert
> functionality, but unfortunately it doesn’t quite do what we need.
>
> Our setup is something like this:
>
> CREATE TABLE t1 (
> id BIGSERIAL NOT NULL PRIMARY KEY,
> bk1 INT,
> bk2 UUID
> — other columns
> );
> CREATE UNIQUE INDEX t1_bk ON t1 (bk1, bk2);
>
> CREATE TABLE t2 (
> t1_id BIGINT NOT NULL REFERENCES t1
> — other stuff
> );
>
> Data comes in as inserts of one tuple each of t1 and t2. We expect inserts
> to t1 to be heavily duplicated. That is, for stuff coming in we expect a
> large number of rows to have duplicate (bk1, bk2), and we wish to discard
> those, but not discard the t2 tuple - those should always be inserted and
> reference the correct t1 record.
>
> So we currently have an insertion function that does this:
>
> BEGIN
> INSERT INTO t1 (bk1, bk2, other columns)
> VALUES (bk1val, bk2val, other values)
> RETURNING id
> INTO t1_id;
> EXCEPTION WHEN unique_violation THEN
> SELECT id
> FROM t1
> WHERE bk1 = bk1val AND bk2 = bk2val
> INTO t1_id;
> END;
>
> INSERT INTO t2(t1_id, other columns) VALUES(t1_id, other values);
>
> We were hoping that we’d be able to do something like this:
>
> INSERT INTO t1 (bk1, bk2, other columns)
> VALUES (bk1val, bk2val, other values)
> ON CONFLICT (bk1val, bk2val) DO NOTHING
> RETURNING id
> INTO t1_id;
> INSERT INTO t2(t1_id, other columns) VALUES(t1_id, other values);
>
> But unfortunately it seems that the RETURNING clause returns null when
> there’s a conflict, rather than the existing row’s value.
>
> I understand that there is ambiguity if there were multiple rows that were
> in conflict. I think this sort of functionality really only makes sense
> where the conflict target is a unique constraint, so IMO it would make
> sense to only support returning columns in that case.
>
> I imagine that this would be possible to do more efficiently than the
> subsequent query that we are currently doing given that postgres has
> already found the rows in question, in the index at least. I have no idea
> how hard it would actually be to implement though. FWIW my use-case would
> be supported even if this only worked for indexes where the to-be-returned
> columns were stored in the index using Anastasia’s covering + unique index
> patch, when that lands.
>
> Thoughts?
>
> Tom
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-10-05 10:06:17 Re: Stopping logical replication protocol
Previous Message Daniel Gustafsson 2016-10-05 09:36:56 WIP: Secure Transport support as OpenSSL alternative on macOS