Re: Proposal: RETURNING primary_key()

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
Cc: "Igal (at) Lucee(dot)org" <igal(at)lucee(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ian Barwick <ian(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: RETURNING primary_key()
Date: 2016-03-11 08:40:06
Message-ID: CAMsr+YGH2RkxeEpOzTqb=Lsn_CDXwBE0b03=t81R4bt-wLUoCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11 March 2016 at 16:00, Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
wrote:

> Igal, thanks for the analysis.
>
> Craig>Part of the question for Pg is what exactly we should and should
> not be returning.
>
> I think the following might be a good starting point: return set of
> columns that would identify the inserted row(s).
> E.g. columns of any primary key would do. Columns of any unique key would
> do as well.
> "returning *" would do as well, however it would return excessive columns,
> thus it would be less efficient.
>
> I do not think it makes sense to tie "generated keys" to sequences or
> things like that.
>

That's why (sorry, Igal) I'd like to see some more tests for cases other
than identity columns. How is GENERATED ALWAYS handled, if supported? What
about if it's on a UNIQUE column? How about a PRIMARY KEY whose value is
assigned by a DEFAULT or by a trigger?

Based on the rather funky behaviour Igal found I suspect the answer will be
"nothing much" for all of those, i.e. it just doesn't work with other
drivers/vendors. But I'd like to know. I

> 2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146).
> Then client would be able to locate the row via "where col1=42 and col2=146
>

Yeah, I was wondering about composite PKs. I think Igal focused only on
generated synthetic keys, which are after all overwhelmingly common case
when getting generated keys.

3) If multiple unique keys present, it is fine if Pg returns one or the
> another depending on the phase of the moon. Yet more compact key would be
> preferable to save on bandwidth.
>

I disagree there. Behavour must be well-defined and predictable unless it's
really unavoidable.

> I think naming the resulting column(s) like "generated_key" /
> "generated_keys" does not make much sense. Especially, for multi-column
> keys.
>

Yeah. At least in PgJDBC where it's a separate resultset (IIRC), so you
have metadata that means you don't have to guess column names etc.

> If ctid was update-proof, it could could do. Unfortunately, ctid might
> easily get changed.
>

Indeed. Especially since many of the apps that want to fetch generated keys
will be connection-pool oriented apps doing optimistic concurrency control
- ORMs and the like. So they won't be able to hold the transaction that
added the row open (to hold locks and protect against vacuum) while
fetching more info about the row. That'd be quite undesirable for
performance anyway, since it'd force at least one extra round-trip; you
couldn't pipeline the query for more info about the row until you knew the
ctid of the inserted row.

using ctid is a nonstarter IMO, at least as far as the client goes.

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray"
> kind of API. ORMs should know the column names of the primary keys => ORMs
> should use "return specific column names" API, not just "return something
> generated".
>

Yep. There are many "should"s. I absolutely agree that this is one of them.

One reason some clients do it this way is that earlier versions of the JDBC
API didn't have the String[] generatedKeys form of prepareStatement. So
they had to cope with not being able to ask for specific cols and getting
whatever the DB handed them.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2016-03-11 08:40:23 Re: Refectoring of receivelog.c
Previous Message Bruce Momjian 2016-03-11 08:36:45 Re: The plan for FDW-based sharding