Re: Efficient use of ON CONFLICT DO UPDATE with the PostgreSQL JDBC driver

From: Jorge Solórzano <jorsol(at)gmail(dot)com>
To: Christopher BROWN <brown(at)reflexe(dot)fr>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Efficient use of ON CONFLICT DO UPDATE with the PostgreSQL JDBC driver
Date: 2017-01-13 15:40:01
Message-ID: CA+cVU8PJcRfkzt=mXupApTJpt8SrUce0PTYU6Mh+NmfqZWKk2A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Sadly for better or worse (I think for worse), in the JDBC specification,
parameter markers are represented by positional "?" in the SQL string, so
you will need to send the data twice over the wire.

BTW this use case don't really seems to be of high impact on performance,
so adding an extension to the specification don't make sense IMO.

Jorge Solórzano
me.jorsol.com

On Fri, Jan 13, 2017 at 8:52 AM, Christopher BROWN <brown(at)reflexe(dot)fr> wrote:

> On 13 January 2017 at 15:20, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
>>
>> On 13 January 2017 at 06:01, Christopher BROWN <brown(at)reflexe(dot)fr> wrote:
>>
>>> Hello,
>>>
>>> I've at last had an opportunity to use PostgreSQL 9.6 (instead of 9.4)
>>> for a project, and have been able to use the ON CONFLICT DO UPDATE clause.
>>> As it can involve repeating parameter values (including, as is my case,
>>> large binary files), I'm wondering if there's any performance or
>>> resource-usage side effects (or workarounds) when doing something like this:
>>>
>>> INSERT INTO foo (c_uuid, file_data, file_name) VALUES (?, ?, ?)
>>> ON CONFLICT (c_uuid) DO UPDATE SET file_data = ?, file_name = ?
>>>
>>> On the Java side, that means invoking PreparedStatement::setBinaryStream
>>> TWICE for the same data.
>>>
>>> I'm not noticing side effects on localhost (but haven't profiled
>>> either). Is this fine, or a potential problem? Is there a better
>>> approach?
>>>
>>
>> So to summarize you would like some sort of optimization where we figured
>> out that two of the values were the same in the initial insert and the on
>> conflict and only send it once ?
>>
>>
>> Dave Cramer
>>
>> davec(at)postgresintl(dot)com
>> www.postgresintl.com
>>
>>
> To summarize, I'd like to avoid sending the same data twice, if that has
> any potentially significant performance issues and/or resource-usage
> issues. I don't know if that's the case or not.
>
> The SQL syntax (and the query workflow, and parameter order) is specific
> to PostgreSQL, so I'm not against something that varies from the basic
> "PreparedStatement" syntax. If there's a safe way of figuring out the
> intention, that'd be excellent. Maybe if the driver has "seen" the
> parameter set (via PreparedStatement::setXXX) in the INSERT part for field
> "X", and it "sees" the field "X" is unset in the ON CONFLICT part, or if
> you could have some sort of back-reference (either by index, e.g. $1, $2 or
> whatever, or by name if that's possible).
>
>

In response to

Browse pgsql-jdbc by date

  From Date Subject
Next Message Brad DeJong 2017-01-13 17:18:01 Re: Efficient use of ON CONFLICT DO UPDATE with the PostgreSQL JDBC driver
Previous Message Thomas Kellerer 2017-01-13 15:16:43 Re: Efficient use of ON CONFLICT DO UPDATE with the PostgreSQL JDBC driver