Re: INSERT times - same storage space but more fields -> much slower inserts

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Matthew Wakeling <matthew(at)flymine(dot)org>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INSERT times - same storage space but more fields -> much slower inserts
Date: 2009-04-15 00:31:37
Message-ID: 49E52AE9.2080205@postnewspapers.com.au
Lists: pgsql-performance

Stephen Frost wrote:
> * Matthew Wakeling (matthew(at)flymine(dot)org) wrote:
>> On Tue, 14 Apr 2009, Stephen Frost wrote:
>>> Bacula should be using COPY for the batch data loads, so hopefully won't
>>> suffer too much from having the fields split out. I think it would be
>>> interesting to try doing PQexecPrepared with binary-format data instead
>>> of using COPY though. I'd be happy to help you implement a test setup
>>> for doing that, if you'd like.
>> You can always do binary-format COPY.
>
> I've never played with binary-format COPY actually. I'd be happy to
> help test that too though.

I'd have to check the source/a protocol dump to be sure, but I think
PQexecPrepared(...), while it takes binary arguments, actually sends
them over the wire in text form. PostgreSQL does have a binary protocol
as well, but it suffers from the same issues as binary-format COPY:

Unlike PQexecPrepared(...), binary-format COPY doesn't handle endian and
type size issues for you. You need to convert the data to the database
server's endianness and type sizes, but I don't think the PostgreSQL
protocol provides any way to find those out.
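
To make the conversion concrete, here is a minimal Python sketch (not from
the original message) of packing a 32-bit integer in an explicitly chosen
byte order, independent of the client's own. The helper name pack_int32 and
its byte_order labels are hypothetical. Which order the server actually
expects is the open question raised above, so this only shows that a client
can produce either layout deliberately.

```python
import struct
import sys

# Hypothetical helper: maps a label to a struct format for a signed 32-bit int.
_FORMATS = {
    "network": "!i",  # network byte order (big-endian)
    "big": ">i",      # big-endian, standard size
    "little": "<i",   # little-endian, standard size
    "native": "=i",   # this machine's byte order, standard 4-byte size
}


def pack_int32(value, byte_order="network"):
    """Pack a signed 32-bit integer using an explicitly chosen byte order."""
    return struct.pack(_FORMATS[byte_order], value)


if __name__ == "__main__":
    print("client byte order:", sys.byteorder)
    print("big-endian bytes:   ", pack_int32(1, "big").hex())
    print("little-endian bytes:", pack_int32(1, "little").hex())
```

Because the format string fixes both the byte order and the size, the result
does not depend on whether the client build is 32-bit or 64-bit.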

Binary-format COPY is OK if we're connected via a UNIX socket (and thus are
on the same host), though I guess a sufficiently perverse individual could
install a 32-bit Bacula+libpq and run a 64-bit PostgreSQL server, or even
vice versa.

It should also be OK when connected to `localhost' (127.0.0.0/8).

In other cases, binary-format COPY would be unsafe without some way to
determine remote endianness and sizeof(various types).
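
The same-host cases above could be checked roughly as follows. This is a
Python sketch under assumptions, not anything from Bacula or libpq: the
helper name is hypothetical, it accepts only a literal IP address or a
socket directory path, and it does no name resolution (so the string
"localhost" itself is not recognized). It relies on the libpq convention
that a host value beginning with a slash names a UNIX-domain socket
directory. As noted above, being on the same host still does not rule out a
32-bit client talking to a 64-bit server.

```python
import ipaddress


def looks_like_same_host(host):
    """Return True for a UNIX socket directory or a loopback IP literal."""
    # libpq treats a host beginning with "/" as a UNIX-domain socket directory.
    if host.startswith("/"):
        return True
    try:
        # Covers all of 127.0.0.0/8 for IPv4 and ::1 for IPv6.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname); this sketch does not resolve names.
        return False


if __name__ == "__main__":
    for candidate in ("/var/run/postgresql", "127.0.0.1", "192.168.1.10"):
        print(candidate, looks_like_same_host(candidate))
```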

--
Craig Ringer
