Re: COPY speedup

From: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY speedup
Date: 2009-08-12 21:28:53
Message-ID: op.uykjafsocke6l8@soyouz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>> >> We don't touch datatype APIs
>> >> lightly, because it affects too much code.
>> >>
>> >>                        regards, tom lane
>> >
>> >        I definitely agree with that.

Actually, let me clarify:

When I modified the datatype API, I was feeling uneasy, like "I shouldn't
really touch this".
But when I see a big red button, I just press it to see what happens.
Ugly hacks are useful to know how fast the thing can go ; then the
interesting part is to reimplement it cleanly, trying to reach the same
performance...

>> Is there any way to do this that is not as invasive?
>
> Maybe add new methods, fastrecv/fastsend etc. Types that don't
> implement them would simply use the slow methods, maintaining
> backwards compatibility.

Well, this would certainly work, and it would be even faster.

I considered doing it like this, but it is a lot more work : adding
entries to the system catalogs, creating all the new functions, deciding
what to do with getTypeBinaryOutputInfo (since there would be 2 variants),
etc. Types that don't support the new functions would need some form of
indirection to call the old functions instead, etc. In a word, doable, but
kludgy, and I would need help from a system catalog expert. Also, on
upgrade, information about the new functions must be inserted in the
system catalogs ? (I don't know how this process works). If you want to
help...

The way I see COPY BINARY is that its speed should be really something
massive.
COPY foo FROM ... BINARY should be as fast as CREATE TABLE foo AS SELECT *
FROM bar (which is extremely fast).
COPY foo TO ... BINARY should be as fast as the disk allows.

Why else would anyone use a binary format if it's slower than portable
text ?

So, there are two other ways (besides fastsend/fastrecv) that I can see :

1- The way I implemented

I'm not saying it's the definitive solution : just a simple way to see how
much overhead is introduced by the current API, returning BYTEAs and
palloc()'ing every tuple of every row. I think this approach gave two
interesting answers :

- once COPY's output buffer has been made more efficient, with things like
removing fwrite() for every row etc (see patch), all that remains is the
API overhead, which is very significant for binary mode, since I could get
massive speedups (3-4x !) by bypassing it. The table scan itself is
super-fast.

- however, for text mode, it is not so significant, as the speedups
bypassing the API were roughly 0-20%, since most of the time is spent in
datum to text conversions.

Now, I don't think the hack is so ugly. It does make me feel a bit uneasy,
but :

- The context field in the fcinfo struct is there for a reason, so I used
it.
- I checked every place in the code where SendFunctionCall() appears
(which are quite few actually).
- The context field is never used for SendFuncs or ReceiveFuncs (it is
always set to NULL)

2- Another way

- palloc() could be made faster for short blocks
- a generous sprinkling of inline's
- a few modifications to pq_send*
- a few modifications to StringInfo
- bits of my previous patch in copy.c (like not fwriting every row)

This would be less fast, but you'd still get a substantial speedup.

As a conclusion, I think :

- Alvaro's fastsend/fastrecv provides the cleanest solutin
- Method 2 is the easiest, but slower
- Method 1 is an intermediate, but the use of the context field is a
touchy subject.

Also, I will work on COPY FROM ... BINARY. I should be able to make it
also much faster. This will be useful for big imports.

Regards,
Pierre

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2009-08-12 21:33:44 Re: Why is vacuum_freeze_min_age 100m?
Previous Message Tom Lane 2009-08-12 21:22:11 Re: Why is vacuum_freeze_min_age 100m?