Re: Performance of COPY for Archive operations

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: <pgsql-hackers-win32(at)postgresql(dot)org>
Subject: Re: Performance of COPY for Archive operations
Date: 2004-09-15 20:16:29
Message-ID: NOEFLCFHBPDAFHEIPGBOEEIOCEAA.simon@2ndquadrant.com
Lists: pgsql-hackers-win32

> Bruce Momjian wrote
> Simon Riggs wrote:
> >
> > I've spent a while working with PITR functionality on the Win32 port.
> >
> > I noticed that *it works*, which is always great, but using a COPY
> > command the archival operation was significantly slower than the
> > writing of the xlogs themselves.
> >
> > At one point, I got to being more than 10 xlog files behind with the
> > list growing steadily, and it took a while to clear the logjam when my
> > test workloads completed. Not much point having archiving that's
> > actually slower than the writing of xlog....
>
> Why was it slow? 'cp' was slower than the WAL writes? Seems strange to
> me. Do we have some sleep loop in there that is causing us to read
> that directory too slowly? I didn't think so.
>

(Win32 COPY, not cp.)

Yes, it seemed strange, which is why I mentioned it. I saw nothing like that
on Linux.

When there are multiple files ready for archive, ARCHIVER loops until
they're all done. You're right, it could conceivably be something to do with
directory access speed, but I'm thinking that the NT COPY command itself has
some strangeness.
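
The loop described above can be sketched roughly as follows. This is only an
illustration, not the archiver's actual code: the real archiver is written in C
and runs the configured archive_command for each segment, whereas this sketch
swaps in Python's shutil.copyfile. The function name and the timing list are
hypothetical; the archive_status directory with .ready and .done marker files
follows the PostgreSQL convention.

```python
import shutil
import time
from pathlib import Path


def archive_ready_segments(xlog_dir, archive_dir):
    """Copy every segment flagged .ready, lowest file name first.

    Returns a list of (segment_name, seconds) so slow copies are visible.
    Hypothetical helper for illustration only.
    """
    xlog_dir, archive_dir = Path(xlog_dir), Path(archive_dir)
    status_dir = xlog_dir / "archive_status"
    archive_dir.mkdir(parents=True, exist_ok=True)
    timings = []
    # Keep going until no .ready markers remain, re-reading the directory
    # each time so segments that become ready mid-run are picked up too.
    while True:
        ready = sorted(status_dir.glob("*.ready"))
        if not ready:
            break
        marker = ready[0]
        segment = xlog_dir / marker.stem  # marker name minus ".ready"
        start = time.perf_counter()
        # Assumption: a plain file copy stands in for archive_command.
        shutil.copyfile(segment, archive_dir / segment.name)
        timings.append((segment.name, time.perf_counter() - start))
        # Mark the segment as archived so it is not copied again.
        marker.rename(marker.with_suffix(".done"))
    return timings
```

If the per-file times returned here are consistently longer than the time the
server takes to fill a segment, the backlog of .ready files can only grow.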

My test involved writing 1 million records, each larger than 4 kB, to a table
using an INSERT ... SELECT. The server had a single disk, but there's no reason
to expect that head movement on the disk would favour one process over another.
A single disk is probably the most common setup for people using the Windows
version anyway, so it is important.

I note also Mark Wong's recent large scale benchmark that showed less than a
1% overhead from archiving.

> > IIRC the COPY command isn't the best thing to use for bulk-copying on
> > Windows, but I can't remember what is better. Anybody?
>
> COPY is the fastest way to get data in and out of PostgreSQL.

Agreed, but I meant copying NT files around using the NT COPY command, not
the PostgreSQL COPY command.

I had some performance issues in '98 related to this - just hoping some
Win32 wiz will educate me...

...

More importantly, can anybody repeat this result? I performed this twice,
with the same results each time.
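
For anyone trying to repeat the measurement, one way to time a candidate copy
command in isolation is sketched below. It assumes the command is written in
the archive_command style, where %p stands for the path of the file to archive
and %f for its file name only. The function name is hypothetical, and only
those two placeholders are expanded here.

```python
import subprocess
import time
from pathlib import Path


def time_archive_command(template, segment_path):
    """Run an archive_command-style template once.

    Returns (exit_code, seconds). Hypothetical helper for illustration only.
    """
    segment_path = Path(segment_path)
    # Only %p (full path) and %f (file name) are expanded in this sketch.
    command = template.replace("%p", str(segment_path))
    command = command.replace("%f", segment_path.name)
    start = time.perf_counter()
    # shell=True so the template is interpreted by the platform's shell,
    # as a command typed at the prompt would be.
    result = subprocess.run(command, shell=True)
    return result.returncode, time.perf_counter() - start
```

On Windows this could be called with a template such as
'copy "%p" "C:\\archive\\%f"' (the target directory is just an example) against
a 16 MB segment file, and the result compared with the time the server needs to
fill one segment under the test workload.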

Thanks,

Best Regards, Simon Riggs
