Re: PG_DUMP very slow because of STDOUT ??

From: Andras Fabian <Fabian(at)atrada(dot)net>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PG_DUMP very slow because of STDOUT ??
Date: 2010-07-13 06:31:18
Message-ID: B1A1AD14D5F9D647BD2A00988C53B8220ACA3000@atradaex03.nbg.atrada.net
Lists: pgsql-general

Hi Scott,

Although I can't guarantee 100% that there was no RAID rebuild at some point, I am almost sure that wasn't the case. Two machines - the ones already in production - exhibited this problem, and both had already been up for some weeks. Also, the reboot "fixed" one of them instead of making it worse (as your theory would suggest): the problem "disappeared" (though I don't know for how long). Now only one of the production machines has the issue ... the one which wasn't rebooted. Strange, strange. Nevertheless, thank you for your idea ... this is exactly how I try to approach the problem, by forming theories and trying to prove or disprove them :-)
Now I will try to further investigate along the tips from Craig and Greg.

Andras Fabian

-----Original Message-----
From: Scott Marlowe [mailto:scott(dot)marlowe(at)gmail(dot)com]
Sent: Tuesday, July 13, 2010 03:43
To: Andras Fabian
Cc: Tom Lane; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] PG_DUMP very slow because of STDOUT ??

On Mon, Jul 12, 2010 at 7:03 AM, Andras Fabian <Fabian(at)atrada(dot)net> wrote:
> This STDOUT issue gets even weirder. Now I have set up our two new servers (identical hw/sw), as I would have needed to do anyway. After getting PG running, I also set up the same test scenario that I have on our problematic servers, and started the COPY-to-STDOUT experiment. And you know what? Both new servers are performing well. No hanging, and the 3 GByte test dump was written in around 3 minutes (as expected). To make things even more complicated ... I went back to our production servers. Now, the first one - which I froze up with oprofile this morning and which needed a REBOOT - is performing well too! It needed 3 minutes for the test case ... WTF? BUT, the second production server, which did not have a reboot, is still behaving badly.

I'm gonna take a scientific wild-assed guess that your machine was
rebuilding RAID arrays when you started out, and you had massive IO
contention underneath the OS level, resulting in such a slowdown.
Note that you mentioned ~5% IO Wait. That's actually fairly high if
you've got 8 to 16 cores or something like that. It's much better to
use iostat -xd 60 or something like that and look for IO Utilization
at the end of the lines.

Again, just a guess.
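
For anyone following along, here is a rough sketch of the two checks suggested above. It assumes the usual iostat -x layout, where a header line starts with "Device" and %util is the last column. The helper names (cores_waiting, busy_devices), the 80% threshold and the sample numbers are made up for illustration; they are not measurements from the servers in this thread.

```python
def cores_waiting(iowait_pct, n_cores):
    """Translate an averaged %iowait figure into 'core equivalents'.

    %iowait is averaged over all CPUs, so the same percentage means
    more stalled time in absolute terms on a box with more cores.
    """
    return iowait_pct * n_cores / 100.0


def busy_devices(iostat_text, threshold=80.0):
    """Return {device: highest %util seen} for devices at or above threshold.

    Expects the text output of something like `iostat -xd 60`.
    The threshold is an arbitrary illustrative value.
    """
    result = {}
    util_idx = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            util_idx = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            # Locate the %util column from the header instead of hard-coding it.
            util_idx = fields.index("%util") if "%util" in fields else None
            continue
        if util_idx is not None and len(fields) > util_idx:
            # Some locales print decimal commas; normalise before parsing.
            util = float(fields[util_idx].replace(",", "."))
            if util >= threshold:
                name = fields[0]
                result[name] = max(util, result.get(name, 0.0))
    return result


# Invented sample in the older sysstat column layout, for illustration only.
SAMPLE = """\
Device:  rrqm/s wrqm/s  r/s    w/s  rkB/s   wkB/s avgrq-sz avgqu-sz await svctm %util
sda        0.00  12.00 1.50 340.00   6.00 1400.00      8.2     4.10  12.0   2.9 98.70
sdb        0.00   0.50 0.10   2.00   0.40   10.00      9.9     0.01   1.2   0.8  0.20
"""

if __name__ == "__main__":
    print(cores_waiting(5, 16))   # 5% iowait spread over 16 cores
    print(busy_devices(SAMPLE))   # devices close to saturation
```

Read this way, ~5% IO wait on a 16-core machine is roughly 0.8 of one core doing nothing but waiting on disk, which is why the per-device %util column tends to be the more telling number. Keep in mind that the first report iostat prints covers the time since boot, so the later intervals are the ones to look at.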
