Re: PG_DUMP very slow because of STDOUT ??

From: Steve Clark <sclark(at)netwolves(dot)com>
To: Andras Fabian <Fabian(at)atrada(dot)net>
Cc: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PG_DUMP very slow because of STDOUT ??
Date: 2010-07-13 11:36:36
Message-ID: 4C3C4FC4.9040306@netwolves.com
Lists: pgsql-general

On 07/13/2010 07:29 AM, Andras Fabian wrote:
> Now I have found an unorthodox way to make a slow machine run COPY-to-STDOUT fast. I empty the cache memory of the server, which makes "free" in "free -m" jump up to 14 GBytes (I just noticed that most of the memory on the server is in "cache" ... up to 22 GBytes). I just entered:
>
> "sync; echo 3 > /proc/sys/vm/drop_caches"
>
> Running the COPY-to-STDOUT test right after this, it went through in a snap (2 1/2 minutes). I also see that something related to the file is "nicely" mapped into cache memory, because as soon as I delete the file (with "rm"), 3 GBytes of cache are freed immediately.
>
> This seems to prove that a memory issue is/was behind the slowdown. But the question remains: why and how can this happen? I mean, at some point the memory manager must have made a very wrong decision, if this is the result of its "normal" work ... And how writing through the socket affects this, I don't understand (because I still see cases where a normal COPY-to-FILE did not slow down at the same time that COPY-to-STDOUT was crawling).
>
> So, potentially, as a quick fix, I could drop the caches in my backup script that starts each night. But is this a safe idea at all? Or could there be adverse implications (yes, of course, some queries that got their data from the file cache before would now need to repopulate it)?
>
> Or is there a way to influence the Linux memory manager so that it behaves a bit more conservatively (or just differently, in a positive way)?
>
> Andras Fabian
>
>
> -----Original Message-----
> From: Craig Ringer [mailto:craig(at)postnewspapers(dot)com(dot)au]
> Sent: Tuesday, 13 July 2010 12:51
> To: Andras Fabian
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??
>
> On 13/07/2010 6:26 PM, Andras Fabian wrote:
>> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I started my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file grew nicely (obviously no congestion), and in parallel see in "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
>> Now I "only" need to find out who is running out, and why, and how I can prevent that.
>
>> Could there be some extremely big STDOUT buffering in play ????
>
> Remember, "STDOUT" is misleading. The data is sent down the network
> socket between the postgres backend and the client connected to that
> backend. There is no actual stdio involved at all.
>
> Imagine that the backend's stdout is redirected down the network socket
> to the client, so when it sends to "stdout" it's just going to the
> client. Any buffering you are interested in is in the unix or tcp/ip
> socket (depending on how you're connecting), in the client, and in the
> client's output to file/disk/whatever.
>
> --
> Craig Ringer
>
Have you posted this problem to the Linux kernel mailing list? You may get some rude responses to
the posting, but someone will give you a tip on what is causing the problem.

I know there are some sysctls that affect memory management. I use the following two, based on a
recommendation from Linus, to help interactivity for a desktop user:
/sbin/sysctl vm.dirty_background_ratio=3
/sbin/sysctl vm.dirty_ratio=5
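
To get a feel for what those two percentages mean on a given machine, here is a minimal sketch. The helper name dirty_limit_mb is hypothetical (it is not part of any tool), and it assumes the ratio is applied to total RAM. The kernel actually applies it to available (free plus reclaimable) memory, so the results should be read as upper bounds. The 24576 MB figure is the 24 GByte server mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): estimate how many MB of dirty
# page cache a vm.dirty_* percentage allows for a given memory size in MB.
# Assumption: the percentage is applied to total RAM. The kernel really uses
# available (free + reclaimable) memory, so this is only an upper bound.
dirty_limit_mb() {
    mem_mb=$1
    ratio_pct=$2
    # Integer arithmetic; the fractional part is truncated.
    echo $(( mem_mb * ratio_pct / 100 ))
}

# Example for a 24 GByte machine (24 * 1024 MB) with the two settings above.
echo "background writeback starts near: $(dirty_limit_mb 24576 3) MB"
echo "writing processes throttled near: $(dirty_limit_mb 24576 5) MB"
```

The values currently in effect can be read with "/sbin/sysctl -n vm.dirty_ratio" and "/sbin/sysctl -n vm.dirty_background_ratio". Settings made on the command line like this do not survive a reboot unless they are also added to /etc/sysctl.conf.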

--
Stephen Clark
NetWolves
Sr. Software Engineer III
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve(dot)clark(at)netwolves(dot)com
www.netwolves.com
