Re: [HACKERS] 8.3beta1 testing on Solaris

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] 8.3beta1 testing on Solaris
Date: 2007-11-15 20:49:27
Message-ID: 200711152049.lAFKnRg29115@momjian.us
Lists: pgsql-hackers pgsql-performance


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Jignesh K. Shah wrote:
>
> I changed CLOG Buffers to 16
>
> Running the test again:
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU ID FUNCTION:NAME
> 0 1027 :tick-5sec
>
> /export/home0/igen/pgdata/pg_clog/0024
> -2753028219296 1
> /export/home0/igen/pgdata/pg_clog/0025
> -2753028211104 1
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU ID FUNCTION:NAME
> 1 1027 :tick-5sec
>
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU ID FUNCTION:NAME
> 1 1027 :tick-5sec
>
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU ID FUNCTION:NAME
> 0 1027 :tick-5sec
>
> /export/home0/igen/pgdata/pg_clog/0025
> -2753028194720 1
>
>
> So Tom seems to be correct that it is a case of CLOG Buffer thrashing.
> But since I saw the same problem with two different workloads, I think
> the chance of people hitting this problem is pretty high.
>
> Also, I am a bit surprised that CLogControlLock did not show up as being
> hot. Maybe that is because not many writes are going on, or maybe because
> I did not trace all 500 users to see their hot lock status.
>
>
> Dmitri has another workload to test; I might try that out later on to
> see whether it has a similar impact.
>
> Of course, I haven't seen my throughput go up yet since I am already CPU
> bound. But this is good, since the number of IOPS to the disk is
> reduced (and hence the number of system calls).
>
>
> If I take this as my baseline number, can I then proceed to hunt other
> bottlenecks?
>
>
> What's the view of the community?
>
> Hunt down CPU utilization or lock waits next?
>
> Your votes are crucial on where I put my focus.
>
> Another thing Josh B told me to check out was the wal_writer_delay setting:
>
> I have tried two settings with almost equal performance (with the CLOG 16
> setting): one with 100ms and the other with the default of 200ms. Based on
> the runs, 100ms seemed slightly better than the default.
> (Plus, the risk of losing data is reduced from 600ms to 300ms.)
>
> Thanks.
>
> Regards,
> Jignesh
>
>
>
>
> Tom Lane wrote:
> > "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
> >
> >> So the ratio of reads vs writes to clog files is pretty huge..
> >>
> >
> > It looks to me that the issue is simply one of not having quite enough
> > CLOG buffers. Your first run shows 8 different pages being fetched and
> > the second shows 10. Bearing in mind that we "pin" the latest CLOG page
> > into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
> > older pages, so what we've got here is thrashing for the available
> > slots.
> >
> > Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.
> >
> > regards, tom lane
> >
>
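
Tom's explanation of the thrashing can be illustrated with a small simulation. This is only a sketch, not PostgreSQL's actual SLRU code: it assumes a plain LRU replacement policy, a strictly round-robin access pattern over 10 older CLOG pages (the count seen in the second run), and a pre-change buffer count of 8. The function name count_clog_reads and the page numbers are hypothetical.

```python
from collections import OrderedDict


def count_clog_reads(accesses, num_buffers, latest_page):
    """Count simulated disk reads for a sequence of CLOG page accesses.

    One buffer is treated as pinned to latest_page, so only
    num_buffers - 1 slots are available for older pages. Replacement
    among those slots is plain LRU (an assumption of this sketch).
    """
    if num_buffers < 2:
        raise ValueError("need at least one slot besides the pinned page")
    older_slots = num_buffers - 1
    cache = OrderedDict()  # page -> True, ordered from least to most recent
    reads = 0
    for page in accesses:
        if page == latest_page:
            continue  # pinned page is always resident
        if page in cache:
            cache.move_to_end(page)  # hit: mark as most recently used
            continue
        reads += 1  # miss: the page has to be read from disk
        if len(cache) >= older_slots:
            cache.popitem(last=False)  # evict the least recently used page
        cache[page] = True
    return reads


# Hypothetical workload: 10 older pages touched round-robin, 100 passes.
older_pages = list(range(10))
workload = older_pages * 100

print(count_clog_reads(workload, 8, latest_page=99))   # 1000: every access misses
print(count_clog_reads(workload, 16, latest_page=99))  # 10: only the first touch of each page
```

With 7 usable slots and 10 pages in rotation, every access evicts a page that will be needed again shortly, so the read count equals the access count. With 15 usable slots the whole working set fits, which matches the drop in pg_clog reads reported above.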

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
