Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Date: 2016-03-14 01:00:03
Message-ID: 1457917203.10232.2.camel@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 2016-03-13 at 18:46 -0400, Noah Misch wrote:
> On Thu, Mar 03, 2016 at 06:08:07AM +0100, Tomas Vondra wrote:
> >
> > On 02/03/2016 06:46 AM, Noah Misch wrote:
> > >
> > > On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> > > >
> > > > On 12/22/2015 03:49 PM, Noah Misch wrote:
> > > > >
> > > > > On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera
> > > > > wrote:
> > > > > >
> > > > > > I have pushed it now.  Further testing, of course, is
> > > > > > always welcome.
> > > > > While investigating stats.sql buildfarm failures, mostly on
> > > > > animals
> > > > > axolotl and shearwater, I found that this patch (commit
> > > > > 187492b)
> > > > > inadvertently removed the collector's ability to coalesce
> > > > > inquiries.
> > > > > Every PGSTAT_MTYPE_INQUIRY received now causes one stats file
> > > > > write.
> > > > > Before, pgstat_recv_inquiry() did:
> > > > >
> > > > > if (msg->inquiry_time > last_statrequest)
> > > > > last_statrequest = msg->inquiry_time;
> > > > >
> > > > > and pgstat_write_statsfile() did:
> > > > >
> > > > > globalStats.stats_timestamp = GetCurrentTimestamp();
> > > > > ... (work of writing a stats file) ...
> > > > > last_statwrite = globalStats.stats_timestamp;
> > > > > last_statrequest = last_statwrite;
> > > > >
> > > > > If the collector entered pgstat_write_statsfile() with more
> > > > > inquiries
> > > > > waiting in its socket receive buffer, it would ignore them as
> > > > > being too
> > > > > old once it finished the write and resumed message
> > > > > processing. Commit
> > > > > 187492b converted last_statrequest to a "last_statrequests"
> > > > > list that we
> > > > > wipe after each write.
> > So I've been looking at this today, and I think the attached patch
> > should do
> > the trick. I can't really verify it, because I've been unable to
> > reproduce the
> > non-coalescing - I presume it requires much slower system (axolotl
> > is RPi, not
> > sure about shearwater).
> >
> > The patch simply checks DBEntry,stats_timestamp in
> > pgstat_recv_inquiry() and
> > ignores requests that are already resolved by the last write (maybe
> > this
> > should accept a file written up to PGSTAT_STAT_INTERVAL in the
> > past).
> >
> > The required field is already in DBEntry (otherwise it'd be
> > impossible to
> > determine if backends need to send inquiries or not), so this is
> > pretty
> > trivial change. I can't think of a simpler patch.
> >
> > Can you try applying the patch on a machine where the problem is
> > reproducible? I might have some RPi machines laying around (for
> > other
> > purposes).
> I've not attempted to study the behavior on slow hardware.  Instead,
> my report
> used stat-coalesce-v1.patch[1] to simulate slow writes.  (That
> diagnostic
> patch no longer applies cleanly, so I'm attaching a rebased
> version.  I've
> changed the patch name from "stat-coalesce" to "slow-stat-simulate"
> to
> more-clearly distinguish it from the "pgstat-coalesce"
> patch.)  Problems
> remain after applying your patch; consider "VACUUM pg_am" behavior:
>
> 9.2 w/ stat-coalesce-v1.patch:
>   VACUUM returns in 3s, stats collector writes each file 1x over 3s
> HEAD w/ slow-stat-simulate-v2.patch:
>   VACUUM returns in 3s, stats collector writes each file 5x over 15s
> HEAD w/ slow-stat-simulate-v2.patch and your patch:
>   VACUUM returns in 10s, stats collector writes no files over 10s

Oh damn, the timestamp comparison in pgstat_recv_inquiry should be in
the opposite direction. After fixing that "VACUUM pg_am" completes in 3
seconds and writes each file just once.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
pgstat-coalesce-v2.patch text/x-patch 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Sewell 2016-03-14 01:16:44 Re: Parallel Aggregate
Previous Message Haribabu Kommi 2016-03-14 00:33:19 Re: Parallel Aggregate