Re: Fix checkpoint skip logic on idle systems by tracking LSN progress

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: david(at)pgmasters(dot)net
Cc: michael(dot)paquier(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
Date: 2016-09-30 05:05:02
Message-ID: 20160930.140502.142564417.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry, it wrote wrong thing.

At Fri, 30 Sep 2016 14:00:15 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20160930(dot)140015(dot)150178454(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> Sorry, I might have torn off this thread somehow..
>
> At Thu, 29 Sep 2016 11:26:29 -0400, David Steele <david(at)pgmasters(dot)net> wrote in <30095aea-3910-dbb7-1790-a579fb93fa5e(at)pgmasters(dot)net>
> > On 9/28/16 10:32 PM, Michael Paquier wrote:
> > > On Thu, Sep 29, 2016 at 7:45 AM, David Steele <david(at)pgmasters(dot)net>
> > > wrote:
> > >>
> > >> In general I agree with the other comments that this could end up
> > >> being
> > >> a problem. On the other hand, since the additional locks are only
> > >> taken
> > >> at checkpoint or archive_timeout it may not be that big a deal.
> > >
> > > Yes, I did some tests on my laptop a couple of months back, that has 4
> > > cores. After reducing NUM_XLOGINSERT_LOCKS from 8 to 4 to increase
> > > contention and performing a bunch of INSERT using 4 clients on 4
> > > different relations I could not catch a difference.. Autovacuum was
> > > disabled to eliminate any noise. I tried checkpoint_segments at 30s to
> > > see its effects, as well as larger values to see the impact with the
> > > standby snapshot taken by the bgwriter. Other thoughts are welcome.
> >
> > I don't have any better ideas than that.
>
> I don't see no problem in setting progressAt in XLogInsertRecord.
> But I doubt GetProgressRecPtr is harmful, especially when

But I suspect that GetProgressRecPtr could be harmful.

> NUM_XLOGINSERT_LOCKS is *large*. So reducing the number seems
> rather alleviates the impact. But it actually doesn't seem so
> harmful up to 8. (Even though I don't like the locking in
> GetProgressRecPtr..)
>
> Currently possiblly harmful calling of GetProgressRecPtr is that
> in BackgroundWriterMain. It should be called with ther interval
> BgWriterDelay, and anytime pgwriter recieved SIGUSR1. But I don't
> see the issuer of SIGUSR1 of bgwriter..
>
>
> > >> +++ b/src/backend/postmaster/checkpointer.c
> > >> + /* OK, it's time to switch */
> > >> + elog(LOG, "Request XLog Switch");
> > >>
> > >> LOG level seems a bit much here, perhaps DEBUG1?
> > >
> > > That's from Horiguchi-san's patch, and those would be definitely
> > > better as DEBUG1 by looking at it. Now and in order to keep things
> > > simple I think that we had better discard this patch for now. I was
> > > planning to come back to this thing anyway once we are done with the
> > > first problem.
> >
> > I still see this:
> >
> > +++ b/src/backend/postmaster/checkpointer.c
> > /* OK, it's time to switch */
> > + elog(LOG, "Request XLog Switch");
> >
> > > Well for now attached are two patches, that could just be squashed
> > > into one.
>
> Mmmm. Sorry, this was for my quite private instant debug, spilt
> outside.. But I don't mind to leave it with DEBUG2 if it seems
> useful.
>
> > Yes, I think that makes sense.
> >
> > More importantly, there is a regression. With your new patch the
> > xlogs are switching on archive_timeout again even with no changes.
> > The v12 worked fine.
>
> As Michael mentioned in this or another thread, it is another
> issue that he wants to solve separately. I personally doubt that
> this patch (v11 and v13) can be evaluated alone without it, but
> we can review this with the excessive switching problem, perhaps?
>
> > The differences are all in 0002-hs-checkpoints-v12-2.patch and as far
> > as I can see the patch does not work correctly without these changes.
> > Am I missing something?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-09-30 05:40:15 Re: [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"
Previous Message Kyotaro HORIGUCHI 2016-09-30 05:00:15 Re: Fix checkpoint skip logic on idle systems by tracking LSN progress