Re: Patch for fail-back without fresh backup

From:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Patch for fail-back without fresh backup
Date:	2014-01-16 19:01:29
Message-ID:	CAMkU=1yg=jdmGpN-pj2pGBy1++EAkag=5w6y4hpG-QQacQtMLg@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Jan 16, 2014 at 9:37 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2014-01-16 09:25:51 -0800, Jeff Janes wrote:
> > On Thu, Nov 21, 2013 at 2:43 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >
> > > On 2013-11-21 14:40:36 -0800, Jeff Janes wrote:
> > > > But if the transaction would not have otherwise generated WAL (i.e. a
> > > > select that did not have to do any HOT pruning, or an update with zero
> > > > rows matching the where condition), doesn't it now have to flush and
> > > > wait when it would otherwise not?
> > >
> > > We short circuit that if there's no xid assigned. Check
> > > RecordTransactionCommit().
> > >
> >
> > It looks like that only short-circuits the flush if both there is no xid
> > assigned, and !wrote_xlog. (line 1054 of xact.c)
>
> Hm. Indeed. Why don't we just always use the async commit behaviour for
> that? I don't really see any significant dangers from doing so?
>

I think the argument is that drawing the next value from a sequence can
generate xlog that needs to be flushed, but doesn't assign an xid.

I would think the sequence should flush that record before it hands out the
value, not before the commit, but...

>
> It's also rather odd to use the sync rep mechanisms in such
> scenarios... The if() really should test markXidCommitted instead of
> wrote_xlog.
>
> > I do see stalls on fdatasync on flush from select statements which had no
> > xid, but did generate xlog due to HOT pruning, I don't see why WAL logging
> > hint bits would be different.
>
> Are the stalls at commit or while the select is running? If wal_buffers
> is filled too fast, which can easily happen if loads of pages are hinted
> and wal logged, that will happen independently from
> RecordTransactionCommit().
>

In the real world, I'm not sure what the distribution is.

But in my present test case, they are coming almost exclusively from
RecordTransactionCommit.

I use "pgbench -T10" in a loop to generate dirty data and checkpoints (with
synchronous_commit on but with a BBU), and then to probe the consequences I
use:

pgbench -T10 -S -n --startup='set synchronous_commit='$f

(where --startup is an extension to pgbench proposed a few months ago)

Running the select-only query with synchronous_commit off almost completely
isolates it from the checkpoint drama that otherwise has a massive effect
on it. With synchronous_commit=on, it goes from 6000 tps normally to 30
tps during the checkpoint sync; with synchronous_commit=off, it might dip to
4000 or so during the worst of it.

(To be clear, this is about the pruning, not the logging of the hint bits)

Cheers,

Jeff

In response to

Re: Patch for fail-back without fresh backup at 2014-01-16 17:37:50 from Andres Freund

Responses

Re: Patch for fail-back without fresh backup at 2014-01-16 19:05:33 from Andres Freund
Re: Patch for fail-back without fresh backup at 2014-01-16 19:15:45 from Tom Lane
