Re: Error with index on unlogged table

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: andres(at)2ndquadrant(dot)com
Cc: thom(at)linux(dot)com, michael(dot)paquier(at)gmail(dot)com, fabriziomello(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Error with index on unlogged table
Date: 2015-03-27 04:54:13
Message-ID: 20150327.135413.172833040.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Thu, 26 Mar 2015 18:50:24 +0100, Andres Freund <andres(at)2ndquadrant(dot)com> wrote in <20150326175024(dot)GJ451(at)alap3(dot)anarazel(dot)de>
> I think the problem here is that the *primary* makes no such
> assumptions. Init forks are logged via stuff like
> smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
..
> i.e. the data is written out directly to disk, circumventing
> shared_buffers. It's pretty bad that we don't do the same on the
> standby. For master I think we should just add a bit to the XLOG_FPI
> record saying the data should be forced out to disk. I'm less sure
> what's to be done in the back branches. Flushing every HEAP_NEWPAGE
> record isn't really an option.

The problem exists only for INIT_FORKNUM. So I suppose it is
enough to check forknum to decide whether to sync immediately.

Specifically for this instance, syncing the buffers of INIT_FORKNUM
at the end of the XLOG_FPI block in xlog_redo fixed the problem.
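
To illustrate the forknum check, here is a minimal standalone
sketch. It is a toy model, not the actual patch and not PostgreSQL
code: the ToyForkNumber enum only mirrors the ordering of the real
fork numbers, and toy_redo_needs_immediate_flush() is a
hypothetical helper name made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copy of the fork numbers; the real definitions live in PostgreSQL. */
typedef enum ToyForkNumber
{
    TOY_MAIN_FORKNUM = 0,
    TOY_FSM_FORKNUM,
    TOY_VISIBILITYMAP_FORKNUM,
    TOY_INIT_FORKNUM
} ToyForkNumber;

/*
 * Hypothetical helper: decide whether a page restored during redo
 * should be forced out to disk immediately.  Only init forks qualify,
 * so ordinary full-page images keep going through shared buffers.
 */
static bool
toy_redo_needs_immediate_flush(ToyForkNumber forknum)
{
    assert(forknum <= TOY_INIT_FORKNUM);
    return forknum == TOY_INIT_FORKNUM;
}
```

The point of keying on the fork number alone is that no new bit
in the WAL record is needed, which may matter for the back
branches.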

Another (ugly!) solution would be to sync only the buffers that
belong to INIT_FORKNUM and are BM_DIRTY in
ResetUnloggedRelations(op = UNLOGGED_RELATION_INIT). This is a
catch-all-at-once solution, though it partly undoes the benefit
of fast promotion. But the buffers to be synced here should be
pretty few.
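
The catch-all variant could be pictured roughly as below. Again
this is a standalone toy under stated assumptions, not PostgreSQL
code: ToyBuffer, TOY_BM_DIRTY, TOY_INIT_FORKNUM and
toy_flush_dirty_init_buffers() are invented for illustration, and
clearing the dirty flag merely stands in for the real write and
fsync.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_BM_DIRTY     0x01u  /* toy stand-in for the dirty flag */
#define TOY_INIT_FORKNUM 3      /* toy stand-in for the init fork number */

/* Toy buffer descriptor: just enough state to show the filter. */
typedef struct ToyBuffer
{
    int      forknum;
    unsigned flags;
    bool     flushed;
} ToyBuffer;

/*
 * Walk all buffers once and "flush" only those that belong to an init
 * fork and are dirty.  Returns the number of buffers flushed.
 */
static size_t
toy_flush_dirty_init_buffers(ToyBuffer *bufs, size_t nbufs)
{
    size_t nflushed = 0;

    for (size_t i = 0; i < nbufs; i++)
    {
        if (bufs[i].forknum != TOY_INIT_FORKNUM)
            continue;           /* not an init fork: leave it alone */
        if ((bufs[i].flags & TOY_BM_DIRTY) == 0)
            continue;           /* already clean: nothing to do */

        bufs[i].flags &= ~TOY_BM_DIRTY; /* stands in for write + fsync */
        bufs[i].flushed = true;
        nflushed++;
    }
    return nflushed;
}
```

The cost is one pass over the buffer pool at reset time, while the
number of buffers actually written stays small.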

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
