Re: hash index on unlogged tables doesn't behave as expected

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash index on unlogged tables doesn't behave as expected
Date: 2017-07-04 07:53:51
Message-ID: CAB7nPqR-2GMDeEEVa7Sh5xNsOZ_SjnsWkxYLfwqBVhqnLO6v=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 3, 2017 at 7:40 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> There is already such a test, see create_index.sql.
> CREATE UNLOGGED TABLE unlogged_hash_table (id int4);
> CREATE INDEX unlogged_hash_index ON unlogged_hash_table USING hash (id
> int4_ops);
>
> Do you have something else in mind?
>
> I think the problem mentioned in this thread can be caught only if we
> promote the standby and restart it. We might be able to write it
> using TAP framework, but I think such a test should exist for other
> index types as well or in general for unlogged tables. I am not
> opposed to writing such a test if you and or others feel so.

There have been many requests for something like that. This overlaps a
lot with the existing make check though, so a buildfarm module using
wal_consistency_checking may be a better idea than something in core.

>> Apart from that i have tested the patch and the patch seems to be working
>> fine.
>
> Thanks.

It seems to me that this qualifies as an open item for 10. WAL-logging
is new for hash tables. Amit, could you add one?

+ use_wal = RelationNeedsWAL(rel) ||
+ (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
+ forkNum == INIT_FORKNUM);
In access AMs, empty() routines are always only called for unlogged
relations, so you could relax that to check for INIT_FORKNUM only.

Looking at the patch, I think that you are right to do the logging
directly in _hash_init() and not separately in hashbuildempty().
Compared to other relations the init data is more dynamic.

+ /*
+ * Force the on-disk state of init forks to always be in sync with the
+ * state in shared buffers. See XLogReadBufferForRedoExtended.
+ */
+ XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+ if (forknum == INIT_FORKNUM)
+ FlushOneBuffer(metabuf);
I find those forced syncs a bit surprising. The buffer is already
marked as dirty, so even if there is a concurrent checkpoint they
would be flushed. If recovery comes again here, then they would just
be replayed. I am unclear why XLogReadBufferForRedoExtended is not
enough to replay a FPW of that.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2017-07-04 08:25:17 Re: UPDATE of partition key
Previous Message Dean Rasheed 2017-07-04 07:49:05 Re: Multi column range partition table