Re: hash index on unlogged tables doesn't behave as expected

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, ashu(dot)coek88(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: hash index on unlogged tables doesn't behave as expected
Date: 2017-07-10 05:09:24
Message-ID: 20170710.140924.107962324.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Sat, 8 Jul 2017 16:41:27 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in <CAA4eK1+-DUto+MyeNdLE9P9u8G3Fv6n+SOjPSqmPSw6ashhPjw(at)mail(dot)gmail(dot)com>
> On Sat, Jul 8, 2017 at 9:06 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Sat, Jul 8, 2017 at 8:52 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> On Thu, Jul 6, 2017 at 5:54 AM, Kyotaro HORIGUCHI
> >> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>> Check for INIT_FORKNUM appears both accompanied and not
> >>> accompanied by check for RELPER.._UNLOGGED, so I'm not sure which
> >>> is the convention here.
> >>
> >> Checking only for INIT_FORKNUM seems logical to me. Also checking for
> >> RELPERSISTENCE_UNLOGGED just makes the code longer to no benefit. I
> >> guess Amit copied the test from ATExecSetTablespace, which does it as
> >> he did, but it seems unnecessarily long-winded.
> >>
> >
> > Okay. If you and Michael feel the check that way is better, I will
> > change and submit the patch.
> >
>
> Changed as per suggestion.

> On Sat, Jul 8, 2017 at 9:08 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Fri, Jul 7, 2017 at 11:36 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >> I think we should do that as a separate patch (I can write the same as
> >> well) because that is not new behavior introduced by this patch, ...
> >
> > True, although maybe hash indexes are the only way that happens today?
> >
>
> No, I think it will happen for all other indexes as well. Basically,
> we log new page for init forks of other indexes and then while
> restoring that full page image, it will fall through the same path.

(Sorry, I didn't noticed that hash metapage is not using log_newpgae)

For example, (bt|gin|gist|spgist|brin)buildempty are using
log_newpage to log INIT_FORK metapages. I believe that the xlog
flow from log_newpage to XLogReadBufferForRedoExtended is
developed so that pages in init forks are surely flushed during
recovery, so we should fix it instead adding other flushes if the
path doesn't work. Am I looking wrong place? (I think it is
working.)

If I'm understanding correctly here, hashbild and hashbuildempty
should be refactored using the same structure with other *build
and *buildempty functions. Specifically metapages initialization
subroutines (_hash_init or _bt_initmetapage or SpGistInitMetapage
or...) does only on-memory initialization and does not emit WAL,
then *build and *buildempty emits WAL in their required way.

> >>> By the way the comment of the function ReadBufferWithoutRelcache
> >>> has the following sentense.
> >>>
> >>> | * NB: At present, this function may only be used on permanent relations, which
> >>> | * is OK, because we only use it during XLOG replay. If in the future we
> >>> | * want to use it on temporary or unlogged relations, we could pass additional
> >>> | * parameters.
> >>>
> >>> and does
> >>>
> >>> | return ReadBuffer_common(smgr, RELPERSISTENCE_PERMANENT, forkNum, blockNum,
> >>> mode, strategy, &hit);
> >>>
> >>> This surely works since BufferAlloc recognizes INIT_FORK. But
> >>> some adjustment may be needed around here.
> >>
> >> Yeah, it should probably mention that the init fork of an unlogged
> >> relation is also OK.
> >>
> >
> > I think we should do that as a separate patch (I can write the same as
> > well) because that is not new behavior introduced by this patch, but
> > let me know if you think that we should add such a comment in this
> > patch itself.
> >
>
> Attached a separate patch to adjust the comment atop ReadBufferWithoutRelcache.

+ * NB: At present, this function may only be used on permanent relations and
+ * init fork of an unlogged relation, which is OK, because we only use it
+ * during XLOG replay. If in the future we want to use it on temporary or
+ * unlogged relations, we could pass additional parameters.

*I* think the target of the funcion is permanent relations and
init forks, not unlogged relations. And I'd like to have an
additional comment about RELPERSISTENCE_PERMANENT. Something like
the attached.

reagards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
fix_comment_atop_ReadBufferWithoutRelcache_v2.patch text/x-patch 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2017-07-10 05:15:41 Re: Oddity in error handling of constraint violation in ExecConstraints for partitioned tables
Previous Message Mithun Cy 2017-07-10 04:43:56 Re: POC: Cache data in GetSnapshotData()