Re: Freeze avoidance of very large table.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: jeff(dot)janes(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, amit(dot)kapila16(at)gmail(dot)com, andres(at)anarazel(dot)de, simon(at)2ndquadrant(dot)com, josh(at)agliodbs(dot)com, masao(dot)fujii(at)gmail(dot)com, alvherre(at)2ndquadrant(dot)com, Jim(dot)Nasby(at)bluetreble(dot)com, michael(dot)paquier(at)gmail(dot)com, petr(at)2ndquadrant(dot)com, bruce(at)momjian(dot)us, stark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Freeze avoidance of very large table.
Date: 2015-12-02 04:00:45
Message-ID: 20151202.130045.34052345.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> You're right, it's not necessary.
> Attached latest v29 patch which removes the mention in pg_upgrade documentation.

The changes looks to be correct but I haven't tested.
And I have some additional random comments.

visibilitymap.c:

In visibilitymap_set, the followint lines.

map = PageGetContents(page);
...
if (flags != (map[mapByte] & (flags << mapBit)))

map is (char*), PageGetContents returns (char*) but flags is
uint8. I think that defining map as (uint8*) would be safer.

In visibilitymap_set, the following lines does something
different from what to do. Only right side of the inequality
gets shifted and what should be used in right side is not flags
but VISIBILITYMAP_VALID_BITS.

- if (!(map[mapByte] & (1 << mapBit)))
+ if (flags != (map[mapByte] & (flags << mapBit)))

Something like the following will do the right thing.

+ if (flags != (map[mapByte]>>mapBit & VISIBILITYMAP_VALID_BITS))

analyze.c:

In do_analyze_rel, the successive if (!inh) in the following
steps looks a bit odd. This would be emphasized by the first if
block you added:) These blocks should be enclosed by if (!inh)
{} block.

> /* Calculate the number of all-visible and all-frozen bit */
> if (!inh)
> relallvisible = visibilitymap_count(onerel, &relallfrozen);
> if (!inh)
> vac_update_relstats(onerel,
> if (!inh && !(options & VACOPT_VACUUM))
> {
> for (ind = 0; ind < nindexes; ind++)
...
> }
> if (!inh)
> pgstat_report_analyze(onerel, totalrows, totaldeadrows, relallfrozen);

vacuum.c:

>- * relpages and relallvisible, we try to maintain certain lazily-updated
>- * DDL flags such as relhasindex, by clearing them if no longer correct.
>- * It's safe to do this in VACUUM, which can't run in parallel with
>- * CREATE INDEX/RULE/TRIGGER and can't be part of a transaction block.
>- * However, it's *not* safe to do it in an ANALYZE that's within an

>+ * relpages, relallvisible, we try to maintain certain lazily-updated

Why did you just drop the 'and' after relpages? And this seems
the only change of this file except the additinally missing
letter just below:p

>+ * DDL flags such as relhasindex, by clearing them if no onger correct.
>+ * It's safe to do this in VACUUM, which can't run in
>+ * parallel with CREATE INDEX/RULE/TRIGGER and can't be part of a transaction
>+ * block. However, it's *not* safe to do it in an ANALYZE that's within an

nodeIndexonlyscan.c:

A duplicate letters. And the line exceeds right margin.

> - * Note on Memory Ordering Effects: visibilitymap_test does not lock
-> + * Note on Memory Ordering Effects: visibilitymap_get_stattus does not lock
+ * Note on Memory Ordering Effects: visibilitymap_get_status does not lock

The edited line exceeds right margin by removing a newline.

- if (!visibilitymap_test(scandesc->heapRelation,
- ItemPointerGetBlockNumber(tid),
- &node->ioss_VMBuffer))
+ if (!VM_ALL_VISIBLE(scandesc->heapRelation, ItemPointerGetBlockNumber(tid),
+ &node->ioss_VMBuffer))

costsize.c:

Duplicate words and it is the only change.

> - * pages for which the visibility map shows all tuples are visible.
-> + * pages for which the visibility map map shows all tuples are visible.
+ * pages for which the visibility map shows all tuples are visible.

pgstat.c:

The new parameter frozenpages of pgstat_report_vacuum() is
defined as int32, but its callers give BlockNumber(=uint32). I
recommend to define the frozenpages as BlockNumber.
PgStat_MsgVacuum has a corresponding member defined as int32.

pg_upgrade.c:

BITS_PER_HEAPBLOCK is defined in two c files with the same
definition. This might be better to be merged into some header
file.

heapam_xlog.h, hio.h, execnodes.h:

Have we decided to rename vm to pim? Anyway it is inconsistent
with that of corresponding definition of the function body
remains as 'vm_buffer'. (I'm not confident on that, though.)

>- Buffer vm_buffer, TransactionId cutoff_xid);
>+ Buffer pim_buffer, TransactionId cutoff_xid, uint8 flags);

regards,

At Wed, 2 Dec 2015 00:10:09 +0530, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoC72S2ShoeAmCxWYUyGSNOaTn4fMHJ-ZKNX-MPcsQpaOw(at)mail(dot)gmail(dot)com>
> On Tue, Dec 1, 2015 at 3:04 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > On Mon, Nov 30, 2015 at 9:18 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > After running pg_upgrade, if I manually vacuum a table a start getting warnings:
> >
> > WARNING: page is not marked all-visible (and all-frozen) but
> > visibility map bit(s) is set in relation "foo" page 32756
> > WARNING: page is not marked all-visible (and all-frozen) but
> > visibility map bit(s) is set in relation "foo" page 32756
...
> > The warnings are right where the blocks would start using the 2nd page
> > of the _vm, so I think the problem is there. And looking at the code,
> > I think that "cur += SizeOfPageHeaderData;" in the inner loop cannot
> > be correct. We can't skip a header in the current (old) block each
> > time we reach the end of the new block. The thing we are skipping in
> > the current block is half the time not a header, but the data at the
> > halfway point through the block.
> >
>
> Thank you for reviewing.
>
> You're right, it's not necessary.
> Attached latest v29 patch which removes the mention in pg_upgrade documentation.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-12-02 04:04:35 Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Previous Message Etsuro Fujita 2015-12-02 03:20:46 Re: Foreign join pushdown vs EvalPlanQual