Re: Freeze avoidance of very large table.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Freeze avoidance of very large table.
Date: 2015-07-07 11:49:15
Message-ID: CAA4eK1K219GWYHEEMKn0kYK0wmh_9jsDgrmhaQGwCffcts9Scg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 2, 2015 at 9:00 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
wrote:
>
>
> Thank you for bug report, and comments.
>
> Fixed version is attached, and source code comment is also updated.
> Please review it.
>

I am looking into this patch and would like to share my findings with
you:

1.
@@ -2131,8 +2133,9 @@ heap_insert(Relation relation, HeapTuple tup,
CommandId cid,

CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);

/*
- * Find buffer to insert this
tuple into. If the page is all visible,
- * this will also pin the requisite visibility map page.
+
* Find buffer to insert this tuple into. If the page is all visible
+ * of all frozen, this will also pin
the requisite visibility map and
+ * frozen map page.
*/

typo in comments.

/of all frozen/or all frozen

2.
visibilitymap.c
+ * The visibility map is a bitmap with two bits (all-visible and all-frozen
+ * per heap page.

/and all-frozen/and all-frozen)
closing round bracket is missing.

3.
visibilitymap.c
-/*#define TRACE_VISIBILITYMAP */
+#define TRACE_VISIBILITYMAP

why is this hash define opened?

4.
-visibilitymap_count(Relation rel)
+visibilitymap_count(Relation rel, bool for_visible)

This API needs to count set bits for either visibility info, frozen info
or both (if required), it seems better to have second parameter as
uint8 flags rather than bool. Also, if it is required to be called at most
places for both visibility and frozen bits count, why not get them
in one call?

5.
Clearing visibility and frozen bit separately for the dml
operations would lead locking/unlocking the corresponding buffer
twice, can we do it as a one operation. I think this is suggested
by Simon as well.

6.
- * Before locking the buffer, pin the visibility map page if it appears to
- * be necessary.
Since we haven't got the lock yet, someone else might be
+ * Before locking the buffer, pin the
visibility map if it appears to be
+ * necessary. Since we haven't got the lock yet, someone else might
be

Why you have deleted 'page' in above comment?

7.
@@ -3490,21 +3532,23 @@ l2:
UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);

if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
+
bms_free
(hot_attrs);

Seems unnecessary change.

8.
@@ -1919,11 +1919,18 @@ index_update_stats(Relation rel,
{
BlockNumber relpages =
RelationGetNumberOfBlocks(rel);
BlockNumber relallvisible;
+ BlockNumber
relallfrozen;

if (rd_rel->relkind != RELKIND_INDEX)
- relallvisible =
visibilitymap_count(rel);
+ {
+ relallvisible = visibilitymap_count(rel,
true);
+ relallfrozen = visibilitymap_count(rel, false);
+ }
else
/* don't bother for indexes */
+ {
relallvisible = 0;
+
relallfrozen = 0;
+ }

I think in this function, you have forgotten to update the
relallfrozen value in pg_class.

9.
vacuumlazy.c

@@ -253,14 +258,16 @@ lazy_vacuum_rel(Relation onerel, int options,
VacuumParams *params,
* NB: We
need to check this before truncating the relation, because that
* will change ->rel_pages.
*/
-
if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
+ if ((vacrelstats->scanned_pages +
vacrelstats->vmskipped_pages)
+ < vacrelstats->rel_pages)
{
- Assert(!scan_all);

Why you have removed this Assert, won't the count of
vacrelstats->scanned_pages + vacrelstats->vmskipped_pages be
equal to vacrelstats->rel_pages when scall_all = true.

10.
vacuumlazy.c
lazy_vacuum_rel()
..
+ scanned_all |= scan_all;
+

Why this new assignment is added, please add a comment to
explain it.

11.
lazy_scan_heap()
..
+ * Also, skipping even a single page accorind to all-visible bit of
+ * visibility map means that we can't update relfrozenxid, so we only want
+ * to do it if we can skip a goodly number. On the other hand, we count
+ * both how many pages we skipped according to all-frozen bit of visibility
+ * map and how many pages we freeze page, so we can update relfrozenxid if
+ * the sum of their is as many as tuples per page.

a.
typo
/accorind/according
b.
is the second part of comment (starting from On the other hand)
right? I mean you are comparing sum of pages skipped due to
all_frozen bit and number of pages freezed with tuples per page.
I don't understand how are they related?

12.
@@ -918,8 +954,13 @@ lazy_scan_heap(Relation onerel, LVRelStats
*vacrelstats,
else
{
num_tuples += 1;
+ ntup_in_blk += 1;
hastup = true;

+ /* If current tuple is already frozen, count it up */
+ if (HeapTupleHeaderXminFrozen(tuple.t_data))
+ already_nfrozen += 1;
+
/*
* Each non-removable tuple must be checked to see if it needs
* freezing. Note we already have exclusive buffer lock.

Here, if tuple is already_frozen, can't we just continue and
check for next tuple?

13.
+extern XLogRecPtr log_heap_frozenmap(RelFileNode rnode, Buffer heap_buffer,
+ Buffer fm_buffer);

It seems like this function is not used.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-07-07 11:59:46 Re: replication slot restart_lsn initialization
Previous Message Andres Freund 2015-07-07 11:41:00 Re: Comfortably check BackendPID with psql