Re: Microvacuum support for Hash Index

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Microvacuum support for Hash Index
Date: 2017-03-14 19:45:29
Message-ID: CA+TgmoaySYOxJaH+WsMai-3grLQLF6VFPrMrO7WERnM4htRMvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 14, 2017 at 8:02 AM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> Attached is the v6 patch for microvacuum in hash index rebased on top
> of 'v10 patch for WAL in hash index - [1]' and 'v1 patch for WAL
> consistency check for hash index - [2]'.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1%2Bk5wR4-kAjPqLoKemuHayQd6RkQQT9gheTfpn%2B72o1UA%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/flat/CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg(at)mail(dot)gmail(dot)com#CAGz5QCJLERUn_zoO0eDv6_Y_d0o4tNTMPeR7ivTLBg4rUrJdwg@mail.gmail.com
>
> Also, the patch (mask_hint_bit_LH_PAGE_HAS_DEAD_TUPLES.patch) to mask
> 'LH_PAGE_HAS_DEAD_TUPLES' flag which got added as a part of
> Microvacuum patch is attached with this mail.

Generally, this patch looks like a pretty straightforward adaptation
of the similar btree mechanism to hash indexes, so if it works for
btree it ought to work here, too. But I noticed a few things while
reading through it.

+ /* Get RelfileNode from relation OID */
+ rel = relation_open(htup->t_tableOid, NoLock);
+ rnode = rel->rd_node;
+ relation_close(rel, NoLock);
itup->t_tid = htup->t_self;
- _hash_doinsert(index, itup);
+ _hash_doinsert(index, itup, rnode);

This is an awfully low-level place to be doing something like this.
I'm not sure exactly where this should be happening, but not in the
per-tuple callback.

+ /*
+ * If there's nothing running on the standby we don't need to derive a
+ * full latestRemovedXid value, so use a fast path out of here. This
+ * returns InvalidTransactionId, and so will conflict with all HS
+ * transactions; but since we just worked out that that's zero people,
+ * it's OK.
+ */
+ if (CountDBBackends(InvalidOid) == 0)
+ return latestRemovedXid;

I see that this comment (and most of what surrounds it) was just
copied from the existing btree example, but isn't there a discrepancy
between the comment and the code? It says it returns
InvalidTransactionId, but it doesn't. Also, you dropped the XXX from
the btree original, and the following reachedConsistency check.

+ * Hash Index delete records can conflict with standby queries.You might
+ * think that vacuum records would conflict as well, but we've handled

But they're not called delete records in a hash index. The function
is called hash_xlog_vacuum_one_page. The corresponding btree function
is btree_xlog_delete. So this comment needs a little more updating.

+ if (IsBufferCleanupOK(bucket_buf))
+ {
+ _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
+ (buf == bucket_buf) ? true : false,
+ hnode);
+ if (bucket_buf != buf)
+ LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
+
+ if (PageGetFreeSpace(page) >= itemsz)
+ break; /* OK, now we have enough space */
+ }

I might be missing something, but I don't quite see why this needs a
cleanup lock on the primary bucket page. I would think a cleanup lock
on the page we're actually modifying would suffice, and I think if
that's correct it would be a better way to go. If that's not correct,
then I think the comments needs some work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-14 19:46:43 Re: logical replication access control patches
Previous Message Mark Kirkwood 2017-03-14 19:43:57 Re: Write Ahead Logging for Hash Indexes