Re: Microvacuum support for Hash Index

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Microvacuum support for Hash Index
Date: 2017-03-15 15:53:49
Message-ID: CAE9k0PmFe-xHVLL8bkwaUK9Pax7KYmk+QtcKN-HmfdLgHPt51w@mail.gmail.com
Lists: pgsql-hackers

>
> I think one possibility is to get it using
> indexrel->rd_index->indrelid in _hash_doinsert().
>

Thanks. I have tried the same in the v7 patch shared upthread.
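For reference, a rough sketch of that approach (the variable names and the
NoLock assumption are mine, not necessarily the exact v7 code):

    /*
     * Inside _hash_doinsert(): resolve the heap relation backing the index
     * from the index's cached pg_index tuple.  Assumes the caller already
     * holds a suitable lock on the heap, hence NoLock here.
     */
    Oid      heapOid = rel->rd_index->indrelid;
    Relation heapRel = heap_open(heapOid, NoLock);

    /* ... use heapRel while dealing with LP_DEAD hints ... */

    heap_close(heapRel, NoLock);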

>
>>
>> But they're not called delete records in a hash index. The function
>> is called hash_xlog_vacuum_one_page. The corresponding btree function
>> is btree_xlog_delete. So this comment needs a little more updating.
>>
>> + if (IsBufferCleanupOK(bucket_buf))
>> + {
>> + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
>> + (buf == bucket_buf) ? true : false,
>> + hnode);
>> + if (bucket_buf != buf)
>> + LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
>> +
>> + if (PageGetFreeSpace(page) >= itemsz)
>> + break; /* OK, now we have enough space */
>> + }
>>
>> I might be missing something, but I don't quite see why this needs a
>> cleanup lock on the primary bucket page. I would think a cleanup lock
>> on the page we're actually modifying would suffice, and I think if
>> that's correct it would be a better way to go.
>>
>
> Offhand, I also don't see any problem with it.

I too found no problem with that...
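For clarity, the alternative being suggested would look something like this
(a sketch based on the fragment quoted above; unlock handling for the bucket
page is omitted):

    /* Take the cleanup-lock check on the page we actually modify. */
    if (IsBufferCleanupOK(buf))
    {
        _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
                              (buf == bucket_buf), hnode);

        if (PageGetFreeSpace(page) >= itemsz)
            break;          /* OK, now we have enough space */
    }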

>
> Few other comments:
> 1.
> + if (ndeletable > 0)
> + {
> + /* No ereport(ERROR) until changes are logged */
> + START_CRIT_SECTION();
> +
> + PageIndexMultiDelete(page, deletable, ndeletable);
> +
> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
>
> You clearing this flag while logging the action, but same is not taken
> care during replay. Any reasons?

That's because we conditionally WAL-log this flag status, and when we do,
we take a full page image (FPI) of the page.
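To illustrate what I mean (a hedged sketch of the redo side; the block id
and variable names are only illustrative):

    /*
     * When the record carries a full page image of the data page, redo
     * restores the whole page, flag bits included, so the replay routine
     * does not have to clear LH_PAGE_HAS_DEAD_TUPLES by itself.
     */
    if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_RESTORED)
    {
        /* page came back from the FPI; nothing more to do for the flag */
    }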

>
> 2.
> + /* No ereport(ERROR) until changes are logged */
> + START_CRIT_SECTION();
> +
> + PageIndexMultiDelete(page, deletable, ndeletable);
> +
> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
> +
> + /*
> + * Write-lock the meta page so that we can decrement
> + * tuple count.
> + */
> + LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE);
>
> The lock on buffer should be acquired before critical section.

Point taken. I have taken care of it in the v7 patch.
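So the sequence now looks roughly like this (a sketch following the quoted
fragment, not the exact v7 hunk):

    /* Write-lock the meta page before entering the critical section. */
    LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE);

    /* No ereport(ERROR) until changes are logged */
    START_CRIT_SECTION();

    PageIndexMultiDelete(page, deletable, ndeletable);

    pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
    pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;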

>
> 3.
> - /*
> - * Since this can be redone later if needed, mark as a hint.
> - */
> - MarkBufferDirtyHint(buf, true);
> + if (so->numKilled < MaxIndexTuplesPerPage)
> + {
> + so->killedItems[so->numKilled].heapTid = so->hashso_heappos;
> + so->killedItems[so->numKilled].indexOffset =
> + ItemPointerGetOffsetNumber(&(so->hashso_curpos));
> + so->numKilled++;
> + }
>
> By looking at above code, the first thing that comes to mind is when
> numKilled can become greater than MaxIndexTuplesPerPage and why we are
> ignoring the marking of dead tuples when it becomes greater than
> MaxIndexTuplesPerPage. After looking at similar btree code, I realize
> that it could
> happen if user reverses the scan direction. I think you should
> mention in comments that see btgettuple to know the reason of
> numKilled overrun test or something like that.

Added comment. Please refer to v7 patch.
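The added comment reads roughly like this (the wording here is a sketch, not
a quote from v7):

    /*
     * numKilled can reach MaxIndexTuplesPerPage if the scan direction is
     * reversed and items are revisited; in that case we simply stop
     * remembering dead tuples.  See btgettuple() for the analogous check
     * in the btree code.
     */
    if (so->numKilled < MaxIndexTuplesPerPage)
    {
        so->killedItems[so->numKilled].heapTid = so->hashso_heappos;
        so->killedItems[so->numKilled].indexOffset =
            ItemPointerGetOffsetNumber(&(so->hashso_curpos));
        so->numKilled++;
    }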

>
> 4.
> + * We match items by heap TID before assuming they are the right ones to
> + * delete. If an item has moved off the current page due to a split, we'll
> + * fail to find it and do nothing (this is not an error case --- we assume
> + * the item will eventually get marked in a future indexscan).
> + */
> +void
> +_hash_kill_items(IndexScanDesc scan)
>
> I think this comment doesn't make much sense for hash index because
> item won't move off from the current page due to split, only later
> cleanup can remove it.

Yes. The reason is that no cleanup can happen while a scan is in progress.
Corrected it.
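The reworked header comment is along these lines (again only a sketch; the
exact wording is in the v7 patch):

    /*
     * We match items by heap TID before marking them dead.  Unlike btree,
     * an item cannot move off the current page due to a split while the
     * scan is in progress, and no cleanup can remove it either, so the TID
     * check is only a safety measure.
     */
    void
    _hash_kill_items(IndexScanDesc scan)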

>
> 5.
>
> +
> /*
> * Maximum size of a hash index item (it's okay to have only one per page)
>
> Spurious white space change.

Fixed.
