Re: Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-12-20 09:51:32
Message-ID: CAA4eK1+Qp8bwjmxUuGORjNtNUZAPCEG08OBzbdGKt-tGVxYNhg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 19, 2016 at 11:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Dec 18, 2016 at 8:54 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> I committed remove-hash-wrtbuf and fix_dirty_marking_v1 but I've got
>>> some reservations about fix_lock_chaining_v1. ISTM that the natural
>>> fix here would be to change the API contract for _hash_freeovflpage so
>>> that it doesn't release the lock on the write buffer. Why does it
>>> even do that? I think that the only reason why _hash_freeovflpage
>>> should be getting wbuf as an argument is so that it can handle the
>>> case where wbuf happens to be the previous block correctly.
>>
>> Yeah, as of now that is the only case, but for WAL patch, I think we
>> need to ensure that the action of moving all the tuples to the page
>> being written and the overflow page being freed needs to be logged
>> together as an atomic operation.
>
> Not really. We can have one operation that empties the overflow page
> and another that unlinks it and makes it free.
>

We have mainly four actions in the squeeze operation: add tuples to
the write page, empty the overflow page, unlink the overflow page, and
make it free by setting the corresponding bit in the overflow page.
Now, if we don't log the changes to the write page and the freeing of
the overflow page as one operation, then won't a query on the standby
either see duplicate tuples or miss the tuples which are freed in the
overflow page?
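
To make the concern concrete, here is a toy model of replay on a
standby. It is only a sketch, not PostgreSQL code: pages are plain
lists, every name in it is hypothetical, and it assumes a scan can
observe the bucket between two replayed records (it ignores whatever
locking replay might take). It also collapses "empty" and "free" of
the overflow page into a single step.

```python
# Toy model only: pages are lists, names are hypothetical, locking is ignored.

def visible_tuples(write_page, ovfl_page):
    """What a bucket scan on the standby would return at this instant."""
    return write_page + ovfl_page


def replay_states(records, write_page, ovfl_page):
    """Replay WAL-like records in order.

    Each record is a list of (action, payload) pairs applied atomically.
    Returns the scan result observable after each record.
    """
    write_page, ovfl_page = list(write_page), list(ovfl_page)
    states = []
    for record in records:
        for action, payload in record:
            if action == "add_to_write_page":
                write_page.extend(payload)
            elif action == "empty_ovfl_page":
                ovfl_page.clear()
        states.append(visible_tuples(write_page, ovfl_page))
    return states


WRITE_PAGE = ["a", "b"]
OVFL_PAGE = ["c", "d"]
MOVED = ["c", "d"]

# Two separate records, write page changed first.
add_first = replay_states(
    [[("add_to_write_page", MOVED)], [("empty_ovfl_page", None)]],
    WRITE_PAGE, OVFL_PAGE)

# Two separate records, overflow page emptied first.
empty_first = replay_states(
    [[("empty_ovfl_page", None)], [("add_to_write_page", MOVED)]],
    WRITE_PAGE, OVFL_PAGE)

# One record covering both changes.
combined = replay_states(
    [[("add_to_write_page", MOVED), ("empty_ovfl_page", None)]],
    WRITE_PAGE, OVFL_PAGE)

print("add first:  ", add_first)
print("empty first:", empty_first)
print("combined:   ", combined)
```

With two records, the state after the first one either contains "c"
and "d" twice or not at all, depending on the order; with one record
there is no such intermediate state in this model.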

>> Now apart from that, it is
>> theoretically possible that write page will remain locked for multiple
>> overflow pages being freed (when the page being written has enough
>> space that it can accommodate tuples from multiple overflow pages). I
>> am not sure if it is worth worrying about such a case because
>> practically it might happen rarely. So, I have prepared a patch to
>> retain a lock on wbuf in _hash_freeovflpage() as suggested by you.
>
> Committed.
>

Thanks.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
