Re: Minmax indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2013-09-27 06:40:44
Message-ID: CAA4eK1+YFF0wskPQhYs2kLW0abJMmFUhOotTcS9VgCkLQ_WgBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 27, 2013 at 11:49 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, Sep 26, 2013 at 1:46 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>> Amit Kapila escribió:
>>> On Sun, Sep 15, 2013 at 5:44 AM, Alvaro Herrera
>>> <alvherre(at)2ndquadrant(dot)com> wrote:
>>
>
>>
>>> On Windows, patch gives below compilation errors:
>>> src\backend\access\minmax\mmtuple.c(96): error C2057: expected
>>> constant expression
>>
>> I have fixed all these compile errors (fix attached). Thanks for
>> reporting them. I'll post a new version shortly.
>
> Thanks for fixing it. In last few days I had spent some time
> reading about minmax or equivalent indexes in other databases (Netezza
> and Oracle) and going through some parts of your proposal. Its a bit
> bigger patch and needs much more time, but I would like to share my
> findings/thoughts I had developed till now.
>
> Firstly about interface and use case, as far as I could understand
> other databases provide this index automatically rather than having a
> separate Create Index command which may be because such an index can
> be mainly useful when the data is ordered or if it's distributed in
> such a way that it's quite useful for repeatedly executing queries.
> You have proposed it as a command which means user needs to take care
> of it which I find is okay for first version, later may be we can also
> have some optimisations so that it can get created automatically.
> For the page range, If I read correctly, currently you have used hash
> define, do you want to expose it to user in some way like GUC or
> maintain it internally and assign the right value based on performance
> of different queries?
>
> Operations on this index seems to be very fast, like Oracle has this
> as an in-memory structure and I read in Netezza that write operations
> doesn't carry any significant overhead for zone maps as compare to
> other indexes, so shouldn't we consider it to be without WAL logged?
> OTOH I think because these structures get automatically created in
> those databases, so it might be okay but if we provide it as a
> command, then user might be bothered if he didn't find it
> automatically on server restart.
>
> Few Questions and observations:
> 1.
> + When a new heap tuple is inserted in a summarized page range, it is
> possible to
> + compare the existing index tuple with the new heap tuple. If the
> heap tuple is
> + outside the minimum/maximum boundaries given by the index tuple for
> any indexed
> + column (or if the new heap tuple contains null values but the index tuple
> + indicate there are no nulls), it is necessary to create a new index tuple with
> + the new values. To do this, a new index tuple is inserted, and the
> reverse range
> + map is updated to point to it. The old index tuple is left in
> place, for later
> + garbage collection.
>
>
> Is there a reason why we can't directly update the value rather then
> new insert in index, as I understand for other indexes like btree
> we do this because we might need to rollback, but here even if after
> updating the min or max value, rollback happens, it will not cause
> any harm (tuple loss).
>
> 2.
> + If the reverse range map points to an invalid TID, the corresponding
> page range
> + is not summarized.
>
> 3.
> It might be better if you can mention when range map will point to an
> invalid TID, it's not explained in your proposal, but you have used it
> in you proposal to explain some other things.
>
> 4.
> Range reverse map is a good terminology, but isn't Range translation
> map better. I don't mind either way, it's just a thought came to my
> mind while understanding concept of Range Reverse map.
>
> 5.
> /*
> * As above, except that instead of scanning the complete heap, only the given
> * range is scanned. Scan to end-of-rel can be signalled by passing
> * InvalidBlockNumber as end block number.
> */
> double
> IndexBuildHeapRangeScan(Relation heapRelation,
> Relation indexRelation,
> IndexInfo *indexInfo,
> bool allow_sync,
> BlockNumber start_blockno,
> BlockNumber numblocks,
> IndexBuildCallback callback,
> void *callback_state)
>
> In comments you have used end block number, which parameter does it
> refer to? I could see only start_blockno and numb locks?
>
> 6.
> currently you are passing 0 as start block and InvalidBlockNumber as
> number of blocks, what's the logic for it?
> return IndexBuildHeapRangeScan(heapRelation, indexRelation,
> indexInfo, allow_sync,
> 0, InvalidBlockNumber,
> callback, callback_state);

I got it, I think here it means scan all the pages.

> 7.
> In mmbuildCallback, it only add's tuple to minmax index, if it
> satisfies page range, else this can lead to waste of big scan incase
> page range is large (1280 pages as you mentiones in one of your
> mails). Why can't we include it end of scan?
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2013-09-27 06:42:21 Re: FW: REVIEW: Allow formatting in log_line_prefix
Previous Message Amit Kapila 2013-09-27 06:19:06 Re: Minmax indexes