From: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Claudio Freire <klaussfreire(at)gmail(dot)com> |
Subject: | Re: Sparse bit set data structure |
Date: | 2019-03-14 05:15:58 |
Message-ID: | 9C5D1ADA-16AC-4988-A058-8ECAE19F79C8@yandex-team.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
> 14 марта 2019 г., в 0:18, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> написал(а):
> <0001-Add-SparseBitset-to-hold-a-large-set-of-64-bit-ints-.patch><0002-Andrey-Borodin-s-test_blockset-tool-adapted-for-Spar.patch>
That is very interesting idea. Basically, B-tree and radix tree is a tradeoff between space and time.
In general, lookup into radix tree will touch less CPU cache lines.
In this terms Bitmapset is on most performant and memory-wasteful side: lookup into Bitmapset is always 1 cache line.
Performance of radix tree can be good in case of skewed distribution, while B-tree will be OK on uniform. I think that distribution of GiST inner pages is uniform, distribution of empty leafs is... I have no idea, let's consider uniform too.
I'd review this data structure ASAP. I just need to understand: do we aim to v12 or v13? (I did not solve concurrency issues in GiST VACUUM yet, but will fix them at weekend)
Also, maybe we should consider using RoaringBitmaps? [0]
As a side not I would add that while balanced trees are widely used for operations on external memory, there are more performant versions for main memory. Like AVL-tree and RB-tree.
Brest regards, Andrey Borodin.
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro HORIGUCHI | 2019-03-14 05:16:30 | Re: Timeout parameters |
Previous Message | Michael Paquier | 2019-03-14 04:54:10 | Re: current_logfiles not following group access and instead follows log_file_mode permissions |