From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
Cc: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIP: BRIN multi-range indexes |
Date: | 2021-01-12 17:42:10 |
Message-ID: | 8a40b7e4-357e-b0c2-e8b7-72001993fc32@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 1/12/21 6:28 PM, John Naylor wrote:
> On Sat, Dec 19, 2020 at 8:15 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com <mailto:tomas(dot)vondra(at)enterprisedb(dot)com>>
> wrote:
> > [12-20 version]
>
> Hi Tomas,
>
> The measurements look good. In case it fell through the cracks, my
> earlier review comments for Bloom BRIN indexes regarding minor details
> don't seem to have been addressed in this version. I'll point to earlier
> discussion for convenience:
>
> https://www.postgresql.org/message-id/CACPNZCt%3Dx-fOL0CUJbjR3BFXKgcd9HMPaRUVY9cwRe58hmd8Xg%40mail.gmail.com
> <https://www.postgresql.org/message-id/CACPNZCt%3Dx-fOL0CUJbjR3BFXKgcd9HMPaRUVY9cwRe58hmd8Xg%40mail.gmail.com>
>
> https://www.postgresql.org/message-id/CACPNZCuqpkCGt8%3DcywAk1kPu0OoV_TjPXeV-J639ABQWyViyug%40mail.gmail.com
> <https://www.postgresql.org/message-id/CACPNZCuqpkCGt8%3DcywAk1kPu0OoV_TjPXeV-J639ABQWyViyug%40mail.gmail.com>
>
Whooops :-( I'll go through those again, thanks for reminding me.
> > The improvements are fairly minor:
> >
> > 1) Rejecting bloom filters that are clearly too large (larger than page)
> > early. This is imperfect, as it works for individual index keys, not the
> > whole row. But per discussion it seems useful.
>
> I think this is good enough.
>
> > So based on this I'm tempted to just use the version with two hashes, as
> > implemented in 0005. It's much simpler than the partitioning scheme,
> > does not need any of the logic to generate primes etc.
>
> Sounds like the best engineering decision.
>
> Circling back to multi-minmax build times, I ran a couple quick tests on
> bigger hardware, and found that not only is multi-minmax slower than
> minmax, which is to be expected, but also slower than btree. (unlogged
> table ~12GB in size, maintenance_work_mem = 1GB, median of three runs)
>
> btree 38.3s
> minmax 26.2s
> multi-minmax 110s
>
> Since btree indexes are much larger, I imagine something algorithmic is
> involved. Is it worth digging further to see if some code path is taking
> more time than we would expect?
>
I suspect it'd due to minmax having to decide which "ranges" to merge,
which requires repeated sorting, etc. I certainly don't dare to claim
the current algorithm is perfect. I wouldn't have expected such big
difference, though - so definitely worth investigating.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2021-01-12 17:53:56 | Re: pg_upgrade test for binary compatibility of core data types |
Previous Message | Tom Lane | 2021-01-12 17:40:42 | pg_dump PublicationRelInfo objects need to be marked with owner |