Re: On-disk bitmap index patch

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dann Corbit <DCorbit(at)connx(dot)com>, Luke Lonergan <llonergan(at)greenplum(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Jie Zhang <jzhang(at)greenplum(dot)com>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, Josh Berkus <josh(at)agliodbs(dot)com>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: On-disk bitmap index patch
Date: 2006-07-28 22:12:20
Message-ID: 44CA8BC4.2080204@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
>
> Both of these pages say up front that they are considering read-only
> data. So one of the questions that has to be answered (and the
> submitters have been entirely mum about) is exactly how bad is the
> update performance? If it's really awful that's going to constrain
> the use cases quite a lot, whereas merely "a bit slower than btree"
> wouldn't be such a problem.
>
> In any case, arguing that other DBs find it's a win will cut no ice
> with me. See adjacent discussion about hash indexes --- those *ought*
> to be a win, but they aren't in Postgres, for reasons that are still
> guesses. The translation gap between other DBs' experience and ours
> can be large.
>

Notwithstanding that, I have a couple of non-postgres data points /
anecdotes on this.

Back in my days as an Ingres DBA in the mid 90s, our fairly highly tuned
system used hash organised tables only for small fairly static
lookup-type tables (state codes, postcodes, etc). Everything that was
more dynamic was done with btree indexed tables.

A few years later, I was producing very large tables of email addresses
using BDB. I quickly stopped using hash tables when I found that the
reorganisation penalty was huge. Switching to btree worked just fine,
with no sudden performance blip. This might not be directly relevant,
but clearly the bucket size is.

I guess what we need to demonstrate is that the better hash performance
will actually persist to a scale where it is actually worth it - surely
for very small tables the index method won't matter much anyway.

cheers

andrew

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2006-07-28 22:43:03 Re: The vacuum-ignore-vacuum patch
Previous Message Peter Eisentraut 2006-07-28 22:09:27 DTrace enabled build fails