Re: A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A design for amcheck heapam verification
Date: 2017-05-01 20:39:45
Message-ID: CAH2-WzmiKCsFL9RsYn98WGvQN9NN+BHTD-jA=aRRL7QQt_tTcg@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 1, 2017 at 12:46 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Bloom filters are one of those things that come up on this mailing
> list incredibly frequently but rarely get used in committed code; thus
> far, contrib/bloom is the only example we've got, and not for lack of
> other proposals.

They certainly are a fashionable data structure, but it's not as if
they're a new idea. The math behind them is very well understood. They
solve one narrow class of problem very well.

> One problem is that Bloom filters assume you can get
> n independent hash functions for a given value, which we have not got.
> That problem would need to be solved somehow. If you only have one
> hash function, the size of the required bloom filter probably gets
> very large.

I don't think that that's a problem, because you really only need 2
hash functions [1], which we have already (recall that Andres added
Murmur hash to Postgres 10). It's an area that I'd certainly need to
do more research on if I'm to go forward with Bloom filters, but I'm
pretty confident that there is a robust solution to the practical
problem of not having arbitrarily many hash functions at hand. (I
think that you rarely need all that many independent hash functions,
in any case.)

It isn't that hard to evaluate whether or not an implementation has
things right, at least for a variety of typical cases. We know how to
objectively evaluate a hash function while making only some pretty
mild assumptions.

> When hashing index and heap tuples, do you propose to include the heap
> TID in the data getting hashed? I think that would be a good idea,
> because otherwise you're only verifying that every heap tuple has an
> index pointer pointing at something, not that every heap tuple has an
> index tuple pointing at the right thing.

Yes -- I definitely want to hash the heap TID from each IndexTuple.

> I wonder if it's also worth having a zero-error mode, even if it runs
> for a long time. Scan the heap, and probe the index for the value
> computed from each heap tuple. Maybe that's so awful that nobody
> would ever use it, but I'm not sure. It might actually be simpler to
> implement than what you have in mind.

It's easy if you don't mind that the implementation will be an ad-hoc
nested loop join. I guess I could do that too, if only because it
won't be that hard, and that's really what you want when you know you
have corruption. Performance will probably be prohibitively poor when
verification needs to be run in any kind of routine way, which is a
problem if that's the only way it can work. My sense is that
verification needs to be reasonably low overhead, and it needs to
perform pretty consistently, even if you only use it for
stress-testing new features.

To reiterate, another thing that makes a Bloom filter attractive is
how it simplifies resource management relative to an approach
involving sorting or a hash table. There are a bunch of
resource-management edge cases that I don't have to worry about
(e.g., a subset of very wide outlier IndexTuples, or two indexes of
very different sizes on the same table that each need to receive an
even share of memory).

As I said, even if I were totally willing to duplicate the effort
that went into respecting work_mem as a budget within places like
tuplesort.c, having as little infrastructure code as possible is a
specific goal for amcheck.

[1] https://www.eecs.harvard.edu/~michaelm/postscripts/rsa2008.pdf
--
Peter Geoghegan

