From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] A design for amcheck heapam verification |
Date: | 2018-03-31 01:20:31 |
Message-ID: | 20180331012031.ftf4vbiwabsg7uvb@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2018-03-29 19:42:42 -0700, Peter Geoghegan wrote:
> >> + /*
> >> + * Aim for two bytes per element; this is sufficient to get a false
> >> + * positive rate below 1%, independent of the size of the bitset or total
> >> + * number of elements. Also, if rounding down the size of the bitset to
> >> + * the next lowest power of two turns out to be a significant drop, the
> >> + * false positive rate still won't exceed 2% in almost all cases.
> >> + */
> >> + bitset_bytes = Min(bloom_work_mem * 1024L, total_elems * 2);
> >> + /* Minimum allowable size is 1MB */
> >> + bitset_bytes = Max(1024L * 1024L, bitset_bytes);
> >
> > Some upcasting might be advisable, to avoid dangers of overflows?
>
> When it comes to sizing work_mem, using long literals to go from KB to
> bytes is How It's Done™. I actually think that's silly myself, because
> it's based on the assumption that long is wider than int, even though
> it isn't on Windows. But that's okay because we have the old work_mem
> size limits on Windows.
> What would the upcasting you have in mind look like?
Just use UINT64CONST()? Let's try not to introduce further code that'll
need to get painfully fixed.
>
> >> + /* Use 64-bit hashing to get two independent 32-bit hashes */
> >> + hash = DatumGetUInt64(hash_any_extended(elem, len, filter->seed));
> >
> > Hm. Is that smart given how some hash functions are defined? E.g. for
> > int8 the higher bits aren't really that independent for small values:
>
> Robert suggested that I do this. I don't think that we need to make it
> about the quality of the hash function that we have available. That
> really seems like a distinct question to me. It seems clear that this
> ought to be fine (or should be fine, if you prefer). I understand why
> you're asking about this, but it's not scalable to ask every user of a
> hash function to care that it might be a bit crap. Hash functions
> aren't supposed to be a bit crap.
Well, they're not supposed to be, but they are. Practical realities
matter... But I think it's moot because we don't use any of the bad
ones, I only got to that later...
> >> +DROP FUNCTION bt_index_check(regclass);
> >> +CREATE FUNCTION bt_index_check(index regclass,
> >> + heapallindexed boolean DEFAULT false)
> >> +RETURNS VOID
> >> +AS 'MODULE_PATHNAME', 'bt_index_check'
> >> +LANGUAGE C STRICT PARALLEL RESTRICTED;
> >
> > This breaks functions et al referencing the existing function.
>
> This sounds like a general argument against ever changing a function's
> signature. It's not like we haven't done that a number of times in
> extensions like pageinspect. Does it really matter?
Yes, it does. And imo we shouldn't.
> > Also, I
> > don't quite recall the rules, don't we have to drop the function from
> > the extension first?
>
> But...I did drop the function?
I mean in the ALTER EXTENSION ... DROP FUNCTION sense.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2018-03-31 01:24:11 | Re: [HACKERS] Planning counters in pg_stat_statements |
Previous Message | Peter Geoghegan | 2018-03-31 01:11:50 | Re: [HACKERS] A design for amcheck heapam verification |