From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <rhaas(at)postgresql(dot)org> |
Subject: | Re: HyperLogLog.c and pg_leftmost_one_pos32() |
Date: | 2020-07-30 16:21:23 |
Message-ID: | 1a112df8b6d82c2701c57f9457f872ecc20128b3.camel@j-davis.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 2020-07-29 at 17:32 -0700, Peter Geoghegan wrote:
> How did you test this? What kind of difference are we talking about?
Essentially:
initHyperLogLog(&hll, 5)
for i in 0 .. one billion
addHyperLogLog(&hll, hash(i))
estimateHyperLogLog
The numbers are the same regardless of bwidth.
Before my patch, it takes about 15.6s. After my patch, it takes about
6.6s, so it's more than a 2X speedup (including the hash calculation).
As a part of HashAgg, when I test the worst case, it goes from a 4-5%
penalty to ~1% (within noise).
> I think that you should change back the rhs() variable names to match
> HyperLogLog upstream (as well as the existing rhs() comments).
Done.
> > I think it's overall an improvement, but
> > addHyperLogLog() itself seemed to show up as a cost, so it can hurt
> > spilled-but-still-in-memory cases. I'd also like to backpatch this
> > to
> > 13 (as I already did for 9878b643), if that's acceptable.
>
> I still wonder if it was ever necessary to add HLL to abbreviated
> keys. It only served to avoid some pretty narrow worse cases, at the
> expense of typical cases. Given that the existing users of HLL are
> pretty narrow, and given the importance of preserving the favorable
> performance characteristics of hash aggregate, I'm inclined to agree
> that it's worth backpatching to 13 now. Assuming it is a really
> measurable cost in practice.
Yes, the difference (at least in a tight loop, on my machine) is not
subtle. I went ahead and committed and backpatched.
Regards,
Jeff Davis
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2020-07-30 17:16:19 | Re: HyperLogLog.c and pg_leftmost_one_pos32() |
Previous Message | Stephen Frost | 2020-07-30 14:37:56 | Re: Missing CFI in hlCover()? |