Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients
Date: 2017-03-24 19:02:30
Message-ID: 20170324190230.bbcxbbdf6267dqhh@alvherre.pgsql
Lists: pgsql-committers pgsql-hackers

Robert Haas wrote:
> On Fri, Mar 24, 2017 at 1:16 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > Implement multivariate n-distinct coefficients
>
> dromedary and arapaima have failures like this, which seems likely
> related to this commit:
>
>   EXPLAIN
>    SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
>                                QUERY PLAN
>   ---------------------------------------------------------------------
> !  HashAggregate  (cost=225.00..235.00 rows=1000 width=16)
>      Group Key: a, d
> !    ->  Seq Scan on ndistinct  (cost=0.00..150.00 rows=10000 width=8)
>   (3 rows)

Yes. What seems to be going on here is that both arapaima and
dromedary are 32-bit machines; all the 64-bit ones are passing (except
for prion, which showed a real relcache bug that I already stomped).
The difference is that the total cost of the seqscan on those machines
is 150 instead of 155. Tomas suggests that this happens because
MAXALIGN is different, which leads to tuples being packed differently:
the expected cost (generated on our 64-bit laptops) is 155, and the cost
we get on 32-bit architectures is 150, so 5 pages of difference. We
insert 10000 rows into the table; 4 bytes per tuple would amount to
40 kB, which comes to 5 pages of 8 kB.
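
The arithmetic can be checked with a rough sketch. It assumes the default
planner settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01) and the
default 8 kB block size, none of which are stated in this message; the 4
bytes per tuple is the figure quoted above and rows=10000 comes from the
plan. The helper names are made up for illustration, and the real page
count also depends on page headers and line pointers, which are ignored
here.

```python
import math

# Assumed defaults, not taken from the message itself.
BLCKSZ = 8192          # default PostgreSQL block size, in bytes
SEQ_PAGE_COST = 1.0    # default seq_page_cost
CPU_TUPLE_COST = 0.01  # default cpu_tuple_cost


def seqscan_total_cost(pages, rows):
    """Total cost of a plain sequential scan with no quals."""
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST


def pages_from_cost(total_cost, rows):
    """Invert the formula above to recover the page count."""
    return round((total_cost - rows * CPU_TUPLE_COST) / SEQ_PAGE_COST)


rows = 10000                              # rows=10000 in the quoted plan
pages_32 = pages_from_cost(150.0, rows)   # cost seen on 32-bit machines
pages_64 = pages_from_cost(155.0, rows)   # cost in the expected file

# 4 extra bytes per tuple, as estimated in the message.
extra_bytes = rows * 4
extra_pages = math.ceil(extra_bytes / BLCKSZ)

print(pages_32, pages_64, extra_bytes, extra_pages)
```

Under these assumptions the two costs differ only in the page count, and
the page difference agrees with the estimate from the extra bytes per
tuple.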

I'll push an alternate expected file for this test, which we think is
the simplest fix.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
