From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Bitmap scan is undercosted? - boolean correlation |
Date: | 2017-12-03 04:04:30 |
Message-ID: | 20171203040430.GC14008@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On Sat, Dec 02, 2017 at 05:27:51PM -0800, Jeff Janes wrote:
> I think the non-extended stats code also has trouble with booleans.
> pg_stats gives me a correlation of 0.8 or higher for the flag column.
It's not due to the boolean though; you see the same thing if you do:
CREATE INDEX aaa_f ON aaa((flag::text));
ANALYZE aaa;
correlation | 0.81193
or:
ALTER TABLE aaa ADD flag2 int; UPDATE aaa SET flag2= flag::int
correlation | 0.81193
I think it's caused by having so few (2) values to correlate.
most_common_vals | {f,t}
most_common_freqs | {0.9014,0.0986}
correlation | 0.822792
It thinks there's somewhat-high correlation since it gets a list of x and y
values (integer positions by logical and physical sort order) and 90% of the x
list (logical value) are the same value ('t'), and the CTIDs are in order on
the new index, so 90% of the values are 100% correlated.
It improves (by which I mean here that it spits out a lower number) if it's not
a 90/10 split:
CREATE TABLE aaa5 AS SELECT (id%100)::int num, (id%10>5)::bool flag FROM generate_series(1, 10000000) id;
CREATE INDEX ON aaa5 (flag);
tablename | aaa5
attname | flag
correlation | 0.522184
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2017-12-03 06:21:56 | pg_dumpall -r -c try to drop user postgres |
Previous Message | chenhj | 2017-12-03 03:27:57 | Re:Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files |
From | Date | Subject | |
---|---|---|---|
Next Message | Vitaliy Garnashevich | 2017-12-03 21:15:01 | Re: Bitmap scan is undercosted? |
Previous Message | Jeff Janes | 2017-12-03 01:27:51 | Re: Bitmap scan is undercosted? |