Re: [PATCH] Never convert n_distinct < 2 values to a ratio when computing stats

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dan McGee <dan(at)archlinux(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Never convert n_distinct < 2 values to a ratio when computing stats
Date: 2012-03-25 15:59:22
Message-ID: 15920.1332691162@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> The bit about maybe not getting both t and f as MCVs on a Boolean does
> seem a little worrying, but I'm not sure whether it actually affects
> query planning in a materially negative way. Can you demonstrate a
> case where it matters?

If we were trying to force that to happen it would be wrong anyway.
Consider a column that contains *only* "t", or at least has so few
"f"'s that "f" appears never or only once in the selected sample.
(IIRC there is a clamp that prevents selecting anything as an MCV
unless it appears at least twice in the sample.)
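The scenario can be sketched with a toy model. This is only an illustration, not PostgreSQL's ANALYZE code: the helper `pick_mcvs`, its `min_count` parameter, and the sample sizes are hypothetical, and the "at least twice" rule is taken from the recollection above rather than checked against the source.

```python
from collections import Counter


def pick_mcvs(sample, min_count=2):
    """Toy MCV selection: keep values seen at least min_count times.

    Returns (value, sample_frequency) pairs, most common first.
    The min_count=2 default models the clamp recalled above (an assumption).
    """
    counts = Counter(sample)
    total = len(sample)
    return [(value, count / total)
            for value, count in counts.most_common()
            if count >= min_count]


# Hypothetical samples drawn from a boolean column.
only_true = ["t"] * 1000                # column contains only "t"
one_false = ["t"] * 999 + ["f"]         # "f" shows up once in the sample
two_false = ["t"] * 998 + ["f"] * 2     # "f" shows up twice in the sample

for name, sample in [("only_true", only_true),
                     ("one_false", one_false),
                     ("two_false", two_false)]:
    print(name, [value for value, _ in pick_mcvs(sample)])
```

Under these assumptions only the last sample yields both "t" and "f" as MCVs; in the first two there is no second value that could legitimately be listed.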

Like Robert, I'm not sure whether this is a reasonable change, but
arguing for it on the basis of boolean columns doesn't seem very
sound.

regards, tom lane
