Re: POC: converting Lists into arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: converting Lists into arrays
Date: 2019-07-17 14:16:18
Message-ID: 27721.1563372978@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> I've only looked at 0002. Here are my thoughts:

Thanks for looking!

> get_tables_to_cluster:
> Looks fine. It's a heap scan. Any previous order was accidental, so if
> it causes issues then we might need to think of using a more
> well-defined order for CLUSTER;

Check.

> get_rels_with_domain:
> This is a static function. Changing the order of the list seems to
> only really affect the error message that a failed domain constraint
> validation could emit. Perhaps this might break someone else's tests,
> but they should just be able to update their expected results.

Also, this is already dependent on the order of pg_depend entries,
so it's not terribly stable anyhow.

> get_relation_statistics:
> RelationGetStatExtList does not seem to pay much attention to the
> order it returns its results, so I don't think the order we apply
> extended statistics was that well defined before. We always attempt to
> use the stats with the most matching columns in
> choose_best_statistics(), so I think
> for people to be affected they'd either multiple stats with the same
> sets of columns or a complex clause that equally well matches two sets
> of stats, and in that case the other columns would be matched to the
> other stats later... I'd better check that... erm... actually that's
> not true. I see statext_mcv_clauselist_selectivity() makes no attempt
> to match the clause list to another set of stats after finding the
> first best match. I think it likely should do that.
> estimate_multivariate_ndistinct() seems to have an XXX comment
> mentioning thoughts about the stability of which stats are used, but
> nothing is done.

I figured that (a) this hasn't been around so long that anybody's
expectations are frozen, and (b) if there is a meaningful difference in
results then it's probably incumbent on the extstats code to do better.
That seems to match your conclusions. But I don't see any regression
test changes from making this change, so at least in simple cases it
doesn't matter.

(As you say, any extstats changes that we conclude are needed should
be a separate patch.)

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2019-07-17 14:39:52 Re: [HACKERS] WIP: Aggregation push-down
Previous Message Daniel Gustafsson 2019-07-17 14:12:28 Re: POC: converting Lists into arrays