Re: pg_stat_statements fingerprinting logic and ArrayExpr

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements fingerprinting logic and ArrayExpr
Date: 2013-12-10 23:08:58
Message-ID: 12017.1386716938@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Right, but the flip side is that you could collapse things that people
> don't want collapsed. If you've got lots of query that differ only in
> that some of them say user_id IN (const1, const2) and others say
> user_id IN (const1, const2, const3) and the constants vary a lot, then
> of course this seems attractive. On the other hand if you have two
> queries and one of them looks like this:

> WHERE status IN ('active') AND user_id = ?

> and the other looks like this:

> WHERE status IN ('inactive', 'deleted') AND user_id = ?

> ...it might actually annoy you to have those two things conflated;
> it's easy to imagine one having much different performance
> characteristics than the other.

Of course, "status = 'active'" and "status = 'deleted'" might have very
different performance characteristics all by themselves, yet we've
already hard-wired a decision that pg_stat_statements will conflate them.
So I don't think the above argument holds a lot of water.

A different point of view is that it's more or less an implementation
artifact that pg_stat_statements doesn't already see the cases as
equivalent; that happens only because it looks at the querytree before
the planner gets around to constant-folding ARRAY[1,2,3] into the single
Const '{1,2,3}'::int[].

So my objection to what Peter is suggesting is not that it's a bad idea
in isolation, but that I don't see where he's going to stop, short of
reinventing every query-normalization behavior that exists in the planner.
If this particular case is worthy of fixing with a hack in the
fingerprinter, aren't there going to be dozens more with just as good
claims? (Perhaps not, but what's the basis for thinking this is far
worse than any other normalization issue?)

I'm wondering whether this doesn't indicate that we need to rethink where
the fingerprinter has been plugged in. I'm not sure that somewhere in
the planner, post-constant-folding, would be a better place; but it's
worth thinking about.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2013-12-10 23:12:53 Re: Dynamic Shared Memory stuff
Previous Message Kevin Grittner 2013-12-10 23:06:22 Re: Reference to parent query from ANY sublink