Re: pg_stat_statements fingerprinting logic and ArrayExpr

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements fingerprinting logic and ArrayExpr
Date: 2013-12-10 23:24:00
Message-ID: CAAZKuFYPG7jt6=ZODaHffc-_W7-CgNbnwTzOJofTs9S_pc-KTg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 10, 2013 at 3:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So my objection to what Peter is suggesting is not that it's a bad idea
> in isolation, but that I don't see where he's going to stop, short of
> reinventing every query-normalization behavior that exists in the planner.
> If this particular case is worthy of fixing with a hack in the
> fingerprinter, aren't there going to be dozens more with just as good
> claims? (Perhaps not, but what's the basis for thinking this is far
> worse than any other normalization issue?)

Qualitatively, the dynamic-length values list is the worst offender.
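
To make the complaint concrete, here is a toy sketch of the effect. It works on SQL text with regular expressions, which is an assumption made purely for illustration: pg_stat_statements fingerprints the parsed query tree, not the text, and the function names below are hypothetical. The point is only that one placeholder per list element gives a different fingerprint for every list length, while collapsing an all-constant list gives one.

```python
import re

# Toy, text-level stand-in for fingerprinting (illustrative assumption only;
# the real module hashes the query tree, not the SQL string).
_CONST = re.compile(r"\b\d+\b|'[^']*'")
_IN_LIST = re.compile(r"\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", re.IGNORECASE)

def per_constant(sql):
    """Replace each constant with '?', keeping one placeholder per list element."""
    return _CONST.sub("?", sql)

def collapsed_list(sql):
    """Also collapse an all-placeholder IN list down to a single placeholder."""
    return _IN_LIST.sub("IN (?)", per_constant(sql))

q2 = "SELECT * FROM t WHERE id IN (1, 2)"
q3 = "SELECT * FROM t WHERE id IN (1, 2, 3)"

# One entry per list length under the per-constant scheme...
print(per_constant(q2) == per_constant(q3))      # False
# ...but a single entry once the list is collapsed.
print(collapsed_list(q2) == collapsed_list(q3))  # True
```

An application that builds IN lists from user input can produce hundreds of distinct lengths, so the per-constant scheme spreads what is logically one query across as many entries.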

There is no algebraic solution to where to stop with normalizations,
but as Peter points out, that bridge has already been crossed:
assumptions have been made that toss some potentially useful
information, and yet the program is undoubtedly practical.

Based on my own experience (which sounds similar to Andres's), I am
completely convinced the canonicalization he proposes here is
considerably more practical than the current definition, so I suggest
the goal is worthwhile.
