Re: pg_stat_statements and "IN" conditions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Pavel Trukhanov <pavel(dot)trukhanov(at)gmail(dot)com>
Subject: Re: pg_stat_statements and "IN" conditions
Date: 2022-03-10 17:32:08
Message-ID: CA+TgmoZwXyxUq6VKJWpuAPybVxzNPvq6Kapf1G-egy2tNsycmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 10, 2022 at 12:12 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > This seems incorrect, as the only feedback I've got was "this is a bad
> > idea", and no reaction on follow-up questions.
>
> I changed the status because it seems to me there is no chance of
> this being committed as-is.
>
> 1. I think an absolute prerequisite before we could even consider
> changing the query jumbler rules this much is to do the work that was
> put off when the jumbler was moved into core: that is, provide some
> honest support for multiple query-ID generation methods being used at
> the same time. Even if you successfully make a case for
> pg_stat_statements to act this way, other consumers of query IDs
> aren't going to be happy with it.

FWIW, I don't find this convincing at all. Query jumbling is already
somewhat expensive, and it seems unlikely that the same person is
going to want to jumble queries in one way for pg_stat_statements and
another way for pg_stat_broccoli or whatever their other extension is.
Putting a lot of engineering work into something with such a marginal
use case seems not worthwhile to me - and also likely futile, because
I don't see how it could realistically be made nearly as cheap as a
single jumble.

> 2. You haven't made a case for it. The original complaint was
> about different lengths of IN lists not being treated as equivalent,
> but this patch has decided to do I'm-not-even-sure-quite-what
> about treating different Params as equivalent. Plus you're trying
> to invoke eval_const_expressions in the jumbler; that is absolutely
> Not OK, for both safety and semantic reasons.

I think there are two separate points here, one about patch quality
and the other about whether the basic idea is good. I think the basic
idea is good. I do not contend that collapsing IN-lists of arbitrary
length is what everyone wants in all cases, but it seems entirely
reasonable to me to think that it is what some people want. So I would
say just make it a parameter and let people configure whichever
behavior they want. My bet is 95% of users would prefer to have it on,
but even if that's wildly wrong, having it as an optional behavior
hurts nobody. Let it be off by default and let those who want it flip
the toggle. On the code quality issue, I haven't read the patch but
your concerns sound well-founded to me from reading what you wrote.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2022-03-10 17:42:14 Re: Parameter for planner estimate of recursive queries
Previous Message Joshua Brindle 2022-03-10 17:26:42 Re: role self-revocation