Re: pg_stat_statements and "IN" conditions

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, vignesh C <vignesh21(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Pavel Trukhanov <pavel(dot)trukhanov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_stat_statements and "IN" conditions
Date: 2023-02-11 10:03:36
Message-ID: dda65b89-4ebf-261c-de9c-7b91ab9e4047@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2/9/23 16:02, Dmitry Dolgov wrote:
>> Unfortunately, rebase is needed again due to recent changes in queryjumblefuncs ( 9ba37b2cb6a174b37fc51d0649ef73e56eae27fc )
I reviewed the last patch applied to some commit from Feb. 4th.
>> It seems a little strange to me that with const_merge_threshold = 1, such a test case gives the same result as with const_merge_threshold = 2
>>
>> select pg_stat_statements_reset();
>> set const_merge_threshold to 1;
>> select * from test where i in (1,2,3);
>> select * from test where i in (1,2);
>> select * from test where i in (1);
>> select query, calls from pg_stat_statements order by query;
>>
>> query | calls
>> -------------------------------------+-------
>> select * from test where i in (...) | 2
>> select * from test where i in ($1) | 1
>>
>> Probably const_merge_threshold = 1 should produce only "i in (...)"?
> Well, it's not intentional, probably I need to be more careful with
> off-by-one. Although I agree to a certain extent with Peter questioning

Please add tests for all the corner cases. At least for (1) IN only
contains a single element and (2) const_merge_threshold = 1.

Beyond that:

- There's a comment about find_const_walker(). I cannot find that
function anywhere. What am I missing?

- What about renaming IsConstList() to IsMergableConstList().

- Don't you intend to use the NUMERIC data column in SELECT * FROM
test_merge_numeric WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)?
Otherwise, the test is identical to previous test cases and you're not
checking for what happens with NUMERICs which are wrapped in FuncExpr
because of the implicit coercion.

- Don't we want to extend IsConstList() to allow a list of all
implicitly coerced constants? It's inconsistent that otherwise e.g.
NUMERICs don't work.

- Typo in /* The firsts merged constant */ (first not firsts)

- Prepared statements are not supported as they contain INs with Param
instead of Const nodes. While less likely, I've seen applications that
use prepared statements in conjunction with queries generated through a
UI which ended up with tons of prepared queries with different number of
elements in the IN clause. Not necessarily something that must go into
this patch but maybe worth thinking about.

- The setting name const_merge_threshold is not very telling without
knowing the context. While being a little longer, what about
jumble_const_merge_threshold or queryid_const_merge_threshold or similar?

- Why do we actually only want to merge constants? Why don't we ignore
the type of element in the IN and merge whatever is there? Is this
because the original jumbling logic as of now only has support for
constants?

- Ideally we would even remove duplicates. That would even improve
cardinality estimation but I guess we don't want to spend the cycles on
doing so in the planner?

- Why did you change palloc() to palloc0() for clocations array? The
length is initialized to 0 and FWICS RecordConstLocation() initializes
all members. Seems to me like we don't have to spend these cycles.

- Can't the logic at the end of IsConstList() not be simplified to:

        foreach(temp, elements)
            if (!IsA(lfirst(temp), Const))
                return false;

        // All elements are of type Const
        *firstConst = linitial_node(Const, elements);
        *lastConst = llast_node(Const, elements);
        return true;

--
David Geier
(ServiceNow)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2023-02-11 10:47:07 Re: pg_stat_statements and "IN" conditions
Previous Message Pavel Stehule 2023-02-11 07:20:37 Re: possible memory leak in VACUUM ANALYZE