Re: JIT doing duplicative optimization?

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: JIT doing duplicative optimization?
Date: 2021-11-14 17:22:04
Message-ID: CALNJ-vRe=JMeRFo6snryBaOWKVmSJGR-hAkkg-Q=jn86+8gLKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 14, 2021 at 9:07 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> On 2021-Nov-11, Alvaro Herrera wrote:
>
> > But what really surprised me is that the the average time to optimize
> > per function is now 2.06ms ... less than half of the previous
> > measurement. It emits 10% less functions than before, but the time to
> > both optimize and emit is reduced by 50%. How does that make sense?
>
> Ah, here's a query that illustrates what I'm on about. I found this
> query[1] in a blog post[2].
>
> ```
> WITH RECURSIVE typeinfo_tree(
> oid, ns, name, kind, basetype, elemtype, elemdelim,
> range_subtype, attrtypoids, attrnames, depth)
> AS (
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, 0
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH RECURSIVE typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> UNION ALL
> SELECT
> t2.typbasetype AS oid,
> tb.depth + 1 AS depth
> FROM
> pg_type t2,
> typebases tb
> WHERE
> tb.oid = t2.oid
> AND t2.typbasetype != 0
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) AS ti
> WHERE
> ti.oid = any(ARRAY[16,17]::oid[])
>
> UNION ALL
>
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, tt.depth + 1
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH RECURSIVE typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> UNION ALL
> SELECT
> t2.typbasetype AS oid,
> tb.depth + 1 AS depth
> FROM
> pg_type t2,
> typebases tb
> WHERE
> tb.oid = t2.oid
> AND t2.typbasetype != 0
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) ti,
> typeinfo_tree tt
> WHERE
> (tt.elemtype IS NOT NULL AND ti.oid = tt.elemtype)
> OR (tt.attrtypoids IS NOT NULL AND ti.oid = any(tt.attrtypoids))
> OR (tt.range_subtype IS NOT NULL AND ti.oid = tt.range_subtype)
> )
> SELECT DISTINCT
> *,
> basetype::regtype::text AS basetype_name,
> elemtype::regtype::text AS elemtype_name,
> range_subtype::regtype::text AS range_subtype_name
> FROM
> typeinfo_tree
> ORDER BY
> depth DESC;
> ```
>
> I did an EXPLAIN ANALYZE and at the bottom you see this:
>
> Planning Time: 2.606 ms
> JIT:
> Functions: 148
> Options: Inlining true, Optimization true, Expressions true, Deforming
> true
> Timing: Generation 19.670 ms, Inlining 19.224 ms, Optimization 435.153
> ms, Emission 282.216 ms, Total 756.263 ms
> Execution Time: 757.643 ms
>
> Average time to optimize, per function 435.153/148 = 2.940ms;
> average time to emit per function 282.216/148 = 1.906ms
>
> Now let's change the query by making the first innermost recursive CTE into
> non-recursive; just delete the RECURSIVE keyword and everything after the
> UNION
> ALL. You now get this query.
>
> ```
> explain analyze
> WITH RECURSIVE typeinfo_tree(
> oid, ns, name, kind, basetype, elemtype, elemdelim,
> range_subtype, attrtypoids, attrnames, depth)
> AS (
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, 0
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) AS ti
> WHERE
> ti.oid = any(ARRAY[16,17]::oid[])
>
> UNION ALL
>
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, tt.depth + 1
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH RECURSIVE typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> UNION ALL
> SELECT
> t2.typbasetype AS oid,
> tb.depth + 1 AS depth
> FROM
> pg_type t2,
> typebases tb
> WHERE
> tb.oid = t2.oid
> AND t2.typbasetype != 0
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) ti,
> typeinfo_tree tt
> WHERE
> (tt.elemtype IS NOT NULL AND ti.oid = tt.elemtype)
> OR (tt.attrtypoids IS NOT NULL AND ti.oid = any(tt.attrtypoids))
> OR (tt.range_subtype IS NOT NULL AND ti.oid = tt.range_subtype)
> )
>
> SELECT DISTINCT
> *,
> basetype::regtype::text AS basetype_name,
> elemtype::regtype::text AS elemtype_name,
> range_subtype::regtype::text AS range_subtype_name
> FROM
> typeinfo_tree
> ORDER BY
> depth DESC
> ```
>
> At the bottom of the explain you get this.
>
> Planning Time: 2.508 ms
> JIT:
> Functions: 137
> Options: Inlining true, Optimization true, Expressions true, Deforming
> true
> Timing: Generation 10.346 ms, Inlining 10.210 ms, Optimization 374.103
> ms, Emission 254.557 ms, Total 649.216 ms
> Execution Time: 650.168 ms
>
> Average time to optimize, per function: 374.103/137 = 2.730ms
> Average time to emit, per function 254.557 / 137 = 1.858ms
>
>
> Now do it one more time, with the second innermost recursive CTE. You get
> this
> query.
>
> ```
> explain analyze
> WITH RECURSIVE typeinfo_tree(
> oid, ns, name, kind, basetype, elemtype, elemdelim,
> range_subtype, attrtypoids, attrnames, depth)
> AS (
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, 0
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) AS ti
> WHERE
> ti.oid = any(ARRAY[16,17]::oid[])
>
> UNION ALL
>
> SELECT
> ti.oid, ti.ns, ti.name, ti.kind, ti.basetype,
> ti.elemtype, ti.elemdelim, ti.range_subtype,
> ti.attrtypoids, ti.attrnames, tt.depth + 1
> FROM
> (
> SELECT
> t.oid AS oid,
> ns.nspname AS ns,
> t.typname AS name,
> t.typtype AS kind,
> (CASE WHEN t.typtype = 'd' THEN
> (WITH typebases(oid, depth) AS (
> SELECT
> t2.typbasetype AS oid,
> 0 AS depth
> FROM
> pg_type t2
> WHERE
> t2.oid = t.oid
> ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
> ELSE NULL
> END) AS basetype,
> t.typelem AS elemtype,
> elem_t.typdelim AS elemdelim,
> range_t.rngsubtype AS range_subtype,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.atttypid ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrtypoids,
> (CASE WHEN t.typtype = 'c' THEN
> (SELECT
> array_agg(ia.attname::text ORDER BY ia.attnum)
> FROM
> pg_attribute ia
> INNER JOIN pg_class c
> ON (ia.attrelid = c.oid)
> WHERE
> ia.attnum > 0 AND NOT ia.attisdropped
> AND c.reltype = t.oid)
> ELSE NULL
> END) AS attrnames
> FROM
> pg_catalog.pg_type AS t
> INNER JOIN pg_catalog.pg_namespace ns ON (
> ns.oid = t.typnamespace)
> LEFT JOIN pg_type elem_t ON (
> t.typlen = -1 AND
> t.typelem != 0 AND
> t.typelem = elem_t.oid
> )
> LEFT JOIN pg_range range_t ON (
> t.oid = range_t.rngtypid
> )
> ) ti,
> typeinfo_tree tt
> WHERE
> (tt.elemtype IS NOT NULL AND ti.oid = tt.elemtype)
> OR (tt.attrtypoids IS NOT NULL AND ti.oid = any(tt.attrtypoids))
> OR (tt.range_subtype IS NOT NULL AND ti.oid = tt.range_subtype)
> )
>
> SELECT DISTINCT
> *,
> basetype::regtype::text AS basetype_name,
> elemtype::regtype::text AS elemtype_name,
> range_subtype::regtype::text AS range_subtype_name
> FROM
> typeinfo_tree
> ORDER BY
> depth DESC;
> ```
>
> And the JIT numbers at the bottom look like this:
>
> Planning Time: 2.041 ms
> JIT:
> Functions: 126
> Options: Inlining false, Optimization true, Expressions true, Deforming
> true
> Timing: Generation 10.002 ms, Inlining 0.000 ms, Optimization 229.128
> ms, Emission 167.338 ms, Total 406.469 ms
> Execution Time: 407.315 ms
>
> Average time to optimize per function 229.128 / 126 = 1.181ms
> Average time to emit per function 167.338 / 126 = 1.328ms
>
>
> In summary,
>
> Query 1, 148 functions JIT-compiled.
> Average time to optimize, per function 435.153/148 = 2.940ms;
> average time to emit per function 282.216/148 = 1.906ms
>
> Query 2, 137 functions JIT-compiled.
> Average time to optimize, per function: 374.103/137 = 2.730ms
> Average time to emit, per function 254.557 / 137 = 1.858ms
>
> Query 3, 126 functions JIT-compiled.
> Average time to optimize per function 229.128 / 126 = 1.181ms
> Average time to emit per function 167.338 / 126 = 1.328ms
>
> ???? Something looks very wrong here.
>
> [1] https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580
> [2]
> https://dev.to/xenatisch/cascade-of-doom-jit-and-how-a-postgres-update-led-to-70-failure-on-a-critical-national-service-3f2a
>
> --
> Álvaro Herrera PostgreSQL Developer —
> https://www.EnterpriseDB.com/
> "Java is clearly an example of money oriented programming" (A. Stepanov)
>
>
> Hi,
For the 3rd query, I noticed some difference in options:

Options: Inlining false, Optimization true, Expressions true, Deforming
true

Inlining was true for the first two queries.

FYI

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-11-14 18:30:35 Re: JIT doing duplicative optimization?
Previous Message Alvaro Herrera 2021-11-14 17:06:46 Re: JIT doing duplicative optimization?