Re: strange slow query - lost lot of time somewhere

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: strange slow query - lost lot of time somewhere
Date: 2022-05-06 10:31:23
Message-ID: CAFj8pRASdTyjNi4dJFKY2tVb4UQ6ZT=A0PcXGa0r=Q92TTzSzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pá 6. 5. 2022 v 10:05 odesílatel David Rowley <dgrowleyml(at)gmail(dot)com> napsal:

> On Fri, 6 May 2022 at 17:52, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > Breakpoint 1, build_hash_table (size=4369066, mstate=0xfc7f08) at
> nodeMemoize.c:268
> > 268 if (size == 0)
> > (gdb) p size
> > $1 = 4369066
>
> Thanks for the report. I think I now see the problem. Looking at
> [1], it seems that's a bushy plan. That's fine, but less common than a
> left deep plan.
>
> I think the problem is down to some incorrect code in
> get_memoize_path() where we pass the wrong value of "calls" to
> create_memoize_path(). I think instead of outer_path->parent->rows it
> instead should be outer_path->rows.
>
> If you look closely at the plan, you'll see that the outer side of the
> inner-most Nested Loop is parameterized by some higher-level nested
> loop.
>
> -> Nested Loop (cost=1.14..79.20 rows=91 width=8) (actual
> time=0.024..0.024 rows=1 loops=66)
> -> Index Only Scan using
> uq_isi_itemid_itemimageid on item_share_image itemsharei2__1
> (cost=0.57..3.85 rows=91 width=16) (actual time=0.010..0.010 rows=1
> loops=66)
> Index Cond: (item_id = itembo0_.id)
> Heap Fetches: 21
> -> Memoize (cost=0.57..2.07 rows=1 width=8)
> (actual time=0.013..0.013 rows=1 loops=66)
>
> so instead of passing 91 to create_memoize_path() as I thought. Since
> I can't see any WHERE clause items filtering rows from the
> itemsharei2__1 relation, then the outer_path->parent->rows is should
> be whatever pg_class.reltuples says.
>
> Are you able to send the results of:
>
> explain select item_id from item_share_image group by item_id; -- I'm
> interested in the estimated number of groups in the plan's top node.
>

>
> select reltuples from pg_class where oid = 'item_share_image'::regclass;
>
> I'm expecting the estimated number of rows in the top node of the
> group by plan to be about 4369066.
>

(2022-05-06 12:30:23) prd_aukro=# explain select item_id from
item_share_image group by item_id;
QUERY PLAN

────────────────────────────────────────────────────────────────────────────
Finalize HashAggregate (cost=1543418.63..1554179.08 rows=1076045 width=8)
Group Key: item_id
-> Gather (cost=1000.57..1532658.18 rows=4304180 width=8)
Workers Planned: 4
-> Group (cost=0.57..1101240.18 rows=1076045 width=8)
Group Key: item_id
-> Parallel Index Only Scan using ixfk_isi_itemid on
item_share_image (cost=0.57..1039823.86 rows=24566530 width=8)
(7 řádek)

Čas: 1,808 ms
(2022-05-06 12:30:26) prd_aukro=# select reltuples from pg_class where oid
= 'item_share_image'::regclass;
reltuples
────────────
9.826612e+07
(1 řádka)

Čas: 0,887 ms

Regards

Pavel

> David
>
> [1] https://explain.depesz.com/s/2rBw#source
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-05-06 10:58:43 Re: Possible corruption by CreateRestartPoint at promotion
Previous Message Etsuro Fujita 2022-05-06 10:08:01 Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit