Re: More records after sort

From: Nicolas Seinlet <nicolas(at)seinlet(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: More records after sort
Date: 2021-07-20 15:07:27
Message-ID: XIvi0CUmhXFqNzBx3QlNJcwzA9_twozgYMgvvU77b_9eFO0e6_X0LbONKuLPFsGSWGWDRG931KAVa3_RGo41LtFD9WZWlLqpX53N-5mS4oQ=@seinlet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tuesday, July 20th, 2021 at 16:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Nicolas Seinlet nicolas(at)seinlet(dot)com writes:
>

> > I'm facing a strange behaviour and I don't understand why. From a wider query plan, https://explain.dalibo.com/plan/7yh ,a table is scanned and filter based on its primary key, returning 98 records. Then, those records are sorted on another field, and the output is 758,247,643 records.
>

> You haven't showed us the whole plan; tsk tsk.
>

> However, I'm going to bet that this sort is the inner input of a merge
>

> join, and that the apparent "extra" tuples are caused by the merge backing
>

> up and rescanning the sort result to deal with duplicate join keys in its
>

> outer input. There must be a heck of a lot of duplicates. The planner
>

> would typically avoid using merge join if it knew that, so I'm wondering
>

> if you have up-to-date statistics for the outer side of the join.
>

> regards, tom lane

Many thanks for the fast response. The full query plan is below, and the Dalibo link points to it as well. You're right, it's a merge join. One table doesn't have up-to-date statistics, because the table is filled during the transaction involving this query.

A workaround we found, but we don't understand why and it's frustrating, is to remove a join from the query (the one with currency table) and replace it with a new column in the table for which the sort generates records. the currency table has up to date statistics, is vacuumed, ...

The table with no up-to-date statistics is involved in a where clause:
AND NOT EXISTS (SELECT invl_id FROM invl_aml_mapping WHERE invl_id=il.id)
AND NOT EXISTS (SELECT aml_id FROM invl_aml_mapping WHERE aml_id=ml.id)

Query:
INSERT INTO invl_aml_mapping_temp(invl_id, aml_id, cond)
SELECT il.id, ml.id, 48
FROM account_invoice_line il
JOIN account_invoice i ON i.id = il.invoice_id
JOIN account_move m ON m.id = i.move_id
JOIN account_move_line ml ON ml.move_id = m.id
JOIN res_company comp ON comp.id = i.company_id
WHERE il.display_type IS NULL
AND ml.tax_line_id IS NULL
AND NOT EXISTS (SELECT invl_id FROM invl_aml_mapping WHERE invl_id=il.id)
AND NOT EXISTS (SELECT aml_id FROM invl_aml_mapping WHERE aml_id=ml.id)
AND i.id = ANY(ARRAY[3839,6912,7164,11026,6479,9973,3599,11306,4092,10652,2732,8625,198,10536,5876,7864,5498,13080,4660,8948,597,10762,12573,11613,12905,11483,11227,12876,4470,8954,5628,4386,9321,4398,11595,7125,8116,1231,1610,6828,2105,9658,1616,5735,9066,4631,51,1185,11736,9579,4303,9983,12759,3889,4667,11258,9423,11855,6555,3164,7597,5596,10396,620,10330,4082,926,3252,5375,8118,7141,3500,12065,2794,2367,12088,7454,7998,11439,8893,6304,10814,7723,4393,3901,3383,5770,5857,4746,1858,5320,4574,8314,11914,7930,7308,6228,3627])
AND il.account_id = ml.account_id AND (
ARRAY(SELECT r.tax_id
FROM account_invoice_line_tax r
WHERE r.invoice_line_id = il.id
ORDER BY r.tax_id)
=
ARRAY(SELECT r.account_tax_id
FROM account_move_line_account_tax_rel r
WHERE r.account_move_line_id = ml.id
ORDER BY r.account_tax_id)
) AND
ROUND(il.price_subtotal - ml._mig_124_precomputed_amount,
i.decimal_places) = 0.0
AND il.product_id = ml.product_id;

Plan:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Insert on invl_aml_mapping_temp (cost=241.69..505.08 rows=1 width=12) (actual time=185635.849..185635.883 rows=0 loops=1)
-> Nested Loop Anti Join (cost=241.69..505.08 rows=1 width=12) (actual time=185635.847..185635.880 rows=0 loops=1)
-> Nested Loop (cost=241.27..504.61 rows=1 width=8) (actual time=185635.846..185635.873 rows=0 loops=1)
-> Nested Loop (cost=241.15..504.43 rows=1 width=12) (actual time=185635.839..185635.866 rows=0 loops=1)
Join Filter: (i.move_id = m.id)
-> Merge Join (cost=240.73..503.98 rows=1 width=20) (actual time=185635.815..185635.836 rows=0 loops=1)
Merge Cond: (res_currency.id = i.currency_id)
Join Filter: ((il.invoice_id = i.id) AND (ml.move_id = i.move_id))
Rows Removed by Join Filter: 766064628
-> Nested Loop (cost=12.22..16793.68 rows=64 width=20) (actual time=7.908..91222.587 rows=39084931 loops=1)
Join Filter: (round((il.price_subtotal - ml._mig_124_precomputed_amount), res_currency.decimal_places) = 0.0)
Rows Removed by Join Filter: 2990678
-> Index Scan using res_currency_pkey on res_currency (cost=0.14..14.27 rows=176 width=8) (actual time=0.015..0.254 rows=6 loops=1)
-> Materialize (cost=12.07..16522.64 rows=73 width=28) (actual time=0.386..13293.316 rows=7012602 loops=6)
-> Nested Loop (cost=12.07..16522.27 rows=73 width=28) (actual time=2.262..74119.980 rows=8415119 loops=1)
-> Nested Loop Anti Join (cost=0.42..11037.09 rows=306 width=22) (actual time=0.063..481.749 rows=54864 loops=1)
-> Seq Scan on account_invoice_line il (cost=0.00..9885.48 rows=752 width=22) (actual time=0.049..121.238 rows=150448 loops=1)
Filter: (display_type IS NULL)
-> Index Only Scan using _upg_invl_aml_mapping_invl_id_17ac41c4868 on invl_aml_mapping (cost=0.42..2.26 rows=2 width=4) (actual time=0.002..0.002 rows=1 loops=150448)
Index Cond: (invl_id = il.id)
Heap Fetches: 95944
-> Bitmap Heap Scan on account_move_line ml (cost=11.65..17.92 rows=1 width=22) (actual time=0.565..1.322 rows=153 loops=54864)
Recheck Cond: ((product_id = il.product_id) AND (account_id = il.account_id))
Filter: ((tax_line_id IS NULL) AND ((SubPlan 1) = (SubPlan 2)))
Rows Removed by Filter: 0
Heap Blocks: exact=4597567
-> BitmapAnd (cost=11.65..11.65 rows=1 width=0) (actual time=0.550..0.550 rows=0 loops=54864)
-> Bitmap Index Scan on upgrade_fk_related_idx_42 (cost=0.00..1.33 rows=15 width=0) (actual time=0.013..0.013 rows=161 loops=54864)
Index Cond: (product_id = il.product_id)
-> Bitmap Index Scan on account_move_line_account_id_index (cost=0.00..10.07 rows=1047 width=0) (actual time=0.968..0.968 rows=37775 loops=30031)
Index Cond: (account_id = il.account_id)
SubPlan 1
-> Index Only Scan using account_invoice_line_tax_pkey on account_invoice_line_tax r (cost=0.42..2.64 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=8425000)
Index Cond: (invoice_line_id = il.id)
Heap Fetches: 0
SubPlan 2
-> Index Only Scan using account_move_line_account_tax_rel_pkey on account_move_line_account_tax_rel r_1 (cost=0.29..2.51 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=8425000)
Index Cond: (account_move_line_id = ml.id)
Heap Fetches: 1099
-> Sort (cost=228.52..228.76 rows=98 width=16) (actual time=1.502..31540.651 rows=758247643 loops=1)
Sort Key: i.currency_id
Sort Method: quicksort Memory: 29kB
-> Index Scan using account_invoice_pkey on account_invoice i (cost=0.29..225.28 rows=98 width=16) (actual time=0.035..1.481 rows=98 loops=1)
Index Cond: (id = ANY ('{3839,6912,7164,11026,6479,9973,3599,11306,4092,10652,2732,8625,198,10536,5876,7864,5498,13080,4660,8948,597,10762,12573,11613,12905,11483,11227,12876,4470,8954,5628,4386,9321,4398,11595,7125,8116,1231,1610,6828,2105,9658,1616,5735,9066,4631,51,1185,1173
6,9579,4303,9983,12759,3889,4667,11258,9423,11855,6555,3164,7597,5596,10396,620,10330,4082,926,3252,5375,8118,7141,3500,12065,2794,2367,12088,7454,7998,11439,8893,6304,10814,7723,4393,3901,3383,5770,5857,4746,1858,5320,4574,8314,11914,7930,7308,6228,3627}'::integer[]))
-> Index Only Scan using account_move_pkey on account_move m (cost=0.41..0.44 rows=1 width=4) (never executed)
Index Cond: (id = ml.move_id)
Heap Fetches: 0
-> Index Only Scan using res_company_pkey on res_company comp (cost=0.12..0.18 rows=1 width=4) (never executed)
Index Cond: (id = i.company_id)
Heap Fetches: 0
-> Index Only Scan using _upg_invl_aml_mapping_aml_id_17ac41c486f on invl_aml_mapping invl_aml_mapping_1 (cost=0.42..0.46 rows=2 width=4) (never executed)
Index Cond: (aml_id = ml.id)
Heap Fetches: 0
Planning Time: 14.884 ms
Execution Time: 185674.278 ms
(55 rows)

Thanks again,

Nicolas.

Attachment Content-Type Size
publickey - nicolas@seinlet.com - 0xCAEB7FAF.asc application/pgp-keys 729 bytes

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-07-20 17:07:43 Re: More records after sort
Previous Message Tom Lane 2021-07-20 14:34:37 Re: More records after sort