Re: Problems with estimating OR conditions, IS NULL on LEFT JOINs

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Problems with estimating OR conditions, IS NULL on LEFT JOINs
Date: 2023-07-10 17:25:36
Message-ID: 901df6ce-ac9a-2806-c5c6-e5add4e62a80@yandex.ru
Lists: pgsql-hackers

> Well, I don't have a detailed plan either. In principle it shouldn't be
> that hard, I think - examine_variable is loading the statistics, so it
> could apply the same null_frac correction, just like nulltestsel would
> do a bit later.
>
> The main question is how to pass the information to examine_variable. It
> doesn't get the SpecialJoinInfo (which is what nulltestsel used), so
> we'd need to invent something new ... add a new argument?

Sorry I didn't answer right away. I was able to adapt the last version of
the patch [2] to the current idea, but so far I have implemented it only
for the case where we store the number of null values in the
SpecialJoinInfo variable.
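
Just to illustrate how I understand the idea, here is a minimal, self-contained
sketch (not the attached patch; the combination formula is only my assumption)
of how an unmatched-row fraction tracked via SpecialJoinInfo could be folded
into the null fraction used for an IS NULL estimate:

#include <stdio.h>

/*
 * Illustrative sketch only: assume the planner records, for an outer join,
 * the fraction of join result rows that come out NULL-extended on the
 * nullable side because they have no match ("unmatched_frac").  The NULL
 * fraction an IS NULL estimate should use is then larger than the column's
 * own stanullfrac.
 */
static double
corrected_nullfrac(double stanullfrac, double unmatched_frac)
{
    /* unmatched rows are NULL for sure; matched rows keep stanullfrac */
    return unmatched_frac + (1.0 - unmatched_frac) * stanullfrac;
}

int
main(void)
{
    /*
     * Numbers loosely matching the l/r example later in this message:
     * r.v has no NULLs of its own, but ~99999 of 100000 outer rows find
     * no match, so nearly every joined row has r.v IS NULL.
     */
    printf("effective null_frac = %.5f\n",
           corrected_nullfrac(0.0, 99999.0 / 100000.0));
    return 0;
}

With those numbers the effective null_frac comes out as 0.99999, i.e. almost
the whole join result, which is roughly what the plans further below show.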

I have started looking at other functions, such as scalararraysel_containment
and boolvarsel, and am trying to find bad cases of the current problem
that could be fixed there in a similar way. But I'm not sure about the
other functions
(mergejoinscansel, estimate_num_groups, estimate_hash_bucket_stats,
get_restriction_variable, get_join_variables).

The examine_variable function is also called in them, but since no outer
join is involved there (there is no SpecialJoinInfo variable available),
I can't think of similar cases where the same underlying problem would
appear.

The code passes all regression tests and I found no deterioration in
cardinality prediction for queries from [1], except for one:

EXPLAIN ANALYZE
  SELECT * FROM large FULL JOIN small ON (large.id = small.id)
WHERE (large.a IS NULL);

MASTER:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Merge Full Join  (cost=127921.69..299941.59 rows=56503 width=16) (actual time=795.092..795.094 rows=0 loops=1)
   Merge Cond: (small.id = large.id)
   Filter: (large.a IS NULL)
   Rows Removed by Filter: 1000000
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.038..0.046 rows=100 loops=1)
         Sort Key: small.id
         Sort Method: quicksort  Memory: 29kB
         ->  Seq Scan on small  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.013..0.022 rows=100 loops=1)
   ->  Materialize  (cost=127763.19..132763.44 rows=1000050 width=8) (actual time=363.016..649.103 rows=1000000 loops=1)
         ->  Sort  (cost=127763.19..130263.31 rows=1000050 width=8) (actual time=363.012..481.480 rows=1000000 loops=1)
               Sort Key: large.id
               Sort Method: external merge  Disk: 17664kB
               ->  Seq Scan on large  (cost=0.00..14425.50 rows=1000050 width=8) (actual time=0.009..111.166 rows=1000000 loops=1)
 Planning Time: 0.124 ms
 Execution Time: 797.139 ms
(15 rows)

With patch:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Hash Full Join  (cost=3.25..18179.25 rows=999900 width=16) (actual time=261.480..261.482 rows=0 loops=1)
   Hash Cond: (large.id = small.id)
   Filter: (large.a IS NULL)
   Rows Removed by Filter: 1000000
   ->  Seq Scan on large  (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.006..92.827 rows=1000000 loops=1)
   ->  Hash  (cost=2.00..2.00 rows=100 width=8) (actual time=0.032..0.034 rows=100 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Seq Scan on small  (cost=0.00..2.00 rows=100 width=8) (actual time=0.008..0.015 rows=100 loops=1)
 Planning Time: 0.151 ms
 Execution Time: 261.529 ms
(10 rows)

[1]
https://www.mail-archive.com/pgsql-hackers(at)lists(dot)postgresql(dot)org/msg146044.html

[2]
https://www.postgresql.org/message-id/148ff8f1-067b-1409-c754-af6117de9b7d%40yandex.ru

Unfortunately, I found that my previous change leads to a large error in
the selectivity estimate. I will investigate this case in more detail and
try to find a solution (the relevant code and query are below).

/* unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2); */
if (nd1 != nd2)
{
        /* nd1/nd2 are the ndistinct estimates of the two join sides */
        selec /= Max(nd1, nd2);
        /* fraction of distinct values on the larger side with no counterpart */
        *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
}
else
{
        selec /= nd2;
        *unmatched_frac = 0.0;
}
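
For example, with nd1 = 1000 and nd2 = 100 this sets
unmatched_frac = (1000 - 100) / 1000 = 0.9, i.e. 90% of the distinct values
on the side with the larger ndistinct are assumed to have no match on the
other side, while selec itself is divided by 1000.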

postgres=# explain analyze select * from large l1 inner join large l2 on
l1.a is null where l1.a <100;

MASTER:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1000.00..35058.43 rows=1000000 width=16) (actual
time=91.846..93.622 rows=0 loops=1)
   ->  Gather (cost=1000.00..10633.43 rows=1 width=8) (actual
time=91.844..93.619 rows=0 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Parallel Seq Scan on large l1  (cost=0.00..9633.33 rows=1
width=8) (actual time=86.153..86.154 rows=0 loops=3)
               Filter: ((a IS NULL) AND (a < 100))
               Rows Removed by Filter: 333333
   ->  Seq Scan on large l2 (cost=0.00..14425.00 rows=1000000 width=8)
(never executed)
 Planning Time: 0.299 ms
 Execution Time: 93.689 ms

(10 rows)

With patch:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop (cost=1000.00..20863771.84 rows=1667083350 width=16)
(actual time=290.255..290.323 rows=0 loops=1)
   ->  Seq Scan on large l2 (cost=0.00..14425.50 rows=1000050 width=8)
(actual time=0.104..94.037 rows=1000000 loops=1)
   ->  Materialize (cost=1000.00..10808.63 rows=1667 width=8) (actual
time=0.000..0.000 rows=0 loops=1000000)
         ->  Gather (cost=1000.00..10800.29 rows=1667 width=8) (actual
time=79.472..79.539 rows=0 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Parallel Seq Scan on large l1  (cost=0.00..9633.59
rows=695 width=8) (actual time=75.337..75.338 rows=0 loops=3)
                     Filter: ((a IS NULL) AND (a < 100))
                     Rows Removed by Filter: 333333
 Planning Time: 0.721 ms
 Execution Time: 290.425 ms

(11 rows)

If I remember correctly, it could fix this one:

SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.09..1944.13 rows=99998 width=14) (actual time=0.152..84.184 rows=99999 loops=1)
   Hash Cond: (l.id = r.id)
   Filter: (r.v IS NULL)
   Rows Removed by Filter: 2
   ->  Seq Scan on l  (cost=0.00..1443.00 rows=100000 width=4) (actual time=0.040..27.635 rows=100000 loops=1)
   ->  Hash  (cost=1.04..1.04 rows=4 width=10) (actual time=0.020..0.022 rows=4 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on r  (cost=0.00..1.04 rows=4 width=10) (actual time=0.009..0.011 rows=4 loops=1)
 Planning Time: 0.954 ms
 Execution Time: 92.309 ms
(10 rows)
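
(For reference, the arithmetic behind that result: l has 100000 ids but only
id = 1 finds a match in r, so about 99999 of the joined rows are NULL-extended
on the r side; since r.v has no NULLs of its own, roughly 99999 rows should
survive the r.v IS NULL filter, which is what both the 99998-row estimate and
the actual rows=99999 above show.)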

Do you think it's worth trying to apply the fourth method now?
As far as I remember, here we would run into the problem of estimating
multiple conditions, and the fourth approach has a chance of avoiding it.

I couldn't come up with such a case. I found only one query that I find
interesting, but so far I haven't been able to figure out well enough
what influenced the cardinality estimate and, consequently, the choice
of the better plan.

CREATE TABLE large (id INT, a INT);
  INSERT INTO large SELECT i, 1 FROM generate_series(1,50000) s(i);
CREATE TABLE small (id INT, b INT);
  INSERT INTO small SELECT i, 1 FROM generate_series(1,25000) s(i);
INSERT INTO large SELECT i+50000, i+2 FROM generate_series(1,50000) s(i);
INSERT INTO small SELECT i+25000, i+2 FROM generate_series(1,25000) s(i);

explain analyze SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
          WHERE ((large.a=1) OR (small.b=1));

With patch:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1347.00..3911.91 rows=99775 width=16) (actual
time=36.864..82.634 rows=50000 loops=1)
   Hash Cond: (large.id = small.id)
   Filter: ((large.a = 1) OR (small.b = 1))
   Rows Removed by Filter: 50000
   ->  Seq Scan on large  (cost=0.00..1440.75 rows=99775 width=8)
(actual time=0.034..12.536 rows=100000 loops=1)
   ->  Hash  (cost=722.00..722.00 rows=50000 width=8) (actual
time=36.752..36.754 rows=50000 loops=1)
         Buckets: 65536  Batches: 1  Memory Usage: 2466kB
         ->  Seq Scan on small  (cost=0.00..722.00 rows=50000 width=8)
(actual time=0.028..13.337 rows=50000 loops=1)
 Planning Time: 2.363 ms
 Execution Time: 84.790 ms
(10 rows)

Original:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Merge Right Join  (cost=14400.45..516963.33 rows=250528 width=16)
(actual time=85.498..126.799 rows=50000 loops=1)
   Merge Cond: (small.id = large.id)
   Filter: ((large.a = 1) OR (small.b = 1))
   Rows Removed by Filter: 50000
   ->  Sort  (cost=4640.80..4766.23 rows=50172 width=8) (actual
time=41.538..44.204 rows=50000 loops=1)
         Sort Key: small.id
         Sort Method: quicksort  Memory: 3710kB
         ->  Seq Scan on small  (cost=0.00..723.72 rows=50172 width=8)
(actual time=0.068..15.182 rows=50000 loops=1)
   ->  Sort  (cost=9759.65..10009.95 rows=100118 width=8) (actual
time=43.939..53.697 rows=100000 loops=1)
         Sort Key: large.id
         Sort Method: external sort  Disk: 2160kB
         ->  Seq Scan on large  (cost=0.00..1444.18 rows=100118
width=8) (actual time=0.046..10.378 rows=100000 loops=1)
 Planning Time: 0.406 ms
 Execution Time: 130.109 ms
(14 rows)

If merge join is disabled with the original code (e.g. via
enable_mergejoin = off), the plans coincide and the size of the error is
roughly the same; the only difference is that in one case the cardinality
is overestimated and in the other it is underestimated.

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1347.00..3915.00 rows=75082 width=16) (actual
time=43.625..68.901 rows=50000 loops=1)
   Hash Cond: (large.id = small.id)
   Filter: ((large.a = 1) OR (small.b = 1))
   Rows Removed by Filter: 50000
   ->  Seq Scan on large  (cost=0.00..1443.00 rows=100000 width=8)
(actual time=0.008..9.895 rows=100000 loops=1)
   ->  Hash  (cost=722.00..722.00 rows=50000 width=8) (actual
time=22.546..22.548 rows=50000 loops=1)
         Buckets: 65536  Batches: 1  Memory Usage: 2466kB
         ->  Seq Scan on small  (cost=0.00..722.00 rows=50000 width=8)
(actual time=0.006..7.218 rows=50000 loops=1)
 Planning Time: 0.239 ms
 Execution Time: 70.860 ms
(10 rows)

--
Regards,
Alena Rybakina
Postgres Professional

Attachment: 0001-Fix-calculating-the-underestimated-cardinality.patch (text/x-patch, 16.0 KB)
