Re: plan with result cache is very slow when work_mem is not enough

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: plan with result cache is very slow when work_mem is not enough
Date: 2021-05-07 18:45:36
Message-ID: CAFj8pRB2oRRp9FyCAO59U7S2ewQLq2bA8j4yWf72cQbwP6YfMw@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 7, 2021 at 20:24, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

> Hi
>
> I am testing the new features of Postgres 14, and now I am trying to check
> the result cache. Unfortunately, on my test data the results are not good:
> the behaviour is very non-linear. Is this expected?
>
> create table t1(a int, t2_id int);
> insert into t1 select random() * 100000, random() * 100000 from generate_series(1,1000000);
> create table t2(b int, id int);
> insert into t2 select random() * 100000, random() * 100000 from generate_series(1,1000000);
> create index on t2(id);
>
> vacuum analyze t1, t2;
>
> when work_mem is 40MB
>
>                                                                  QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=4.65..472639.79 rows=1000000 width=16) (actual time=0.041..1078.882 rows=1000000 loops=1)
>    ->  Seq Scan on t1  (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.010..60.212 rows=1000000 loops=1)
>    ->  Result Cache  (cost=4.65..4.67 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1000000)
>          Cache Key: t1.t2_id
>          Hits: 900006  Misses: 99994  Evictions: 0  Overflows: 0  Memory Usage: 10547kB
>          ->  Aggregate  (cost=4.64..4.65 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=99994)
>                ->  Index Only Scan using t2_id_idx on t2  (cost=0.42..4.62 rows=11 width=0) (actual time=0.002..0.003 rows=10 loops=99994)
>                      Index Cond: (id = t1.t2_id)
>                      Heap Fetches: 0
>  Planning Time: 0.091 ms
>  Execution Time: 1120.177 ms
>
> when work_mem is 10MB
>
> postgres=# set work_mem to '10MB'; -- 11MB is ok
> SET
> postgres=# explain analyze select * from t1, lateral(select count(*) from t2 where t1.t2_id = t2.id) s ;
>                                                                  QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=4.65..472639.79 rows=1000000 width=16) (actual time=0.040..56576.187 rows=1000000 loops=1)
>    ->  Seq Scan on t1  (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.010..76.753 rows=1000000 loops=1)
>    ->  Result Cache  (cost=4.65..4.67 rows=1 width=8) (actual time=0.056..0.056 rows=1 loops=1000000)
>          Cache Key: t1.t2_id
>          Hits: 884158  Misses: 115842  Evictions: 18752  Overflows: 0  Memory Usage: 10241kB
>          ->  Aggregate  (cost=4.64..4.65 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=115842)
>                ->  Index Only Scan using t2_id_idx on t2  (cost=0.42..4.62 rows=11 width=0) (actual time=0.003..0.004 rows=10 loops=115842)
>                      Index Cond: (id = t1.t2_id)
>                      Heap Fetches: 0
>  Planning Time: 0.087 ms
>  Execution Time: 56621.421 ms
> (11 rows)
>
>
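To put numbers on it: comparing the two runs above, the cache hit ratio barely changes while the runtime explodes. A quick back-of-the-envelope check (the figures are copied from the plans; nothing here is PostgreSQL internals):

```python
# Illustrative arithmetic from the two EXPLAIN ANALYZE runs above.
runs = {
    "40MB": {"hits": 900006, "misses": 99994, "evictions": 0, "ms": 1120.177},
    "10MB": {"hits": 884158, "misses": 115842, "evictions": 18752, "ms": 56621.421},
}

for name, r in runs.items():
    hit_ratio = r["hits"] / (r["hits"] + r["misses"])
    print(f"{name}: hit ratio {hit_ratio:.1%}, evictions {r['evictions']}, "
          f"time {r['ms']:.0f} ms")

# Hit ratio drops only ~1.6 points, but runtime grows ~50x.
drop = (runs["40MB"]["hits"] - runs["10MB"]["hits"]) / 1_000_000
slowdown = runs["10MB"]["ms"] / runs["40MB"]["ms"]
print(f"slowdown: {slowdown:.1f}x for a hit-ratio drop of only {drop:.1%}")
```

So the slowdown does not seem explainable by the extra cache misses alone; it tracks the eviction work.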
Would it be possible to disable caching when the number of evictions crosses
some limit?

An average cache hit ratio could be computed against the eviction count, and
when that ratio becomes too poor, the cache could be bypassed.
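A minimal sketch of the idea (the class, counter names, and the 0.25 threshold are all invented for illustration; this is not the actual resultcache.c logic):

```python
# Hypothetical bypass heuristic: stop caching once evictions become a
# large fraction of misses, i.e. entries are thrown out before reuse.
class ResultCacheStats:
    def __init__(self, eviction_limit_ratio=0.25, min_lookups=1000):
        self.hits = 0
        self.misses = 0
        self.evictions = 0
        self.eviction_limit_ratio = eviction_limit_ratio
        self.min_lookups = min_lookups
        self.bypassed = False

    def note_hit(self):
        self.hits += 1

    def note_miss(self):
        self.misses += 1

    def note_eviction(self):
        self.evictions += 1

    def should_bypass(self):
        # Once bypassed, stay bypassed for the rest of the scan.
        if self.bypassed:
            return True
        # Wait for enough lookups before judging the cache.
        if self.hits + self.misses < self.min_lookups:
            return False
        # Thrashing: evictions are a large fraction of misses.
        if self.evictions > self.eviction_limit_ratio * self.misses:
            self.bypassed = True
        return self.bypassed
```

In the 10MB run above, evictions (18752) are about 16% of misses (115842), so the threshold would have to be tuned; the point is only that a cheap running ratio could trigger a one-way switch to uncached execution.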

> The query without result cache
>
> postgres=# explain analyze select * from t1, lateral(select count(*) from t2 where t1.t2_id = t2.id) s ;
>                                                               QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=4.64..4689425.00 rows=1000000 width=16) (actual time=0.031..3260.858 rows=1000000 loops=1)
>    ->  Seq Scan on t1  (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.008..71.792 rows=1000000 loops=1)
>    ->  Aggregate  (cost=4.64..4.65 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1000000)
>          ->  Index Only Scan using t2_id_idx on t2  (cost=0.42..4.62 rows=11 width=0) (actual time=0.002..0.002 rows=10 loops=1000000)
>                Index Cond: (id = t1.t2_id)
>                Heap Fetches: 0
>  Planning Time: 0.081 ms
>  Execution Time: 3293.543 ms
> (8 rows)
>
>
>
> Samples: 119K of event 'cycles', 4000 Hz, Event count (approx.):
> Overhead Shared Object Symbol
> 79.20% postgres [.] cache_reduce_memory
> 1.94% [kernel] [k] native_write_msr_safe
> 1.63% [kernel] [k] update_cfs_shares
> 1.00% [kernel] [k] trigger_load_balance
> 0.97% [kernel] [k] timerqueue_add
> 0.51% [kernel] [k] task_tick_fair
> 0.51% [kernel] [k] task_cputime
> 0.50% [kernel] [k] perf_event_task_tick
> 0.50% [kernel] [k] update_curr
> 0.49% [kernel] [k] hrtimer_active
>
> Regards
>
> Pavel
>
