Re: strange slow query - lost lot of time somewhere

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: strange slow query - lost lot of time somewhere
Date: 2022-05-02 14:02:18
Message-ID: CAFj8pRAo5CrF8mpPxMvnBYFSqu4HYDqRsQnLqGphckNHkHosFg@mail.gmail.com
Lists: pgsql-hackers

On Mon, 2 May 2022 at 15:28, Matthias van de Meent <
boekewurm+postgres(at)gmail(dot)com> wrote:

> On Mon, 2 May 2022 at 11:00, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> >
> > Hi
> >
> > I found a query that is significantly slower with more memory
>
> Which PostgreSQL version did you use? Did you enable assert checking?
> Do you have an example database setup to work with?
>
> > plan 2
> > QUERY PLAN
> > ----------------
> > Nested Loop Anti Join (cost=46.53..2914.58 rows=1 width=16) (actual
> time=18.306..23.065 rows=32 loops=1)
> > ...
> > Execution Time: 451.161 ms
>
> Truly strange; especially the 418ms difference between execution time
> and the root node's "actual time". I haven't really seen such
> differences happen, except when concurrent locks were held at the
> table / index level.
>
> > plan 1 - fast https://explain.depesz.com/s/XM1f
> >
> > plan 2 - slow https://explain.depesz.com/s/2rBw
> >
> > Strange - the time of last row is +/- same, but execution time is 10x
> worse
>
> The only difference between the two plans that I see is that plan 1
> doesn't use memoization, whereas plan 2 does use 2 memoize plan nodes
> (one of 66 misses; one of 342 misses). The only "expensive" operation
> that I see in memoize nodes is the check for memory size in
> assert-enabled builds; and that should have very low overhead
> considering that the size of the memoized data is only 8kB and 25kB
> respectively.
>

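Regarding the lock hypothesis above - whether the missing ~418 ms is spent
waiting on table/index locks can be watched from a second session while the
query runs. A minimal sketch using the v14 catalog views (nothing below is
taken from this particular server):

-- a non-empty wait_event_type or a non-empty pg_blocking_pids() result while
-- the query runs would point at lock waits rather than work in the executor
select pid, state, wait_event_type, wait_event, pg_blocking_pids(pid)
  from pg_stat_activity
 where state = 'active';
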
This is PostgreSQL 14, used in a production environment.

(2022-05-02 15:37:29) prd_aukro=# show debug_assertions ;
┌──────────────────┐
│ debug_assertions │
├──────────────────┤
│ off              │
└──────────────────┘
(1 row)

Time: 0.295 ms
(2022-05-02 15:37:35) prd_aukro=# select version();
┌────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 version                                                  │
├────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 14.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4), 64-bit  │
└────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)
Time: 0.629 ms

The only non-default settings are shared_buffers, raised to 32GB, and
work_mem, raised to 70MB. Unfortunately, this is a production environment
with customer data, so I cannot experiment with it too much.
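
For completeness, the full set of non-default settings can be listed straight
from pg_settings; a minimal sketch (run in the same database, nothing here is
specific to this server):

-- list every setting that deviates from its built-in default
select name, setting, unit, source
  from pg_settings
 where source not in ('default', 'override')
 order by name;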

This is the perf profile of the slow run:

25,94% postmaster [kernel.kallsyms] [k] clear_page_erms
11,06% postmaster [kernel.kallsyms] [k] page_fault
5,51% postmaster [kernel.kallsyms] [k] prepare_exit_to_usermode
5,18% postmaster [kernel.kallsyms] [k] __list_del_entry_valid
5,15% postmaster libc-2.28.so [.] __memset_avx2_erms
3,99% postmaster [kernel.kallsyms] [k] unmap_page_range
3,07% postmaster postgres [.] hash_search_with_hash_value
2,73% postmaster [kernel.kallsyms] [k] cgroup_throttle_swaprate
2,49% postmaster postgres [.] heap_page_prune_opt
1,92% postmaster [kernel.kallsyms] [k] try_charge
1,85% postmaster [kernel.kallsyms] [k] swapgs_restore_regs_and_return_to_usermode
1,82% postmaster [kernel.kallsyms] [k] error_entry
1,73% postmaster postgres [.] _bt_checkkeys
1,48% postmaster [kernel.kallsyms] [k] free_pcppages_bulk
1,35% postmaster [kernel.kallsyms] [k] get_page_from_freelist
1,20% postmaster [kernel.kallsyms] [k] __pagevec_lru_add_fn
1,08% postmaster [kernel.kallsyms] [k] percpu_ref_put_many.constprop.84
1,08% postmaster postgres [.] 0x00000000003c1be6
1,06% postmaster [kernel.kallsyms] [k] get_mem_cgroup_from_mm.part.49
0,86% postmaster [kernel.kallsyms] [k] __handle_mm_fault
0,79% postmaster [kernel.kallsyms] [k] mem_cgroup_charge
0,70% postmaster [kernel.kallsyms] [k] release_pages
0,61% postmaster postgres [.] _bt_checkpage
0,61% postmaster [kernel.kallsyms] [k] free_pages_and_swap_cache
0,60% postmaster [kernel.kallsyms] [k] handle_mm_fault
0,57% postmaster postgres [.] tbm_iterate
0,56% postmaster [kernel.kallsyms] [k] __count_memcg_events.part.70
0,55% postmaster [kernel.kallsyms] [k] __mod_memcg_lruvec_state
0,52% postmaster postgres [.] 0x000000000015f6e5
0,50% postmaster [kernel.kallsyms] [k] prep_new_page
0,49% postmaster [kernel.kallsyms] [k] __do_page_fault
0,46% postmaster [kernel.kallsyms] [k] _raw_spin_lock
0,44% postmaster [kernel.kallsyms] [k] do_anonymous_page

This is the fast run:

21,13% postmaster postgres [.] hash_search_with_hash_value
15,33% postmaster postgres [.] heap_page_prune_opt
10,22% postmaster libc-2.28.so [.] __memset_avx2_erms
10,00% postmaster postgres [.] _bt_checkkeys
6,23% postmaster postgres [.] 0x00000000003c1be6
4,94% postmaster postgres [.] _bt_checkpage
2,85% postmaster postgres [.] tbm_iterate
2,31% postmaster postgres [.] nocache_index_getattr
2,13% postmaster postgres [.] pg_qsort
1,58% postmaster postgres [.] heap_hot_search_buffer
1,58% postmaster postgres [.] FunctionCall2Coll
1,58% postmaster postgres [.] 0x000000000015f6e5
1,17% postmaster postgres [.] LWLockRelease
0,85% postmaster libc-2.28.so [.] __memcmp_avx2_movbe
0,64% postmaster postgres [.] 0x00000000003e4233
0,54% postmaster postgres [.] hash_bytes
0,53% postmaster postgres [.] 0x0000000000306fbb
0,53% postmaster postgres [.] LWLockAcquire
0,42% postmaster postgres [.] 0x00000000003c1c6f
0,42% postmaster postgres [.] _bt_compare
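
Since the only plan-shape difference is the two Memoize nodes, and the slow
profile is dominated by kernel page-fault/page-clearing work, the cheapest
experiments on my side would be to re-run the statement with memoization
disabled, and separately with a smaller work_mem, and compare. A minimal
sketch (the query itself is not reproduced here, and the work_mem value is
only illustrative):

-- force the planner back to plan 1 (no Memoize nodes) for this session only
set enable_memoize = off;
explain (analyze, buffers) <query>;   -- <query> = the statement from the first mail
reset enable_memoize;

-- check whether the regression tracks the larger per-node memory allowance
set work_mem = '4MB';
explain (analyze, buffers) <query>;
reset work_mem;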

Regards

Pavel
