Re: strange slow query - lost lot of time somewhere

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: strange slow query - lost lot of time somewhere
Date: 2022-05-02 14:08:31
Message-ID: CAFj8pRCUkuYYvX3O-KnK=nKe4vNnXbZS173EBHv4Bnqv8qHsSw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 2, 2022 at 4:02 PM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

>
>
> On Mon, May 2, 2022 at 3:28 PM Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>
>> On Mon, 2 May 2022 at 11:00, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>> >
>> > Hi
>> >
>> > I found a query that is significantly slower with more memory
>>
>> Which PostgreSQL version did you use? Did you enable assert checking?
>> Do you have an example database setup to work with?
>>
>> > plan 2
>> > QUERY PLAN
>> > ----------------
>> > Nested Loop Anti Join (cost=46.53..2914.58 rows=1 width=16) (actual time=18.306..23.065 rows=32 loops=1)
>> > ...
>> > Execution Time: 451.161 ms
>>
>> Truly strange; especially the 418ms difference between execution time
>> and the root node's "actual time". I haven't really seen such
>> differences happen, except when concurrent locks were held at the
>> table / index level.
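
(Concurrent locks at the table/index level can be checked from a second session while the slow query runs; a minimal sketch against pg_locks, with the pid/relation filter left to the reader:)

    SELECT l.pid, l.locktype, l.relation::regclass AS relation,
           l.mode, l.granted, a.query
      FROM pg_locks l
      JOIN pg_stat_activity a ON a.pid = l.pid
     WHERE l.relation IS NOT NULL   -- table/index-level locks only
     ORDER BY l.granted, l.pid;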
>>
>> > plan 1 - fast https://explain.depesz.com/s/XM1f
>> >
>> > plan 2 - slow https://explain.depesz.com/s/2rBw
>> >
>> > Strange - the time of the last row is roughly the same, but the execution time is 10x worse
>>
>> The only difference between the two plans that I see is that plan 1
>> doesn't use memoization, whereas plan 2 does use 2 memoize plan nodes
>> (one of 66 misses; one of 342 misses). The only "expensive" operation
>> that I see in memoize nodes is the check for memory size in
>> assert-enabled builds; and that should have very low overhead
>> considering that the size of the memoized data is only 8kB and 25kB
>> respectively.
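
(One quick way to confirm whether the Memoize nodes account for the difference would be to disable them for the session and re-run the plan; enable_memoize is a stock PostgreSQL 14 planner GUC, and the query text itself is omitted here:)

    SET enable_memoize = off;
    EXPLAIN (ANALYZE, BUFFERS) <slow query>;  -- plan should no longer contain Memoize nodes
    RESET enable_memoize;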
>>
>
> This is PostgreSQL 14 used in a production environment.
>
> (2022-05-02 15:37:29) prd_aukro=# show debug_assertions ;
> ┌──────────────────┐
> │ debug_assertions │
> ├──────────────────┤
> │ off │
> └──────────────────┘
> (1 row)
>
> Time: 0,295 ms
> (2022-05-02 15:37:35) prd_aukro=# select version();
> ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> │ version                                                                                                   │
> ├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
> │ PostgreSQL 14.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4), 64-bit   │
> └─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
> (1 row)
> Time: 0,629 ms
>
> The only changed settings are shared_buffers (32GB) and work_mem (70MB).
> Unfortunately it is a production environment with customer data, so I
> cannot experiment too much.
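
(The two non-default values can be confirmed the same way as debug_assertions above, for example:)

    SELECT name, setting, unit
      FROM pg_settings
     WHERE name IN ('shared_buffers', 'work_mem');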
>
> This is perf of slow
>
> 25,94% postmaster [kernel.kallsyms] [k] clear_page_erms
> 11,06% postmaster [kernel.kallsyms] [k] page_fault
> 5,51% postmaster [kernel.kallsyms] [k] prepare_exit_to_usermode
> 5,18% postmaster [kernel.kallsyms] [k] __list_del_entry_valid
> 5,15% postmaster libc-2.28.so [.] __memset_avx2_erms
> 3,99% postmaster [kernel.kallsyms] [k] unmap_page_range
> 3,07% postmaster postgres [.] hash_search_with_hash_value
> 2,73% postmaster [kernel.kallsyms] [k] cgroup_throttle_swaprate
> 2,49% postmaster postgres [.] heap_page_prune_opt
> 1,92% postmaster [kernel.kallsyms] [k] try_charge
> 1,85% postmaster [kernel.kallsyms] [k] swapgs_restore_regs_and_return_to_usermode
> 1,82% postmaster [kernel.kallsyms] [k] error_entry
> 1,73% postmaster postgres [.] _bt_checkkeys
> 1,48% postmaster [kernel.kallsyms] [k] free_pcppages_bulk
> 1,35% postmaster [kernel.kallsyms] [k] get_page_from_freelist
> 1,20% postmaster [kernel.kallsyms] [k] __pagevec_lru_add_fn
> 1,08% postmaster [kernel.kallsyms] [k] percpu_ref_put_many.constprop.84
> 1,08% postmaster postgres [.] 0x00000000003c1be6
> 1,06% postmaster [kernel.kallsyms] [k] get_mem_cgroup_from_mm.part.49
> 0,86% postmaster [kernel.kallsyms] [k] __handle_mm_fault
> 0,79% postmaster [kernel.kallsyms] [k] mem_cgroup_charge
> 0,70% postmaster [kernel.kallsyms] [k] release_pages
> 0,61% postmaster postgres [.] _bt_checkpage
> 0,61% postmaster [kernel.kallsyms] [k] free_pages_and_swap_cache
> 0,60% postmaster [kernel.kallsyms] [k] handle_mm_fault
> 0,57% postmaster postgres [.] tbm_iterate
> 0,56% postmaster [kernel.kallsyms] [k] __count_memcg_events.part.70
> 0,55% postmaster [kernel.kallsyms] [k] __mod_memcg_lruvec_state
> 0,52% postmaster postgres [.] 0x000000000015f6e5
> 0,50% postmaster [kernel.kallsyms] [k] prep_new_page
> 0,49% postmaster [kernel.kallsyms] [k] __do_page_fault
> 0,46% postmaster [kernel.kallsyms] [k] _raw_spin_lock
> 0,44% postmaster [kernel.kallsyms] [k] do_anonymous_page
>
> This is fast
>
> 21,13% postmaster postgres [.] hash_search_with_hash_value
> 15,33% postmaster postgres [.] heap_page_prune_opt
> 10,22% postmaster libc-2.28.so [.] __memset_avx2_erms
> 10,00% postmaster postgres [.] _bt_checkkeys
> 6,23% postmaster postgres [.] 0x00000000003c1be6
> 4,94% postmaster postgres [.] _bt_checkpage
> 2,85% postmaster postgres [.] tbm_iterate
> 2,31% postmaster postgres [.] nocache_index_getattr
> 2,13% postmaster postgres [.] pg_qsort
> 1,58% postmaster postgres [.] heap_hot_search_buffer
> 1,58% postmaster postgres [.] FunctionCall2Coll
> 1,58% postmaster postgres [.] 0x000000000015f6e5
> 1,17% postmaster postgres [.] LWLockRelease
> 0,85% postmaster libc-2.28.so [.] __memcmp_avx2_movbe
> 0,64% postmaster postgres [.] 0x00000000003e4233
> 0,54% postmaster postgres [.] hash_bytes
> 0,53% postmaster postgres [.] 0x0000000000306fbb
> 0,53% postmaster postgres [.] LWLockAcquire
> 0,42% postmaster postgres [.] 0x00000000003c1c6f
> 0,42% postmaster postgres [.] _bt_compare
>
>
It looks like memoization allocates a lot of memory - maybe there is some
temporary memory leak.

Performance counter stats for process id '4004464':

             84,26 msec task-clock                #    0,012 CPUs utilized
                 3      context-switches          #    0,036 K/sec
                 0      cpu-migrations            #    0,000 K/sec
                19      page-faults               #    0,225 K/sec
                 0      cycles                    #    0,000 GHz
       106 873 995      instructions
        20 225 431      branches                  #  240,026 M/sec
           348 834      branch-misses             #    1,72% of all branches

       7,106142051 seconds time elapsed

Performance counter stats for process id '4004464':

          1 116,97 msec task-clock                #    0,214 CPUs utilized
                 4      context-switches          #    0,004 K/sec
                 0      cpu-migrations            #    0,000 K/sec
            99 349      page-faults               #    0,089 M/sec
                 0      cycles                    #    0,000 GHz
       478 842 411      instructions
        89 495 015      branches                  #   80,123 M/sec
         1 014 763      branch-misses             #    1,13% of all branches

       5,221116331 seconds time elapsed
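
If the extra time really goes into allocating and zeroing pages (the clear_page_erms / page_fault samples above point that way), the memory use of the backend can be inspected on 14 through the pg_backend_memory_contexts view; a sketch, run from the affected session itself right after the slow query:

    -- shows only the current backend's memory contexts (view is new in PG 14)
    SELECT name, parent, total_bytes, used_bytes
      FROM pg_backend_memory_contexts
     ORDER BY total_bytes DESC
     LIMIT 10;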

Regards

Pavel

