From: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: strange slow query - lost lot of time somewhere
Date: 2022-05-04 14:08:25
Message-ID: AM8PR07MB824897E7F4EDB4F041522B1FF6C39@AM8PR07MB8248.eurprd07.prod.outlook.com
Lists: pgsql-hackers
> I do agree that the perf report does indicate that the extra time is taken due to
> some large amount of memory being allocated. I just can't quite see how that
> would happen in Memoize given that
> estimate_num_groups() clamps the distinct estimate as the number of input
> rows, which is 91 in both cases in your problem query.
>
> Are you able to run the Memoize query in psql with \watch 0.1 for a few seconds
> while you do:
>
> perf record --call-graph dwarf --pid <pid> sleep 2
>
> then send along the perf report.
>
> I locally hacked build_hash_table() in nodeMemoize.c to make the hashtable 100
> million elements and I see my perf report for a trivial Memoize query come up
> as:
>
[..]
>
> Failing that, are you able to pg_dump these tables and load them into a
> PostgreSQL instance that you can play around with and patch?
> Provided you can actually recreate the problem on that instance.
>
+1 to what David says, we need a reproducer. In [1] Pavel wrote that he's seeing a lot of clear_page_erms(), so maybe this will help a little: I recall a similar issue with a lot of minor page faults and high %sys after raising work_mem. For me it was a different problem some time ago, but it was something like build_hash_table() reached via recursive UNION processing -> BuildTupleHashTable() -> .. malloc() -> mmap64(). When mmap() is issued with MAP_ANONYMOUS, the kernel zeroes out the memory (more memory -> potentially more CPU wasted, visible as minor page faults; "erms" stands for "Enhanced REP MOVSB/STOSB", and the zeroing happens on the kernel side). The culprit was the planner allocating something that was never used later.
Three additional ways to track this down (all are IMHO production-safe):
a) the already mentioned perf with --call-graph dwarf -p PID
b) strace -p PID -e 'mmap' # check whether the mmap(NULL, ...) calls carry the MAP_ANONYMOUS flag; the size of the mmap() request should roughly match the work_mem sizing
c) gdb -p PID, then set a breakpoint on mmap and inspect each call # check for MAP_ANONYMOUS as above
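For (c), a minimal session could look like the following sketch (the backend stays stopped at every breakpoint hit, so keep it brief; the 0x20 value is MAP_ANONYMOUS on Linux):

```
$ gdb -p $PID
(gdb) break mmap
(gdb) continue
Breakpoint 1, mmap64 (...)
(gdb) info args        # look for MAP_ANONYMOUS (0x20) in the flags argument
(gdb) bt               # see which PostgreSQL code path is asking for the memory
(gdb) continue
```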
-J.