| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Avoiding memory leakage in jsonpath evaluation |
| Date: | 2026-03-18 16:01:11 |
| Message-ID: | 766734.1773849671@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>> On Mar 18, 2026, at 05:33, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I got an off-list report that a query like this consumes
>> an unreasonable amount of memory:
>> SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM generate_series(1,10000) i),
>> '$[*] ? (@ < $)');
> This patch looks like a big win. It not only saves memory, but also makes the query much faster.
Thanks for looking at it!
> I tested the query on my MacBook M4, increasing the iteration count from 10000 to 50000.
Hmm, at that point you were probably mostly measuring how fast the
machine could swap :-(. I had been testing on Linux x86_64, but when
I tried it on macOS just now, the unpatched memory consumption seemed
even worse than on Linux.
> After reviewing the patch, I thought JsonValueListLength() might be
> worth optimizing, since it is O(n).
Well, it's really O(log N) given that we doubled the chunk size each
time we added a chunk. I had also thought about tracking the total
number of items, but concluded that incrementing an additional counter
during each item addition couldn't possibly get repaid given that
amortization and the fact that we don't compute the list length more
than once in any operation.
However, looking at it again today, I realized that no caller actually
needs to know the total length of a long JsonValueList. They only
really care whether the list has zero, one, or more than one member.
And we can determine that just by looking at the base chunk.
Hence, v2 attached. 0001 is the same as before, and 0002 removes
JsonValueListLength in favor of some constant-time tests. (I'd
plan to squash these to one commit in the end, but it seemed easier
to review if I present the delta-from-yesterday separately.)
BTW, I don't love the function name JsonValueListIsMultiple, but if
there's a common term analogous to "singleton" but describing sets
with more than one member, I don't know it.
regards, tom lane
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Fix-transient-memory-leakage-in-jsonpath-evaluati.patch | text/x-diff | 37.7 KB |
| v2-0002-Remove-JsonValueListLength.patch | text/x-diff | 6.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Fujii Masao | 2026-03-18 16:05:29 | Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion? |
| Previous Message | Ashutosh Bapat | 2026-03-18 16:01:06 | Re: SQL Property Graph Queries (SQL/PGQ) |