Re: remaining sql/json patches

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, jian he <jian(dot)universality(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: remaining sql/json patches
Date: 2024-03-06 04:07:33
Message-ID: CA+HiwqGYkYjXe2g0=c==BGSpVQfMJZBhPE-vesX3xCz4enBYPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

On Wed, Mar 6, 2024 at 6:30 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> Hi,
>
> I know very little about sql/json and all the json internals, but I
> decided to do some black box testing. I built a large JSONB table
> (single column, ~7GB of data after loading). And then I did a query
> transforming the data into tabular form using JSON_TABLE.
>
> The JSON_TABLE query looks like this:
>
> SELECT jt.* FROM
> title_jsonb t,
> json_table(t.info, '$'
> COLUMNS (
> "id" text path '$."id"',
> "type" text path '$."type"',
> "title" text path '$."title"',
> "original_title" text path '$."original_title"',
> "is_adult" text path '$."is_adult"',
> "start_year" text path '$."start_year"',
> "end_year" text path '$."end_year"',
> "minutes" text path '$."minutes"',
> "genres" text path '$."genres"',
> "aliases" text path '$."aliases"',
> "directors" text path '$."directors"',
> "writers" text path '$."writers"',
> "ratings" text path '$."ratings"',
> NESTED PATH '$."aliases"[*]'
> COLUMNS (
> "alias_title" text path '$."title"',
> "alias_region" text path '$."region"'
> ),
> NESTED PATH '$."directors"[*]'
> COLUMNS (
> "director_name" text path '$."name"',
> "director_birth_year" text path '$."birth_year"',
> "director_death_year" text path '$."death_year"'
> ),
> NESTED PATH '$."writers"[*]'
> COLUMNS (
> "writer_name" text path '$."name"',
> "writer_birth_year" text path '$."birth_year"',
> "writer_death_year" text path '$."death_year"'
> ),
> NESTED PATH '$."ratings"[*]'
> COLUMNS (
> "rating_average" text path '$."average"',
> "rating_votes" text path '$."votes"'
> )
> )
> ) as jt;
>
> again, not particularly complex. But if I run this, it consumes multiple
> gigabytes of memory, before it gets killed by OOM killer. This happens
> even when ran using
>
> COPY (...) TO '/dev/null'
>
> so there's nothing sent to the client. I did catch memory context info,
> where it looks like this (complete stats attached):
>
> ------
> TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
> 84640 used
> ...
> TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
> PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
> ExecutorState: 2541764672 total in 314 blocks; 6528176 free
> (1208 chunks); 2535236496 used
> printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
> ...
> ...
> Grand total: 2544132336 bytes in 528 blocks; 7484504 free
> (1340 chunks); 2536647832 used
> ------
>
> I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
> some memory management issue? My guess is we're not releasing memory
> allocated while parsing the JSON or building JSON output.
>
> I'm not attaching the data, but I can provide that if needed - it's
> about 600MB compressed. The structure is not particularly complex, it's
> movie info from [1] combined into a JSON document (one per movie).

Thanks for the report.

Yeah, I'd like to see the data to try to drill down into what's piling
up in ExecutorState. I want to be sure of if the 1st, query functions
patch, is not implicated in this, because I'd like to get that one out
of the way sooner than later.

--
Thanks, Amit Langote

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2024-03-06 04:32:43 Re: Switching XLog source from archive to streaming when primary available
Previous Message John Naylor 2024-03-06 03:59:34 Re: [PoC] Improve dead tuple storage for lazy vacuum