Re: Death postgres

From: Marc Millas <marc(dot)millas(at)mokadb(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Death postgres
Date: 2023-05-12 15:41:37
Message-ID: CADX_1aYQN95m9BggNAUaWmci0U+weJf1bJybiFSc3qSGF0mbJw@mail.gmail.com
Lists: pgsql-general

On Fri, May 12, 2023 at 8:31 AM Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:

> On 2023-05-11 21:27:57 +0200, Marc Millas wrote:
> > The 750000 lines in each table are not NULLs but '' (empty varchar),
> > which obviously is not the same thing, and which perfectly generates
> > 500 billion lines for the left join.
> > So, no planner or statistics problem. Apologies for the time wasted.
>
> No problem. Glad to have solved that puzzle.
>
> > Back to the initial problem: if, with temp_file_limit set to 210 GB,
> > I try to run the select * from table_a left join table_b on col_a
> > (which contains the 750000 '' in both tables), then postgres crashes,
> > killed by the OOM killer, after having taken 1.1 TB of additional
> > disk space.
>
> My guess is that the amount of parallelism is the problem.
>
> work_mem is a per-node limit. Even a single process can use a multiple of
> work_mem if the query contains nested nodes (which almost every query
> does, but most nodes don't need much memory). With 5 parallel workers,
> the total consumption will be 5 times that. So to prevent the OOM
> condition you would need to reduce work_mem or max_parallel_workers (at
> least for this query).
>

We have more than 100 GB of RAM and only 1 user, with a single request
running. work_mem is set to 10 MB, so for the OOM killer to strike because
of work_mem, one request with 2 left joins would need more than 10,000
work_mem allocations. That seems hard to believe. That postgres may need
that RAM for hashing or some other similar purpose seems more probable,
no?
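
For what it's worth, here is a quick way to check how much of that memory
multiplication could come from parallelism and hashing (just a sketch:
table_a / table_b / col_a are the placeholder names used earlier in this
thread, and the settings shown are only examples to experiment with):

    -- Sketch: look at the knobs that multiply work_mem, then re-plan the
    -- join without parallel workers (names and values are illustrative).
    SHOW work_mem;                         -- per sort/hash node, per process
    SHOW hash_mem_multiplier;              -- since v13, a hash node may use
                                           -- work_mem * hash_mem_multiplier
    SHOW max_parallel_workers_per_gather;

    BEGIN;
    SET LOCAL max_parallel_workers_per_gather = 0;   -- single backend only
    EXPLAIN (COSTS OFF)
    SELECT *
    FROM table_a a
    LEFT JOIN table_b b ON a.col_a = b.col_a;        -- placeholder join
    ROLLBACK;

If the OOM kill disappears with parallelism off, that would point to the
per-worker multiplication Peter describes rather than to a single runaway
hash.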

>
> The description of temp_file_limit says "...the maximum amount of disk
> space that *a process* can use...". So with 5 workers that's 210*5 =
> 1050 GB total. Again, you may want to reduce either temp_file_limit or
> max_parallel_workers.
>
Yes, but if so, the request should have been cancelled due to
temp_file_limit, which was not the case.
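
Still, to see where that 1.1 TB of temporary space actually goes, temp-file
usage can be made visible (again just a sketch; log_temp_files needs
superuser rights and the values are only examples):

    -- Sketch: make temporary-file consumption visible (illustrative values).
    -- Note: temp_file_limit is enforced per process, so 5 parallel workers
    -- could legitimately write up to 5 x 210 GB in total before any single
    -- one of them hits the limit.
    SET log_temp_files = 0;     -- log every temp file when it is removed

    -- Cumulative temp-file activity for this database since the last
    -- statistics reset:
    SELECT datname,
           temp_files,
           pg_size_pretty(temp_bytes) AS temp_written
    FROM pg_stat_database
    WHERE datname = current_database();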

>
> > To my understanding, before postgres 13, hash aggregate could eat RAM
> > without limit in such circumstances.
> > But in 14.2??
> > (I know, 14.8 is out...)
>
> Maybe the older version of postgres didn't use as many workers for that
> query (or maybe didn't parallelize it at all)?
>
> hp
>
> --
>    _  | Peter J. Holzer    | Story must make more sense than reality.
> |_|_) |                    |
> | |   | hjp(at)hjp(dot)at  |    -- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |       challenge!"
>
