From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
Date: | 2019-11-10 02:25:46 |
Message-ID: | CAAaqYe9BJi4LUX54L9ZmprPqny4H4VZR8C9zsHY9e6PJoqMprw@mail.gmail.com |
Lists: | pgsql-bugs |
On Sat, Nov 9, 2019 at 6:14 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Sun, Nov 10, 2019 at 7:27 AM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > Hmmm, but the expected row width is only 16B, and with 6M rows that's
> > only about 90GB. So how come this needs 1TB temporary files? I'm sure
> > there's a bit of overhead, but 10X seems a bit much.
>
> (s/6M/6B/) Yeah, that comes out to only ~90GB but ... PHJ doesn't
> immediately unlink files from the previous generation when it
> repartitions. You need at least two generations' worth of files (old and
> new) while repartitioning, but you don't need the grand-parent
> generation. I didn't think this was a problem because I didn't expect
> to have to repartition many times (and there is a similar but
> different kind of amplification in the non-parallel code). If this
> problem is due to the 100% extreme skew threshold causing us to go
> berserk, then that 10X multiplier is of the right order, if you
> imagine this thing started out with ~512 batches and got up to ~1M
> batches before it blew a gasket.
Are you saying that it also doesn't unlink the grand-parent until the end?
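
For a rough sense of the numbers, here is a back-of-the-envelope sketch in Python. It assumes each repartition doubles the batch count, each generation rewrites the full ~6B tuples at 16 bytes apiece with no per-tuple overhead, and no generation's files are unlinked until the end. That last assumption is exactly what I'm asking about, not something established in this thread, and the function names are only illustrative.

```python
# Back-of-the-envelope estimate of Parallel Hash Join temp file usage.
# All constants come from the figures quoted in this thread; the model
# itself (doubling per repartition, nothing unlinked) is an assumption.

ROW_WIDTH_BYTES = 16           # expected row width
ROW_COUNT = 6_000_000_000      # ~6B rows (after the s/6M/6B/ correction)
START_BATCHES = 512            # approximate initial batch count
END_BATCHES = 1 << 20          # ~1M batches when it failed


def one_generation_bytes(rows=ROW_COUNT, width=ROW_WIDTH_BYTES):
    """Bytes written by a single generation of batch files."""
    return rows * width


def doublings(start=START_BATCHES, end=END_BATCHES):
    """Number of times the batch count doubles going from start to end."""
    return (end // start).bit_length() - 1


def total_bytes_if_nothing_unlinked():
    """Total temp space if every generation's files survive to the end."""
    generations = doublings() + 1  # the initial generation plus one per doubling
    return generations * one_generation_bytes()


if __name__ == "__main__":
    gib = 1024 ** 3
    print(f"one generation: {one_generation_bytes() / gib:.1f} GiB")
    print(f"doublings:      {doublings()}")
    print(f"total:          {total_bytes_if_nothing_unlinked() / gib:.1f} GiB")
```

Under those assumptions a single generation is roughly 90GB, and going from ~512 to ~1M batches takes 11 doublings, so 12 generations kept on disk would land a little over 1TB, which is the same order as the 10X being discussed.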