Re: parallel data loading for pgbench -i

From: lakshmi <lakshmigcdac(at)gmail(dot)com>
To: Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, tomas(at)vondra(dot)me
Subject: Re: parallel data loading for pgbench -i
Date: 2026-02-05 07:17:02
Message-ID: CAEvyyTht69zjnosPjziW6dqNLqs-n6eKia2vof108zQp1QFX=Q@mail.gmail.com
Lists: pgsql-hackers

Hi Mircea,

Thanks again for the updated patch.
I did some additional testing on 19devel with a larger scale factor.
For scale 100, parallel initialization with -j 10 shows a clear overall
speedup and correct results, as mentioned earlier.
For scale 500, I observed that client-side data generation becomes
significantly faster with parallel loading, but the total run time was
slightly higher than in the serial case on my system. This appears to be
mainly due to a much longer vacuum phase after the parallel load.
So the parallel approach clearly improves data generation time, but the
overall benefit may depend on scale and workload characteristics.
Regression tests still pass locally, and correctness checks look good.
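
For anyone repeating the row-count check, here is a minimal sketch of the
expected counts per scale factor. It assumes pgbench's default per-scale
sizes (1 branch, 10 tellers, 100000 accounts); the helper name is
hypothetical and not part of the patch.

```python
# Expected row counts after "pgbench -i -s <scale>", assuming pgbench's
# default per-scale table sizes. The helper name is hypothetical.
ROWS_PER_SCALE = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def expected_row_counts(scale: int) -> dict[str, int]:
    """Return the expected number of rows per table for a scale factor."""
    if scale < 1:
        raise ValueError("scale factor must be >= 1")
    return {table: rows * scale for table, rows in ROWS_PER_SCALE.items()}


if __name__ == "__main__":
    # The two scale factors used in the tests described above.
    for scale in (100, 500):
        print(scale, expected_row_counts(scale))
```

The results can be compared against SELECT count(*) on each table after
initialization.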

Just sharing these observations in case they are useful for further
evaluation.

Best regards,
lakshmi

On Thu, Jan 29, 2026 at 4:49 PM Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>
wrote:

> Hi Lakshmi,
> On 19/01/2026 09:25, lakshmi wrote:
>
> > Hi Mircea,
> >
> > I tested the patch on 19devel and it worked well for me.
> > Before applying it, -j is rejected in pgbench initialization mode as
> > expected. After applying the patch, pgbench -i -s 100 -j 10 runs
> > successfully and shows a clear speedup.
> > On my system the total runtime dropped to about 9.6s, with client-side
> > data generation around 3.3s.
> > I also checked correctness after the run: row counts for
> > pgbench_accounts, pgbench_branches, and pgbench_tellers all match the
> > expected values.
> >
> > Thanks for working on this, the improvement is very noticeable.
> >
> > Best regards,
> > lakshmi
>
> Thanks for having a look and trying it out!
>
> FYI this is one of Tomas Vondra's patch ideas from his blog [1].
>
> I have attached a new version which now includes docs, tests, a proposed
> commit message, and an attempt to fix the current CI failures (Windows).
>
> [1] - https://vondra.me/posts/patch-idea-parallel-pgbench-i
>
> --
> Thanks,
> Mircea Cadariu
>
>
