| From: | Boris Mironov <boris_mironov(at)outlook(dot)com> |
|---|---|
| To: | Madyshev Egor <E(dot)Madyshev(at)ftdata(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY) |
| Date: | 2026-02-23 10:41:19 |
| Message-ID: | PH0PR08MB70208182284EBA18F64228798877A@PH0PR08MB7020.namprd08.prod.outlook.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Egor,
Here comes v8 of the patch with all requested changes. Sorry it took a little while to implement...
> 1. Perhaps we should rename functions initRowMethodBin to
> initRowMethodBinary, initPopulateTableText to initPopulateTableTextCopy,
> initPopulateTableBinary to initPopulateTableBinaryCopy,
> initGenerateDataClientSideText to initGenerateDataClientSideTextFrmt,
> initGenerateDataClientSideBinary to initGenerateDataClientSideBinaryFrmt
> for better clarity.
Done
> 2. The --help output currently describes only the 'M' mode. Should we
>also add a description for the 'S' mode for completeness?
Fixed. Thanks for noticing
> 3. I`m wondering if the default value for the 'filler' column in
> initCreateTables is necessary? The current functionality seems
> unaffected, so perhaps we could avoid changing this function to keep the
> diff minimal.
That was done purely for debugging. Fixed
> 4. I noticed that vanilla client functionality in M mode is not
> implemented. Is there a specific reason for this? It seems feasible to
> implement by passing a counter, similar to how it`s done in
> initPopulateTableBinary. If there are reasons not to implement it, in
> mode Mg pgbench should not run mode Sg, it just pg_fatal.
I didn't do it initially to keep original logic intact. To unify new patch
with existing codebase I've added multi-transactional functionality
to client-side data generator and extracted "show progress" code
into own procedure. Now both client-side generators use it.
> 5. In mode client binary format generation c, It would be the right
> thing to do implement write progress of generating data and 'quiet' mode,
> as it already implement in client text format generation g.
Everything is unified now via new procedure showPopulateTableCopyProgress
> 6. In bufferData, when len == 1, we call bufferCharData, which already
> increments bin_copy_buffer_length. However, at the end of bufferCharData
> we increment it again, leading to a double increment.
Fixed. Thank you
> 7. I suggest adding column count in function initPopulateTableBinary,
> and initBranchBinary, initTellerBinary, initAccountBinary, to pass it
> from initAccountBinary to init_row, and use it in functions
> initBranchBinary, initTellerBinary, initAccountBinary. This will
> increase the readability of the code, and remove the magic numbers in
> the addColumnCounter call.
Done
> 8. I think check and install data_generation_type in function
> checkInitSteps is not quite right. In the current realization, pgbench
> allows run data generation many times (dtCdtC...), so i suggest do not
> touching this functionality. My suggestion would be to revert all the
> changes from function checkInitSteps, set the data_generation_type in
> switch in runInitSteps and remove call function checkInitSteps
> from main.
I thought about such solution initially, but then decided to implement extra
procedure where I can count how many times we asked to initialize data.
Since this is legacy code and I can see need for ability to run init multiple
times (eg, get exact timing of different methods for comparison or run
same code multiple times to get average timing), I just show warning.
All in all, thanks a lot for your deep analysis of the code and great
suggestions on how to make better.
Best regards,
Boris
| Attachment | Content-Type | Size |
|---|---|---|
| v8-pgbench-faster-modes.patch | application/octet-stream | 116.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alexander Kuzmenkov | 2026-02-23 11:01:08 | Trouble with ROWID_VAR in custom plan node |
| Previous Message | John Naylor | 2026-02-23 10:38:01 | Re: refactor architecture-specific popcount code |