| From: | Filip Janus <fjanus(at)redhat(dot)com> |
|---|---|
| To: | lakshmi <lakshmigcdac(at)gmail(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Proposal: Adding compression of temporary files |
| Date: | 2026-01-18 15:50:24 |
| Message-ID: | CAFjYY+JDSpOQwYAfTQQ43=BA=d32XfcAdaPVJgHheV9fQBbLWg@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
Thank you, Tomas, for the thorough and detailed review!
I'm posting an updated patch set incorporating the changes from your review.
Changes applied from review:
- Simplified BufFileCreateTemp interface
- Improved error handling in BufFileLoadBuffer/BufFileDumpBuffer
- Unified compression header format (CompressHeader struct)
- Added tuplestore integration (compression when EXEC_FLAG_BACKWARD is not required)
- Various code cleanups and comment improvements
Additional change (not from review):
- Switched from static shared buffer to per-file allocation. The shared buffer
  provided a negligible performance benefit while keeping memory allocated
  for the backend's lifetime.
Future work:
- Support for additional compression methods (gzip, zstd)
- Random access and seek operations with compression
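
To illustrate why a per-block header matters for sequential reads, and why seeking is listed as future work, here is a minimal Python sketch of length-prefixed compressed blocks. It is not the patch's code: the two-field header layout, the names write_block and read_blocks, and the use of zlib (chosen only because it ships with Python) are assumptions for illustration. The actual CompressHeader fields are not shown in this message.

```python
import struct
import zlib

# Hypothetical header: compressed length and raw length, both uint32.
# The real CompressHeader layout in the patch may differ.
HEADER = struct.Struct("<II")


def write_block(out: bytearray, raw: bytes) -> None:
    """Compress one buffer and append it, preceded by its header."""
    comp = zlib.compress(raw)
    out += HEADER.pack(len(comp), len(raw))
    out += comp


def read_blocks(data: bytes):
    """Yield the raw buffers in order, failing loudly on a short read."""
    pos = 0
    while pos < len(data):
        if len(data) - pos < HEADER.size:
            raise ValueError("truncated block header")
        comp_len, raw_len = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        comp = data[pos:pos + comp_len]
        if len(comp) != comp_len:
            # Comparable in spirit to a "read only N of M bytes" failure.
            raise ValueError(f"read only {len(comp)} of {comp_len} bytes")
        raw = zlib.decompress(comp)
        if len(raw) != raw_len:
            raise ValueError("decompressed size does not match header")
        pos += comp_len
        yield raw
```

In this sketch the offset of each block depends on the compressed sizes of all blocks before it, so a reader has to walk the headers from the start. That is one reason random access needs extra bookkeeping.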
-Filip-
On Tue, Jan 13, 2026 at 14:34, Filip Janus <fjanus(at)redhat(dot)com> wrote:
> Hi,
> Yes, it needs to be rebased. I am working on it. I will post it here soon.
>
>
> -Filip-
>
>
> On Tue, Jan 13, 2026 at 13:51, lakshmi <lakshmigcdac(at)gmail(dot)com> wrote:
>
>> Hi all,
>> I tried to replicate the temporary file compression issue by applying the
>> two patches shared in this thread to current PostgreSQL master.
>> Here is what I observed:
>> 1) Patch 1: 0001-Add-transparent-compression-for-temporary-files.patch
>> The first patch fails to apply because of context mismatches.
>>
>> The failures are in the following files:
>> src/backend/storage/file/buffile.c
>> src/backend/utils/misc/guc_tables.c
>> src/backend/utils/misc/postgresql.conf.sample
>>
>> 2) Patch 2: 0002-Add-regression-tests-for-temporary-file-compression.patch
>> The second patch applies successfully without any issues.
>>
>> Does this mean that the implementation patch needs to be rebased or
>> otherwise adjusted for the current codebase? If so, what would be the
>> recommended way to proceed? Could you please suggest how I should apply the
>> implementation patch in this case?
>>
>>
>> Regards,
>> lakshmi
>>
>> On Tue, Jan 13, 2026 at 5:01 PM Filip Janus <fjanus(at)redhat(dot)com> wrote:
>>
>>> Rebase after changes introduced in guc_tables.c
>>>
>>> -Filip-
>>>
>>>
>>> On Tue, Aug 19, 2025 at 17:48, Filip Janus <fjanus(at)redhat(dot)com> wrote:
>>>
>>>> Fix overlooked compiler warnings
>>>>
>>>> -Filip-
>>>>
>>>>
>>>> On Mon, Aug 18, 2025 at 18:51, Filip Janus <fjanus(at)redhat(dot)com> wrote:
>>>>
>>>>> I rebased the proposal and fixed the issue that was causing those test failures.
>>>>>
>>>>> -Filip-
>>>>>
>>>>>
>>>>> On Tue, Jun 17, 2025 at 16:49, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 2025-04-25 23:54:00 +0200, Filip Janus wrote:
>>>>>> > The latest rebase.
>>>>>>
>>>>>> This often seems to fail during tests:
>>>>>> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5382
>>>>>>
>>>>>> E.g.
>>>>>>
>>>>>> https://api.cirrus-ci.com/v1/artifact/task/4667337632120832/testrun/build-32/testrun/recovery/027_stream_regress/log/regress_log_027_stream_regress
>>>>>>
>>>>>> === dumping /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/regression.diffs ===
>>>>>> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/join_hash_pglz.out /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/join_hash_pglz.out
>>>>>> --- /tmp/cirrus-ci-build/src/test/regress/expected/join_hash_pglz.out 2025-05-26 05:04:40.686524215 +0000
>>>>>> +++ /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/join_hash_pglz.out 2025-05-26 05:15:00.534907680 +0000
>>>>>> @@ -594,11 +594,8 @@
>>>>>> select count(*) from join_foo
>>>>>> left join (select b1.id, b1.t from join_bar b1 join join_bar b2 using (id)) ss
>>>>>> on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;
>>>>>> - count
>>>>>> --------
>>>>>> - 3
>>>>>> -(1 row)
>>>>>> -
>>>>>> +ERROR: could not read from temporary file: read only 8180 of 1572860 bytes
>>>>>> +CONTEXT: parallel worker
>>>>>> select final > 1 as multibatch
>>>>>> from hash_join_batches(
>>>>>> $$
>>>>>> @@ -606,11 +603,7 @@
>>>>>> left join (select b1.id, b1.t from join_bar b1 join join_bar b2 using (id)) ss
>>>>>> on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;
>>>>>> $$);
>>>>>> - multibatch
>>>>>> -------------
>>>>>> - t
>>>>>> -(1 row)
>>>>>> -
>>>>>> +ERROR: current transaction is aborted, commands ignored until end of transaction block
>>>>>> rollback to settings;
>>>>>> -- single-batch with rescan, parallel-oblivious
>>>>>> savepoint settings;
>>>>>>
>>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> Andres
>>>>>>
>>>>>>
>>>>>>
| Attachment | Content-Type | Size |
|---|---|---|
| 0002-Add-regression-tests-for-temporary-file-compression.patch | application/octet-stream | 127.6 KB |
| 0001-Add-transparent-compression-for-temporary-files.patch | application/octet-stream | 18.3 KB |