From: Filip Janus <fjanus(at)redhat(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: Adding compression of temporary files
Date: 2024-11-28 11:32:15
Message-ID: CAFjYY+J7ZcJAoE+K4AoyuiS_N6w8vVCQ6HQ72mPB=i4C+QbAeQ@mail.gmail.com
Lists: pgsql-hackers
I've added a regression test for LZ4 compression, for the case where the
server is compiled with the "--with-lz4" option.
-Filip-
On Sun, Nov 24, 2024 at 3:53 PM Filip Janus <fjanus(at)redhat(dot)com> wrote:
>
>
> -Filip-
>
>
> On Wed, Nov 20, 2024 at 1:35 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>> Hi,
>>
>> On 11/18/24 22:58, Filip Janus wrote:
>> > ...
>> > Hi all,
>> > PostgreSQL supports data compression nowadays, but compression of
>> > temporary files has not been implemented yet. Huge queries can
>> > produce a significant amount of temporary data that needs to be
>> > stored on disk, causing many expensive I/O operations.
>> > I am attaching a proposed patch that enables temporary file
>> > compression, for hash joins only for now. Initially, I've chosen the
>> > LZ4 compression algorithm. It would probably make better sense to
>> > start with pglz, but I realized that too late.
>> >
>>
>> Thanks for the idea & patch. I agree this might be quite useful for
>> workloads generating a lot of temporary files for stuff like sorts etc.
>> I think it will be interesting to think about the trade-offs, i.e. how
>> to pick the compression level - at some point the compression ratio
>> stops improving while paying more and more CPU time. Not sure what the
>> right choice is, so using the default seems fine.
>>
>> I agree it'd be better to start with pglz, and only then add lz4 etc.
>> Firstly, pglz is simply the built-in compression, supported everywhere.
>> And it's also simpler to implement, I think.
>>
>> > # Future possible improvements
>> > Reducing the number of memory allocations within the dumping and
>> > loading of the buffer. I have two ideas for solving this problem:
>> > either add a buffer into struct BufFile, or provide the buffer as an
>> > argument from the caller. For sequential execution, I would prefer
>> > the second option.
>> >
>>
>> Yes, this would be good. Doing a palloc+pfree for each compression is
>> going to be expensive, especially because these buffers are going to be
>> large - likely larger than 8kB. Which means it's not cached in the
>> memory context, etc.
>>
>> Adding it to the BufFile is not going to fly, because that doubles the
>> amount of memory per file. And we already have major issues with hash
>> joins consuming massive amounts of memory. But at the same time the
>> buffer is only needed during compression, and there's only one at a
>> time. So I agree with passing a single buffer as an argument.
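>>
>> Roughly something like this - a minimal sketch only, assuming the
>> buffile.c context with lz4.h included; the function and struct names
>> (BufFileDumpCompressed, CompressScratch) are invented here, not what
>> the patch actually uses:
>>
>>     /*
>>      * Sketch: the caller owns a single scratch buffer and reuses it for
>>      * every compressed block, instead of doing palloc+pfree per call.
>>      * A real implementation would also have to store the compressed and
>>      * raw lengths so the block can be decompressed later.
>>      */
>>     typedef struct CompressScratch
>>     {
>>         char   *data;           /* scratch space for compressed output */
>>         size_t  size;           /* allocated size of 'data' */
>>     } CompressScratch;
>>
>>     static void
>>     BufFileDumpCompressed(BufFile *file, const void *src, size_t len,
>>                           CompressScratch *scratch)
>>     {
>>         size_t      needed = LZ4_compressBound((int) len);
>>         int         nbytes;
>>
>>         /* grow the scratch buffer only when a larger block shows up */
>>         if (scratch->size < needed)
>>         {
>>             scratch->data = repalloc(scratch->data, needed);
>>             scratch->size = needed;
>>         }
>>
>>         /* compress into the reused buffer, then write it out */
>>         nbytes = LZ4_compress_default((const char *) src, scratch->data,
>>                                       (int) len, (int) needed);
>>         BufFileWrite(file, scratch->data, nbytes);
>>     }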
>>
>> > # Future plan/open questions
>> > In the future, I would like to add support for pglz and zstd.
>> > Further, I plan to extend support for temporary file compression
>> > to sorting, GiST index creation, etc.
>> >
>> > Experimenting with the stream mode of compression algorithms: the
>> > compression ratio of LZ4 in block mode seems satisfactory, but
>> > stream mode could produce a better ratio. However, it would consume
>> > more memory due to the requirement to store the context for LZ4
>> > stream compression.
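>> >
>> > For illustration only (plain LZ4 API, not patch code): block mode
>> > compresses each buffer independently, while stream mode has to keep
>> > an LZ4_stream_t context alive across blocks:
>> >
>> >     #include <lz4.h>
>> >
>> >     /* Block mode: every buffer is compressed on its own, no state kept. */
>> >     static int
>> >     compress_block(const char *src, int srclen, char *dst, int dstcap)
>> >     {
>> >         return LZ4_compress_default(src, dst, srclen, dstcap);
>> >     }
>> >
>> >     /*
>> >      * Stream mode: the LZ4_stream_t context must stay allocated between
>> >      * calls (LZ4_createStream/LZ4_freeStream), and recent input has to
>> >      * remain addressable (or be saved via LZ4_saveDict) so it can act
>> >      * as a dictionary for the next block - that is the extra memory.
>> >      */
>> >     static int
>> >     compress_stream_chunk(LZ4_stream_t *ctx, const char *src, int srclen,
>> >                           char *dst, int dstcap)
>> >     {
>> >         return LZ4_compress_fast_continue(ctx, src, dst, srclen, dstcap, 1);
>> >     }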
>> >
>>
>> One thing I realized is that this only enables temp file compression for
>> a single place - hash join spill files. AFAIK this is because compressed
>> files don't support random access, and the other places might need that.
>>
>> Is that correct? The patch does not explain this anywhere. If that's
>> correct, the patch probably should mention this in a comment for the
>> 'compress' argument added to BufFileCreateTemp(), so that it's clear
>> when it's legal to set compress=true.
>>
>
> I will add the description there.
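> Something along these lines, perhaps (only a sketch of the wording;
> the exact signature and comment in the patch may differ):
>
>     /*
>      * Create a BufFile for a temporary file.
>      *
>      * compress: if true, the file contents are compressed block by block.
>      * Compressed blocks do not support random access, so callers may only
>      * pass compress=true when they write the file sequentially and never
>      * seek within it, other than rewinding to the beginning before reading
>      * it back (which is all that hash join spill files need).
>      */
>     BufFile *
>     BufFileCreateTemp(bool interXact, bool compress);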
>
>
>> Which other places might compress temp files? Surely hash joins are not
>> the only place that could benefit from this, right?
>>
>
> Yes, you are definitely right. I have chosen hash joins as a POC
> because there are no seeks besides seeks to the beginning of the
> buffer. I have focused on hash joins, but there are definitely other
> places where the compression could be used. I want to add support in
> other places in the future.
>
>
>> Another thing is testing. If I run the regression tests, they won't use
>> compression at all, because the GUC is "none" by default, right? But we
>> need some testing, so how would we do that? One option would be to add a
>> regression test that explicitly sets the GUC and does a hash join, but
>> that won't work with lz4 (because that may not be enabled).
>
>
> Right, it's "none" by default. My opinion is that we would like to test
> every supported compression method, so I will try to add an environment
> variable as you recommended.
>
>
>>
>> Another option might be to add a PG_TEST_xxx environment variable that
>> determines compression to use. Something like PG_TEST_USE_UNIX_SOCKETS.
>> But perhaps there's a simpler way.
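>>
>> A rough sketch of that idea (the environment variable and GUC names
>> here are invented, not something that exists today):
>>
>>     #include <stdlib.h>
>>     #include "pqexpbuffer.h"
>>
>>     /*
>>      * Sketch: the test driver could map an environment variable to the
>>      * (hypothetical) temp_file_compression GUC when building the options
>>      * passed to the backend for a regression run.
>>      */
>>     static void
>>     add_temp_compression_option(PQExpBuffer backend_opts)
>>     {
>>         const char *method = getenv("PG_TEST_TEMP_FILE_COMPRESSION");
>>
>>         if (method == NULL || method[0] == '\0')
>>             method = "none";    /* default: compression disabled */
>>
>>         appendPQExpBuffer(backend_opts, " -c temp_file_compression=%s",
>>                           method);
>>     }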
>>
>> > # Benchmark
>> > I prepared three different databases to check my expectations. Each
>> > dataset is described below. My testing demonstrates that the patch
>> > improves the execution time of huge hash joins. Also, the
>> > implementation should not negatively affect the performance of
>> > smaller queries. The amount of space needed for temporary files was
>> > reduced in every execution without a significant impact on execution
>> > time.
>> >
>> > ## Dataset A:
>> > Tables:
>> > table_a(bigint id, text data_text, integer data_number) - 10000000 rows
>> > table_b(bigint id, integer ref_id, numeric data_value, bytea data_blob) - 10000000 rows
>> > Query: SELECT * FROM table_a a JOIN table_b b ON a.id = b.id;
>> >
>> > The tables contain highly compressible data.
>> > The query demonstrated a reduction in temporary file usage from
>> > ~20GB to ~3GB; this reduction also shortened the execution time of
>> > the query by about ~10s.
>> >
>> >
>> > ## Dataset B:
>> > Tables:
>> > table_a(integer id, text data_blob) - 1110000 rows
>> > table_b(integer id, text data_blob) - 10000000 rows
>> > Query: SELECT * FROM table_a a JOIN table_b b ON a.id = b.id;
>> >
>> > The tables contain less compressible data; data_blob was generated
>> > by a pseudo-random generator.
>> > In this case, the data reduction was only ~50%. The execution time
>> > was also reduced only slightly with compression enabled.
>> >
>> > The second scenario demonstrates no overhead in the case of enabled
>> > compression and work_mem extended to avoid temp file usage.
>> >
>> > ## Dataset C:
>> > Tables:
>> > customers(integer, text, text, text, text)
>> > order_items(integer, integer, integer, integer, numeric(10,2))
>> > orders(integer, integer, timestamp, numeric(10,2))
>> > products(integer, text, text, numeric(10,2), integer)
>> >
>> > Query: SELECT p.product_id, p.name, p.price,
>> >        SUM(oi.quantity) AS total_quantity, AVG(oi.price) AS avg_item_price
>> > FROM eshop.products p
>> > JOIN eshop.order_items oi ON p.product_id = oi.product_id
>> > JOIN eshop.orders o ON oi.order_id = o.order_id
>> > WHERE o.order_date > '2020-01-01' AND p.price > 50
>> > GROUP BY p.product_id, p.name, p.price
>> > HAVING SUM(oi.quantity) > 1000
>> > ORDER BY total_quantity DESC LIMIT 100;
>> >
>> > This scenario should demonstrate a more realistic usage of the database.
>> > Enabled compression slightly reduced the temporary file usage, but the
>> > execution time wasn't affected by compression.
>> >
>> >
>> > +---------+-------------+------------+----------------+-----------------+
>> > | Dataset | Compression | temp_bytes | Execution Time | work_mem        |
>> > +---------+-------------+------------+----------------+-----------------+
>> > | A       | Yes         | 3.09 GiB   | 22s 586ms      | work_mem = 4MB  |
>> > |         | No          | 21.89 GiB  | 35s            | work_mem = 4MB  |
>> > +---------+-------------+------------+----------------+-----------------+
>> > | B       | Yes         | 333 MB     | 1815.545 ms    | work_mem = 4MB  |
>> > |         | No          | 146 MB     | 1500.460 ms    | work_mem = 4MB  |
>> > |         | Yes         | 0 MB       | 3262.305 ms    | work_mem = 80MB |
>> > |         | No          | 0 MB       | 3174.725 ms    | work_mem = 80MB |
>> > +---------+-------------+------------+----------------+-----------------+
>> > | C       | Yes         | 40 MB      | 1011.020 ms    | work_mem = 1MB  |
>> > |         | No          | 53 MB      | 1034.142 ms    | work_mem = 1MB  |
>> > +---------+-------------+------------+----------------+-----------------+
>> >
>> >
>>
>> Thanks. I'll try to do some benchmarks on my own.
>>
>> Are these results from a single run, or an average of multiple runs?
>
>
> It is an average of multiple runs.
>
>> Do you maybe have a script to reproduce this, including the data
>> generation?
>
>
> I am attaching my SQL file for database preparation. I also did further
> testing with two other machines (see the attachment huge_tables.rtf).
>
>>
>> Also, can you share some information about the machine used for this? I
>> expect the impact to strongly depend on memory pressure - if the temp
>> file fits into the page cache (and stays there), it may not benefit from
>> the compression, right?
>>
>
> If it fits into the page cache due to compression, I would consider that
> a benefit of compression.
> I performed further testing on machines with different memory sizes.
> Both experiments showed that compression was beneficial for execution
> time. The execution time reduction was more significant on the machine
> that had less memory available.
>
> Tests were performed on:
> MacBook Pro M3, 36GB - macOS
> Virtual machine, ARM64, 10GB / 6 CPUs - Fedora 39
>
>
>>
>> regards
>>
>> --
>> Tomas Vondra
>>
>>
Attachment: 0001-This-commit-adds-support-for-temporary-files-compres-v3.patch (application/octet-stream, 77.0 KB)