Re: Compressing temporary files

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: "Chainani, Naresh" <nareshkc(at)amazon(dot)com>
Subject: Re: Compressing temporary files
Date: 2021-10-06 15:02:56
Message-ID: a61ee602-1d1c-4c52-1cff-689bae9046d0@enterprisedb.com
Lists: pgsql-hackers

Hi,

On 9/11/21 2:31 PM, Andrey Borodin wrote:
> Hi hackers!
>
> There's a lot of compression discussions nowadays. And that's cool!
> Recently, in a private discussion, Naresh Chainani shared with me the
> idea of compressing temporary files on disk. And I was thrilled to find
> no existing implementation of this interesting idea.
>
> I've prototyped a Random Access Compressed File for fun[0]. The code is
> a very dirty proof-of-concept. I compress BufFile one block at a time.
> There are directory pages to store information about the size of each
> compressed block. If any byte of the block is changed, the whole block
> is recompressed. Wasted space is never reused. If a compressed block
> ends up larger than BLCKSZ, unknown bad things will happen :)
>

Might be an interesting feature, and the approach seems reasonable too
(of course, it's a PoC, so it has rough edges that'd need to be solved).

Not sure if compressing at the 8kB block granularity is good or bad.
Presumably larger compression blocks would give better compression, but
that's a detail we can investigate later.
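
To make the quoted scheme concrete, here is a rough standalone sketch of
how I read it: one directory entry per block, with each compressed block
simply appended as it is written. The struct name and the choice of zlib
are placeholders for illustration, not necessarily what the PoC at [0]
actually does.

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define BLCKSZ 8192

/* conservative upper bound on compressed size, cf. compressBound() */
#define COMPRESSED_BOUND  (BLCKSZ + BLCKSZ / 1000 + 64)

typedef struct BlockDirEntry
{
    long        offset;     /* where the compressed block starts in the file */
    unsigned    length;     /* compressed length in bytes */
} BlockDirEntry;

/*
 * Compress one BLCKSZ block and append it to the temp file, remembering
 * where it landed in the (hypothetical) block directory.
 */
void
write_block(FILE *fp, const char *block, BlockDirEntry *dir, int blockno)
{
    unsigned char   compressed[COMPRESSED_BOUND];
    uLongf          clen = sizeof(compressed);

    if (compress2(compressed, &clen, (const Bytef *) block, BLCKSZ,
                  Z_DEFAULT_COMPRESSION) != Z_OK)
    {
        fprintf(stderr, "compression failed\n");
        exit(EXIT_FAILURE);
    }

    dir[blockno].offset = ftell(fp);
    dir[blockno].length = (unsigned) clen;

    if (fwrite(compressed, 1, clen, fp) != clen)
    {
        fprintf(stderr, "write failed\n");
        exit(EXIT_FAILURE);
    }
}

Changing a single byte of a block then means recompressing the whole
block and appending it again, which is where the wasted space the PoC
mentions comes from.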

> Here are some of my observations.
>
> 0. The idea seems feasible. The fd.c API used by buffile.c can easily
> be abstracted for compressed temporary files. Seeks are necessary,
> but they are not very frequent. It's easy to make temp file
> compression GUC-controlled.
>

Hmm. How much more expensive are the seeks, actually? If we compress the
files block by block, a seek means decompressing just 8kB of data. Of
course, that's not free, but compared to the benefit of doing less I/O it
may easily be a significant win.
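
For the sake of argument, with the layout sketched above (same made-up
names, zlib again standing in for whatever compressor the PoC uses), a
seek boils down to a directory lookup, one small read and decompression
of a single block:

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define BLCKSZ 8192

/* conservative upper bound on compressed size, cf. compressBound() */
#define COMPRESSED_BOUND  (BLCKSZ + BLCKSZ / 1000 + 64)

typedef struct BlockDirEntry
{
    long        offset;     /* where the compressed block starts in the file */
    unsigned    length;     /* compressed length in bytes */
} BlockDirEntry;

/*
 * "Seek" to block 'blockno': look up its location in the directory, read
 * the compressed bytes and decompress them into 'block' (BLCKSZ bytes).
 */
void
read_block(FILE *fp, const BlockDirEntry *dir, int blockno, char *block)
{
    unsigned char   compressed[COMPRESSED_BOUND];
    uLongf          dlen = BLCKSZ;

    if (fseek(fp, dir[blockno].offset, SEEK_SET) != 0 ||
        fread(compressed, 1, dir[blockno].length, fp) != dir[blockno].length)
    {
        fprintf(stderr, "read failed\n");
        exit(EXIT_FAILURE);
    }

    if (uncompress((Bytef *) block, &dlen, compressed,
                   dir[blockno].length) != Z_OK || dlen != BLCKSZ)
    {
        fprintf(stderr, "decompression failed\n");
        exit(EXIT_FAILURE);
    }
}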

> 1. The temp file footprint can easily be reduced. For example, the query
> create unlogged table y as select random()::text t from
> generate_series(0,9999999) g; uses 140000000 bytes of temp files for
> the toast index build. With the patch this is reduced to 40841704
> bytes (3.42x smaller).
>

That seems a bit optimistic, really. The problem is that while random()
is random, the text value it produces contains only about 10 distinct
characters. That's pretty redundant, and the compression benefits from that.

But then again, data produced by queries (which we may need to sort,
which generates temp files) is probably redundant too.
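
Out of curiosity, it is easy to check how much the digit-only payload
compresses on its own, independent of the tuple structure around it. A
throwaway check like this (zlib again standing in for whatever the PoC
uses) gives a rough idea:

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define BLCKSZ 8192

int
main(void)
{
    char            block[BLCKSZ];
    unsigned char   compressed[BLCKSZ + BLCKSZ / 1000 + 64];
    uLongf          clen = sizeof(compressed);

    /* fill one block with random digits, like the payload of random()::text */
    srand(42);
    for (int i = 0; i < BLCKSZ; i++)
        block[i] = '0' + rand() % 10;

    if (compress2(compressed, &clen, (const Bytef *) block, BLCKSZ,
                  Z_DEFAULT_COMPRESSION) != Z_OK)
        return EXIT_FAILURE;

    printf("%d raw bytes -> %lu compressed bytes (%.2fx)\n",
           BLCKSZ, (unsigned long) clen, (double) BLCKSZ / (double) clen);
    return 0;
}

Compile with -lz and run; the printed ratio covers only the digit
payload, not the redundancy in the surrounding tuple layout.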

> 2. I have not found any evidence of a performance improvement. I've
> only benchmarked the patch on my laptop, and RAM (the page cache)
> diminished any difference between writing compressed and uncompressed
> blocks.
>

I expect the performance improvement to be less direct, showing up mostly
when there's contention for resources (memory and I/O bandwidth). If you
have multiple sessions and memory pressure, that'll force temporary files
from the page cache to disk. The compression will reduce the memory
pressure (because less data is written to the page cache), possibly even
eliminating the need to write dirty pages to disk. And if we still have to
write data to disk, it reduces the amount we have to write.

Of course, it may also reduce the disk space required for temp files,
which is also nice.

> What do you think: is the idea worth pursuing? OLTP systems rarely
> rely on data spilled to disk. Are there any known good random-access
> compressed file libraries, so we could avoid reinventing the wheel?
> Has anyone tried this approach before?
>

I'd say it's worth investigating further.

Not sure about existing solutions / libraries for this problem, but my
guess is the overall approach is roughly what you implemented.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
