Re: libpq compression

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Denis Smirnov <sd(at)arenadata(dot)io>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: libpq compression
Date: 2020-12-22 18:15:23
Message-ID: d889e3a2-d3f4-3f8e-aa81-c142e32ddf57@enterprisedb.com
Lists: pgsql-hackers

On 12/22/20 6:56 PM, Robert Haas wrote:
> On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
> <usernamedt(at)yandex-team(dot)ru> wrote:
>> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:
>>
>> No compression - 1.2 GiB
>>
>> ZSTD
>> zstd:1 - 1.4 GiB
>> zstd:7 - 4.0 GiB
>> zstd:13 - 17.7 GiB
>> zstd:19 - 56.3 GiB
>> zstd:20 - 109.8 GiB - did not succeed
>> zstd:21, zstd:22 > 140 GiB
>> Postgres process crashes (out of memory)
>
> Good grief. So, suppose we add compression and support zstd. Then, can
> an unprivileged user capable of connecting to the database negotiate
> for zstd level 1 and then choose to actually send data compressed at
> zstd level 22, crashing the server if it doesn't have a crapton of
> memory? Honestly, I wouldn't blame somebody for filing a CVE if we
> allowed that sort of thing to happen. I'm not sure what the solution
> is, but we can't leave a way for a malicious client to consume 140GB
> of memory on the server *per connection*. I assumed decompression
> memory was going to be measured in kB or MB, not GB. Honestly, even at
> say L7, if you've got max_connections=100 and a user who wants to make
> trouble, you have a really big problem.
>
> Perhaps I'm being too pessimistic here, but man that's a lot of memory.
>

Maybe I'm just confused, but my assumption was this means there's a
memory leak somewhere - that we're not resetting/freeing some piece of
memory, or so. Why would zstd need so much memory? It seems like a
pretty serious disadvantage, so how could it become so popular?
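
For scale, here is a rough back-of-the-envelope sketch of the worst case Robert describes, using only the figures Daniil reported upthread. It assumes, pessimistically, that every connection reaches the reported footprint at the same time. That may well not hold, particularly if the numbers turn out to reflect a leak rather than zstd's real requirements. The function names are made up for illustration.

```python
# Memory usage reported upthread (GiB), bidirectional compression.
# These are the measured totals, not an estimate of zstd's own needs.
REPORTED_GIB = {
    "none": 1.2,
    "zstd:1": 1.4,
    "zstd:7": 4.0,
    "zstd:13": 17.7,
    "zstd:19": 56.3,
}


def worst_case_total_gib(level: str, max_connections: int) -> float:
    """Total memory if every connection hit the reported footprint.

    Assumption: the reported figure applies per connection, as in the
    scenario discussed in the quoted message.
    """
    return REPORTED_GIB[level] * max_connections


def overhead_vs_uncompressed_gib(level: str) -> float:
    """Extra memory relative to the uncompressed baseline measurement."""
    return REPORTED_GIB[level] - REPORTED_GIB["none"]


if __name__ == "__main__":
    for level in REPORTED_GIB:
        total = worst_case_total_gib(level, max_connections=100)
        extra = overhead_vs_uncompressed_gib(level)
        print(f"{level:8s} +{extra:5.1f} GiB over baseline, "
              f"{total:7.1f} GiB at max_connections=100")
```

Under that assumption, zstd:7 with max_connections=100 comes to 400 GiB, which is why the numbers look more like a bug than an inherent cost.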

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
