Re: [HACKERS] [bug-fix] Cannot select big bytea values (~600MB)

From: Anna Akenteva <a(dot)akenteva(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [bug-fix] Cannot select big bytea values (~600MB)
Date: 2018-02-16 19:00:04
Message-ID: bcaf0824f702f7d7b4a637ee8c36d057@postgrespro.ru
Lists: pgsql-hackers

On 2018-02-16 17:58, Tom Lane wrote:
> Also, I don't entirely see how this fixes your stated goal of being
> able to select a bytea value whose textual representation exceeds 1GB.
It's not necessarily my goal. My goal is to avoid the confusing
situation where you insert a value into a table and then everything
seems to break for no apparent reason, with no information on what to
do next. As I see it, this could be solved by:
a) allowing big bytea values to be inserted while making sure they
don't cause problems later (which I tried to do with my patch)
b) prohibiting inserts of the kind of data that will cause problems
c) informing the user about the issue (by documenting this behaviour
or giving a more informative error message)

So far, the weird behaviour of big bytea values that I see boils down to:

1) We can't SELECT such a value
after INSERTing it, and there's no clear explanation as to why. It does
make sense that we can insert a 900MB value into a table and then can't
select it, since its textual representation takes up more than 1GB.
It's confusing for whoever uses Postgres, though: the limitation
doesn't seem to be documented anywhere (correct me if I'm wrong), and
you don't get any hint like "don't worry, you can still retrieve the
data, but use COPY in binary format for that".
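For what it's worth, the arithmetic behind the failure can be sketched
as follows (this assumes the server's ~1GB single-allocation limit,
MaxAllocSize, and bytea's default hex output format: a "\x" prefix
followed by two hex digits per byte; the helper name is mine, just for
illustration):

```python
# Why a ~600MB bytea value can be INSERTed but not SELECTed:
# its hex textual representation roughly doubles in size and
# overflows the ~1GB palloc limit.

MAX_ALLOC = 2**30 - 1            # MaxAllocSize: largest single allocation


def hex_text_len(n_bytes: int) -> int:
    """Length of bytea's textual (hex) output for n_bytes of data."""
    return 2 + 2 * n_bytes       # len('\\x') prefix + 2 hex chars per byte


value = 600 * 1024**2            # the ~600MB value from the subject line
print(hex_text_len(value))               # 1258291202, well over MAX_ALLOC
print(hex_text_len(value) > MAX_ALLOC)   # True: SELECT fails

# Largest bytea whose hex text still fits under the limit: just shy of 512MB.
print((MAX_ALLOC - 2) // 2)      # 536870910 bytes
```

So anything above roughly 512MB can be stored (the field size limit is
1GB) but can no longer be rendered as text, which is exactly the gap
that confuses users.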

2) We can't use pg_dump
on a database that contains a big bytea value; it just shows the same
error as when we try to select the value. Again, the error message
doesn't explain anything, and I couldn't find the behaviour documented
anywhere. It's strange that Postgres would let me insert a value that
makes pg_dump unusable (although maybe there is a good enough way to
work around it that I'm not aware of).

> The wire protocol can't support that either, and even if it did,
> I wonder how many client programs could cope. Extremely wide tuple
> values create pain points in many places.
I see how it can create a lot of problems. I agree that raising the
maximum length doesn't really seem to be a good solution, and I see now
how hard it would be to implement properly. So far, though, I don't see
any other way to make it work. If it can't be fixed anytime soon, do
you think it would be worth documenting this behaviour?

--
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
