RE: BLOB / CLOB support in PostgreSQL

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Vladimir Sitnikov' <sitnikov(dot)vladimir(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: BLOB / CLOB support in PostgreSQL
Date: 2020-09-30 02:31:40
Message-ID: TYAPR01MB29903553D7B67376E9DA4EB3FE330@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-jdbc

From: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
> Just in case, I'm PgJDBC committer.

Thank you very much for your great efforts for the wonderful PgJDBC. I saw you active.

# I'd be happy if you send emails in text format so that the reply looks nice. Your email seems to be in HTML.

> and apparently, Andrew is surprised that the database lacks BLOB/CLOB support.

I was a bit surprised too when I first saw Postgres not support blob/clob but bytea, because I had an impression that Postgres is highly SQL standard compliant. I'm for adding blob/clob data types in server.

At the same time, I wonder why Postgres had to add bytea instead of blob. It may be that there are or were some technical issues. They may stand in the way even now.

One thing I can think of is the parameter format (text/binary). libpq's PQexecParams() can specify input format for each parameter, but it can only specify the output format for all returned columns, not for each column. As a consequence, the bytea host variable support added in PG 12 can INSERT 1 GB of binary data, but retrieval of the value fails with an error message like "invalid alloc request." That's because the server allocates twice the size of stored data to convert it into text format, whose size becomes about 2 GB. That exceeds the limit palloc() can allocate.

33.3. Command Execution Functions
https://www.postgresql.org/docs/devel/libpq-exec.html

> The concerns to avoid "Clob maps to text" could be:
> a) Once the behavior is implemented, it is hard to change. That is applications would rely on it (and it becomes a defacto standard), and it would be hard to move to the proper "text with streaming API" datatype.
> b) If we make <clob is text>, then people might start using update/substring APIs (which is the primary motivation for Clob) without realizing there’s full value update behind the scenes. Currently, they can use setString/getString for text, and it is crystal clear that the text is updated fully on every update.

And if we treat clob as a synonym for text (just like the relationship between char and nchar), even when the user writes clob in DDL, pg_dump will output it as text. That makes it a bit harder to use the output for other DBMSs.

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2020-09-30 02:33:21 RE: Disable WAL logging to speed up data loading
Previous Message David G. Johnston 2020-09-30 02:29:25 NOTIFY docs fixup - emit and deliver consistency

Browse pgsql-jdbc by date

  From Date Subject
Next Message Jorge Solorzano 2020-09-30 17:00:46 [pgjdbc/pgjdbc] 2c02d4: feat: add smallserial metadata (#899)
Previous Message Vladimir Sitnikov 2020-09-29 20:48:24 Re: BLOB / CLOB support in PostgreSQL