Re: Verifying embedded oids in *recv is a bad idea

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: andres(at)anarazel(dot)de
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Verifying embedded oids in *recv is a bad idea
Date: 2016-04-26 04:20:00
Message-ID: 20160426.132000.16866976.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

At Mon, 25 Apr 2016 17:17:13 -0700, Andres Freund <andres(at)anarazel(dot)de> wrote in <20160426001713(dot)hbqdiwvf4mkzkg55(at)alap3(dot)anarazel(dot)de>
> Hi,
>
> for performance reasons it's a good idea to use the binary protocol. But
> doing so between two postgres installations is made unnecessarily hard
> by the choice of embedding and verifying oids in composite and array
> types. When using extensions, even commonly used ones like hstore, the
> datatype oids will often not be the same between systems, even when
> using exactly the same version on the same OS.
>
> Specifically I'm talking about
>
> Datum
> array_recv(PG_FUNCTION_ARGS)
> {
> ...
> element_type = pq_getmsgint(buf, sizeof(Oid));
> if (element_type != spec_element_type)
> {
> /* XXX Can we allow taking the input element type in any cases? */
> ereport(ERROR,
> (errcode(ERRCODE_DATATYPE_MISMATCH),
> errmsg("wrong element type")));
> }
> ...
> }
>
> and
>
> Datum
> record_recv(PG_FUNCTION_ARGS)
> {
> ...
> /* Process each column */
> for (i = 0; i < ncolumns; i++)
> ...
> /* Verify column datatype */
> coltypoid = pq_getmsgint(buf, sizeof(Oid));
> if (coltypoid != column_type)
> ereport(ERROR,
> (errcode(ERRCODE_DATATYPE_MISMATCH),
> errmsg("wrong data type: %u, expected %u",
> coltypoid, column_type)));
> ...
> }
>
>
> given that we're giving up quite some speed and adding complexity to
> make send/recv functions portable, this seems like a shame.
>
> I'm failing to see what these checks are buying us? I mean the text
> representation of a composite doesn't include type information about
> contained columns either, I don't see why the binary version has to?
>
> I'm basically thinking that we should remove the checks on the receiving
> side, but leave the sending of oids in place for backward compat.

FWIW, I think that typsend/typreceive are intended for the use by
COPY FROM/TO and the doc says that the following.

http://www.postgresql.org/docs/devel/static/sql-copy.html

> Binary Format
>
> The binary format option causes all data to be stored/read as
> binary format rather than as text. It is somewhat faster than the
> text and CSV formats, but a binary-format file is less portable
> across machine architectures and PostgreSQL versions. Also, the
> binary format is very data type specific; for example it will not
> work to output binary data from a smallint column and read it
> into an integer column, even though that would work fine in text
> format.

I haven't considered much about the usefulness of this
restriction, but receiving into an array of a different type
easily leads to a crash.

# However, false matching of type oids also would do so.

I suppose that we should reconsider here if removing the checks.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-04-26 04:21:03 Re: Bogus cleanup code in PostgresNode.pm
Previous Message Robert Haas 2016-04-26 03:52:19 Re: postgres_fdw : Not able to update foreign table referring to a local table's view when use_remote_estimate = true