Re: Add support for DEFAULT specification in COPY FROM

From: Israel Barth Rubio <barthisrael(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add support for DEFAULT specification in COPY FROM
Date: 2022-08-17 21:12:04
Message-ID: CAO_rXXDLQCeSX3ZWd2iYuA2oDn+3TYcgPfv_coVGAg1rLyn8LQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Andrew,

Thanks for reviewing this patch.

It is worth noting that DEFAULT will only take place if explicitly
specified, meaning there is
no default value for the option DEFAULT. The usage of \D in the tests was
only a suggestion.
Also, NULL marker will be an unquoted empty string by default in CSV mode.

In any case I have manually tested it and the behavior is compliant to what
we see in NULL
if it is defined to use \N both in text and CSV modes.

- NULL as \N:

postgres=# CREATE TEMP TABLE copy_null (id integer primary key, value text);
CREATE TABLE
postgres=# copy copy_null from stdin with (format text, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \N
>> 2 \\N
>> 3 "\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
id | value
----+-------
1 |
2 | \N
3 | "N"
(3 rows)

postgres=# TRUNCATE copy_null ;
TRUNCATE TABLE
postgres=# copy copy_null from stdin with (format csv, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\N
>> 2,\\N
>> 3,"\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
id | value
----+-------
1 |
2 | \\N
3 | \N
(3 rows)

- DEFAULT as \D:

postgres=# CREATE TEMP TABLE copy_default (id integer primary key, value
text default 'test');
CREATE TABLE
postgres=# copy copy_default from stdin with (format text, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \D
>> 2 \\D
>> 3 "\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
id | value
----+-------
1 | test
2 | \D
3 | "D"
(3 rows)

postgres=# TRUNCATE copy_default ;
TRUNCATE TABLE
postgres=# copy copy_default from stdin with (format csv, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
id | value
----+-------
1 | test
2 | \\D
3 | \D
(3 rows)

If you do not specify DEFAULT in COPY FROM, it will have no default value
for
that option. So, if you try to load \D in CSV mode, then it will load the
literal value:

postgres=# CREATE TEMP TABLE copy (id integer primary key, value text
default 'test');
CREATE TABLE
postgres=# copy copy from stdin with (format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy ;
id | value
----+-------
1 | \D
2 | \\D
3 | \D
(3 rows)

Does that address your concerns?

I am attaching the new patch, containing the above test in the regress
suite.

Best regards,
Israel.

Em ter., 16 de ago. de 2022 às 17:27, Andrew Dunstan <andrew(at)dunslane(dot)net>
escreveu:

>
> On 2022-08-16 Tu 14:12, Israel Barth Rubio wrote:
> > Hello all,
> >
> > With the current implementation of COPY FROM in PostgreSQL we are able to
> > load the DEFAULT value/expression of a column if the column is absent
> > in the
> > list of specified columns. We are not able to explicitly ask that
> > PostgreSQL uses
> > the DEFAULT value/expression in a column that is being fetched from
> > the input
> > file, though.
> >
> > This patch adds support for handling DEFAULT values in COPY FROM. It
> > works
> > similarly to NULL in COPY FROM: whenever the marker that was set for
> > DEFAULT
> > value/expression is read from the input stream, it will evaluate the
> > DEFAULT
> > value/expression of the corresponding column.
> >
> > I'm currently working as a support engineer, and both me and some
> > customers had
> > already faced a situation where we missed an implementation like this
> > in COPY
> > FROM, and had to work around that by using an input file where the
> > column which
> > has a DEFAULT value/expression was removed.
> >
> > That does not solve all issues though, as it might be the case that we
> > just want a
> > DEFAULT value to take place if no other value was set for the column
> > in the input
> > file, meaning we would like to have a column in the input file that
> > sometimes assume
> > the DEFAULT value/expression, and sometimes assume an actual given value.
> >
> > The implementation was performed about one month ago and included all
> > regression
> > tests regarding the changes that were introduced. It was just rebased
> > on top of the
> > master branch before submitting this patch, and all tests are still
> > succeeding.
> >
> > The implementation takes advantage of the logic that was already
> > implemented to
> > handle DEFAULT values for missing columns in COPY FROM. I just
> > modified it to
> > make it available the DEFAULT values/expressions for all columns
> > instead of only
> > for the ones that were missing in the specification. I had to change
> > the variables
> > accordingly, so it would index the correct positions in the new array
> > of DEFAULT
> > values/expressions.
> >
> > Besides that, I also copied and pasted most of the checks that are
> > performed for the
> > NULL feature of COPY FROM, as the DEFAULT behaves somehow similarly.
> >
> >
>
>
> Interesting, and probably useful. I've only had a brief look, but it's
> important that the default marker not be quoted in CSV mode (c.f. NULL)
> -f it is it should be taken as a literal rather than a special value.
> Maybe that's taken care of, but there should at least be a test for it,
> which I didn't see.
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

Attachment Content-Type Size
v2-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch application/octet-stream 23.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-08-17 21:53:17 Re: [RFC] building postgres with meson - v11
Previous Message Daniel Gustafsson 2022-08-17 21:04:20 Re: TAP output format in pg_regress