| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org> |
| Subject: | Re: Fix bug of COPY (on_error set_null) |
| Date: | 2026-05-18 22:22:15 |
| Message-ID: | D685C94B-9178-4077-A313-80F125FCE196@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On May 16, 2026, at 13:17, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Fri, May 15, 2026 at 9:13 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>> Hi,
>>
>> I just tested “Add COPY (column list) (on_error set_null) option”. While tracing a normal case, I found a mistake:
>>
>> In BeginCopyFrom(), cstate->domain_with_constraint is allocated using the length of the specified column list, and set using the index in that column list:
>> ```
>> int attr_count = list_length(cstate->attnumlist);
>>
>> /*
>> * When data type conversion fails and ON_ERROR is SET_NULL, we need
>> * ensure that the input column allow null values. ExecConstraints()
>> * will cover most of the cases, but it does not verify domain
>> * constraints. Therefore, for constrained domains, the null value
>> * check must be performed during the initial string-to-datum
>> * conversion (see CopyFromTextLikeOneRow()).
>> */
>> cstate->domain_with_constraint = palloc0_array(bool, attr_count); <== allocate with length of column list from SQL
>>
>> foreach_int(attno, cstate->attnumlist)
>> {
>> int i = foreach_current_index(attno);
>>
>> Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
>>
>> cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid, NULL); <= set with index of column list from SQL
>> }
>> ```
>>
>> However, cstate->domain_with_constraint is read in CopyFromTextLikeOneRow(), where it is accessed using the actual attribute number:
>> ```
>> /* Loop to read the user attributes on the line. */
>> foreach(cur, cstate->attnumlist)
>> {
>> int attnum = lfirst_int(cur);
>> int m = attnum - 1;
>>
>> ...
>> if (!cstate->domain_with_constraint[m] ||
>> ```
>>
>> So, if the column list specified in SQL is shorter than the table’s actual attribute list, this may cause an out-of-bounds read.
>>
> Hi.
>
> This appears to be the same issue as reported here:
> https://postgr.es/m/CAHg+QDdej0c0gWJi2FnbirzhgzyZNPiTwC1P5B_-dSNCzq-91A@mail.gmail.com
Thanks for pointing out that. I will review that patch.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nazir Bilal Yavuz | 2026-05-18 22:27:52 | Re: Heads Up: cirrus-ci is shutting down June 1st |
| Previous Message | Ilia Evdokimov | 2026-05-18 22:13:39 | Re: Extended statistics improvement: multi-column MCV missing values |