From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | buschmann(at)nidsa(dot)net |
Subject: | BUG #16300: Text line order corruption with COPY command |
Date: | 2020-03-12 20:04:54 |
Message-ID: | 16300-b952db3f81f7f40d@postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 16300
Logged by: Hans Buschmann
Email address: buschmann(at)nidsa(dot)net
PostgreSQL version: 12.2
Operating system: Windows Server 2019 64bit
Description:
A reproducible line order corruption occurs when copying a fairly large test
file into Postgres.
I was trying to import and parse a big .xml file (about 41 MB, 643407 lines)
into a simple import table, using the following sequence:
create database x86db template=template0 encoding 'UTF8' lc_collate='C';
\c x86db
create table uops_imp2 (
cline varchar
)
;
copy uops_imp2 from 'N:/downloads/uops_info_instructions_200226.xml';
or
copy uops_imp2 from '/usr/local/hb/uops_info_instructions_200226.xml';
This was tested on different machines, under Windows Server 2019 64bit with
Postgres 12.2 and under Fedora 31 x86-64 with Postgres 12.1:
x86db=# select version ();
version
------------------------------------------------------------
PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
(1 row)
x86db=# select version ();
version
--------------------------------------------------------------------------------------------------------
PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 9.2.1
20190827 (Red Hat 9.2.1-1), 64-bit
(1 row)
The order of the input lines in the original file was verified with
2 different editors under Windows:
notepad++ 7.8.5 x64
notepad (the built-in one), with the status line turned on to show line numbers
Lines 627365 through 627392 are shown here (the correct original):
<doc TP="1.0"/>
</architecture>
</instruction>
<instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
<operand idx="1" name="REG0" type="reg" w="1" width="512"
xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
<operand idx="2" name="REG2" r="1" type="reg" width="512"
xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
<operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
<architecture name="SKX">
<IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
<IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
<measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
<latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
</measurement>
</architecture>
<architecture name="CNL">
<measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
<latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
</measurement>
</architecture>
<architecture name="ICL">
<measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
<latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
</measurement>
<doc TP="1.0"/>
</architecture>
</instruction>
When querying the table with
select * from uops_imp2 offset 627365 limit 27;
I get a different part of the original lines, with another line inserted in
between (marked ###):
x86db=#
x86db=# select * from uops_imp2 offset 627365 limit 27;
cline
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
<latency cycles="5" start_op="4" target_op="1"/>
</measurement>
<doc TP="1.0"/>
</architecture>
</instruction>
<instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
<operand idx="1" name="REG0" type="reg" w="1" width="512"
xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
### <latency cycles="6" start_op="2" target_op="1"/>
<operand idx="2" name="REG2" r="1" type="reg" width="512"
xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
<operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
<architecture name="SKX">
<IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
<IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
<measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
<latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
</measurement>
</architecture>
<architecture name="CNL">
<measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
<latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
</measurement>
</architecture>
<architecture name="ICL">
<measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
<latency cycles="5" start_op="2" target_op="1"/>
(27 rows)
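To see where each returned row physically sits, the same query can also select the system column ctid. This is only a diagnostic sketch against the table created above; it has not been run as part of this report.

```sql
-- ctid is a system column: (block number, tuple index within that block)
select ctid, cline from uops_imp2 offset 627365 limit 27;
```

If the ctid of the row marked ### falls between those of its neighbours, the row was physically stored at that position rather than reordered while reading.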
In all cases I tried, the original order of the lines was not preserved, and
the disorder was the same each time.
The count of all lines seems correct:
x86db=# select count(*) from uops_imp2;
count
--------
643407
(1 row)
The same error occurred when using \copy on the psql client side.
To reproduce, the XML-file is directly downloadable under the following
address:
and choosing the file instructions.xml
I have not analyzed other regions of line order corruption further, because
that is very difficult when you can't rely on Postgres COPY.
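One possible way to survey the other regions is to record the input line number at load time and compare it with the physical row position. This is a sketch only. The table uops_imp2_numbered and the columns line_no and heap_pos are hypothetical names, and it assumes the identity value is assigned once per input line, in file order, within a single COPY.

```sql
-- Hypothetical numbered variant of the import table
create table uops_imp2_numbered (
    line_no bigint generated always as identity,  -- assumed to follow file order
    cline   varchar
);

-- Load only cline; line_no is filled in from the identity sequence
copy uops_imp2_numbered (cline) from '/usr/local/hb/uops_info_instructions_200226.xml';

-- Read a range back in an explicitly requested order
select cline
from uops_imp2_numbered
where line_no between 627365 and 627392
order by line_no;

-- List rows whose physical predecessor is not the preceding input line
select line_no, heap_pos, prev_line_no
from (
    select line_no,
           ctid as heap_pos,
           lag(line_no) over (order by ctid) as prev_line_no
    from uops_imp2_numbered
) s
where line_no <> prev_line_no + 1
order by heap_pos;
```

The last query returns one row for every break in the sequence, so a displaced line shows up together with the line that follows the gap it left.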
I fear similar problems could occur when restoring a pg_dump file, which
also relies on copy commands.
Thanks in advance
Hans Buschmann