| From: | PG Bug reporting form <noreply(at)postgresql(dot)org> | 
|---|---|
| To: | pgsql-bugs(at)lists(dot)postgresql(dot)org | 
| Cc: | buschmann(at)nidsa(dot)net | 
| Subject: | BUG #16300: Text line order corruption with COPY command | 
| Date: | 2020-03-12 20:04:54 | 
| Message-ID: | 16300-b952db3f81f7f40d@postgresql.org | 
| Lists: | pgsql-bugs | 
The following bug has been logged on the website:
Bug reference:      16300
Logged by:          Hans Buschmann
Email address:      buschmann(at)nidsa(dot)net
PostgreSQL version: 12.2
Operating system:   Windows Server 2019 64bit
Description:        
A reproducible line order corruption occurs when copying a quite large test
file into Postgres.
I was trying to import and parse a big .xml file (about 41 MB, 643407 lines)
into a simple import table using the following sequence:
create database x86db template=template0 encoding 'UTF8' lc_collate='C';
\c x86db
create table uops_imp2 (
cline varchar
)
;
copy uops_imp2 from 'N:/downloads/uops_info_instructions_200226.xml';
or
copy uops_imp2 from '/usr/local/hb/uops_info_instructions_200226.xml';
This was tested on different machines, under Windows Server 2019 64bit with
Postgres 12.2 and under Fedora 31 x86-64 with Postgres 12.1:
x86db=# select version ();
                          version
------------------------------------------------------------
 PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
(1 row)
x86db=# select version ();
                                                version
--------------------------------------------------------------------------------------------------------
 PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 9.2.1
20190827 (Red Hat 9.2.1-1), 64-bit
(1 row)
The order of the lines in the original file was verified with two different
editors on Windows:
Notepad++ 7.8.5 x64
Notepad (built in), with the status line turned on to show line numbers
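As an editor-independent cross-check, a short script can print a numbered range of lines straight from the file. The following is a minimal Python sketch, not part of the original report. The helper names `numbered_slice` and `print_file_slice` are hypothetical, and it assumes the file is UTF-8 encoded with one record per line.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple


def numbered_slice(
    lines: Iterable[str], first: int, last: int
) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for the 1-based lines first..last inclusive."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # islice skips the first (first - 1) lines and stops after line `last`.
    window = islice(lines, first - 1, last)
    for number, text in enumerate(window, start=first):
        # Drop the line terminator so the text can be compared with table rows.
        yield number, text.rstrip("\r\n")


def print_file_slice(path: str, first: int, last: int) -> None:
    """Print the requested line range of a text file with line numbers."""
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        for number, text in numbered_slice(handle, first, last):
            print(f"{number}: {text}")
```

A call such as `print_file_slice("N:/downloads/uops_info_instructions_200226.xml", 627365, 627392)` would list the range discussed here without relying on an editor's line counter.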
Lines 627365 through 627392 of the original file (the correct order) are shown here:
        <doc TP="1.0"/>
      </architecture>
    </instruction>
    <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
      <operand idx="1" name="REG0" type="reg" w="1" width="512"
xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
      <operand idx="2" name="REG2" r="1" type="reg" width="512"
xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
      <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
      <architecture name="SKX">
        <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
        <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
        <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
      </architecture>
      <architecture name="CNL">
        <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
      </architecture>
      <architecture name="ICL">
        <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
        <doc TP="1.0"/>
      </architecture>
    </instruction>
When querying the table with
select * from uops_imp2 offset 627365 limit 27;
I get a different part of the original lines, with another line mangled in
between (see ###):
x86db=#
x86db=# select * from uops_imp2 offset 627365 limit 27;
                                                       cline
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           <latency cycles="5" start_op="4" target_op="1"/>
         </measurement>
         <doc TP="1.0"/>
       </architecture>
     </instruction>
     <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
       <operand idx="1" name="REG0" type="reg" w="1" width="512"
xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
###           <latency cycles="6" start_op="2" target_op="1"/>
       <operand idx="2" name="REG2" r="1" type="reg" width="512"
xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
       <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
       <architecture name="SKX">
         <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
         <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
         <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
           <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
         </measurement>
       </architecture>
       <architecture name="CNL">
         <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
           <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
         </measurement>
       </architecture>
       <architecture name="ICL">
         <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
(27 rows)
In all cases I tried, the original order of the lines was not preserved, and
the disorder was the same each time.
The count of all lines seems correct:
x86db=# select count(*) from uops_imp2;
 count
--------
 643407
(1 row)
The same error occurred when using \copy on the psql client side.
To reproduce, download the XML file directly from the following
address:
and choose the file instructions.xml.
I have not analyzed other regions of line order corruption further, because
that is very difficult when you can't rely on Postgres COPY.
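One way to analyze the ordering without depending on the order in which rows come back is to store each line's position explicitly, since SQL does not guarantee any particular row order for a SELECT without ORDER BY. The sketch below (Python, not part of the original report) prefixes each line with its 1-based line number and escapes it for COPY's default text format (tab delimiter, backslash escapes). The function names and the two-column target table mentioned afterwards are assumptions made for illustration.

```python
from typing import Iterable, Iterator


def escape_copy_text(value: str) -> str:
    """Escape one field for COPY's default text format."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace("\t", "\\t")        # tab is the default column delimiter
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def number_lines(lines: Iterable[str]) -> Iterator[str]:
    """Turn each input line into '<lineno><TAB><escaped text><NEWLINE>'."""
    for number, text in enumerate(lines, start=1):
        body = text.rstrip("\r\n")  # remove the original line terminator
        yield f"{number}\t{escape_copy_text(body)}\n"


def write_numbered_copy_file(source_path: str, target_path: str) -> None:
    """Write a numbered, COPY-ready version of a UTF-8 text file."""
    # Assumption: both files are UTF-8; newline="\n" avoids CRLF translation
    # on Windows when writing.
    with open(source_path, encoding="utf-8") as src, \
            open(target_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.writelines(number_lines(src))
```

The resulting file could be loaded with COPY into a hypothetical table such as `uops_imp3 (lineno integer, cline varchar)`; a query with `order by lineno` would then return the lines in file order regardless of where each row was physically stored.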
I fear similar problems could occur when restoring a pg_dump file, which
also relies on COPY commands.
Thanks in advance
Hans Buschmann