Re: BUG #15121: Multiple UBSAN errors

From: Martin Liška <marxin(dot)liska(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>
Subject: Re: BUG #15121: Multiple UBSAN errors
Date: 2018-03-19 08:59:01
Message-ID: CAObPJ3MAY9u1had3QSvYO0TG-iVECqDzF04PvMZSHrmxNknZ8Q@mail.gmail.com
Lists: pgsql-bugs

On 19 March 2018 at 01:34, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 03/18/2018 08:59 PM, PG Bug reporting form wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 15121
>> Logged by: Martin Liška
>> Email address: marxin(dot)liska(at)gmail(dot)com
>> PostgreSQL version: 10.3
>> Operating system: Linux
>> Description:
>>
>> Building current trunk with -fsanitize=undefined I see following errors with
>> make check:
>>
>> clog.c:299:3: runtime error: null pointer passed as argument 1, which is
>> declared to never be null
>> #0 0x65c865 in TransactionIdSetPageStatus
>> /home/marxin/Programming/postgres/src/backend/access/transam/clog.c:299
>> #1 0x65c4a5 in TransactionIdSetTreeStatus
>> /home/marxin/Programming/postgres/src/backend/access/transam/clog.c:190
>> #2 0x680830 in TransactionIdCommitTree
>> /home/marxin/Programming/postgres/src/backend/access/transam/transam.c:262
>> #3 0x68d47d in RecordTransactionCommit
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:1290
>> #4 0x68f1fd in CommitTransaction
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2037
>> #5 0x6908cd in CommitTransactionCommand
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2768
>> #6 0x6e297f in BootstrapModeMain
>> /home/marxin/Programming/postgres/src/backend/bootstrap/bootstrap.c:515
>> #7 0x6e275f in AuxiliaryProcessMain
>> /home/marxin/Programming/postgres/src/backend/bootstrap/bootstrap.c:434
>> #8 0xc1964c in main
>> /home/marxin/Programming/postgres/src/backend/main/main.c:220
>> #9 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>> #10 0x4863d9 in _start
>> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
>>
>
> Not sure what this is - the lines don't seem to match to the sources, so
> presumably it's shifted somehow. So hard to say which pointer is it
> complaining about ...

I built the current git master:

commit a4678320471380e5159a8d6e89466d74d6ee1739 (HEAD, origin/master,
origin/HEAD)
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Sun Mar 18 15:10:28 2018 -0400

Doc: note that statement-level view triggers require an INSTEAD OF trigger.

If a view lacks an INSTEAD OF trigger, DML on it can only work by rewriting
the command into a command on the underlying base table(s). Then we will
fire triggers attached to those table(s), not those for the view. This
seems appropriate from a consistency standpoint, but nowhere was the
behavior explicitly documented, so let's do that.

There was some discussion of throwing an error or warning if a statement
trigger is created on a view without creating a row INSTEAD OF trigger.
But a simple implementation of that would result in dump/restore ordering
hazards. Given that it's been like this all along, and we hadn't heard
a complaint till now, a documentation improvement seems sufficient.

Per bug #15106 from Pu Qun. Back-patch to all supported branches.

Discussion:
https://postgr.es/m/152083391168.1215.16892140713507052796@wrigleys.postgresql.org

Note that I use the following git repository:
git remote -v
origin https://github.com/postgres/postgres.git (fetch)

The function involved is memcmp, which /usr/include/string.h declares as:
extern int memcmp (const void *__s1, const void *__s2, size_t __n)
    __attribute__ ((__nothrow__ , __leaf__))
    __attribute__ ((__pure__))
    __attribute__ ((__nonnull__ (1, 2)));
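
As a minimal sketch of how a call to a nonnull-declared function can be guarded: a zero-length comparison can be answered without passing the pointers to memcmp at all. The helper name bytes_equal is hypothetical and not taken from the PostgreSQL sources, and which argument is NULL at clog.c:299 is not established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PostgreSQL tree): compare two byte
 * ranges, treating two zero-length ranges as equal.
 */
static bool
bytes_equal(const void *a, const void *b, size_t n)
{
    /* Nothing to compare, so never hand possibly-NULL pointers to memcmp. */
    if (n == 0)
        return true;

    /* With n > 0 both pointers must be valid. */
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n) == 0;
}
```

The early return keeps the result the same as a zero-length memcmp would give (equal), while staying within the nonnull contract of the declaration.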

>
>> relcache.c:5932:6: runtime error: null pointer passed as argument 1, which
>> is declared to never be null
>> #0 0x140aa86 in write_item
>> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:5932
>> #1 0x140a2e2 in write_relcache_init_file
>> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:5837
>> #2 0x13f7a63 in RelationCacheInitializePhase3
>> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:3887
>> #3 0x14612a5 in InitPostgres
>> /home/marxin/Programming/postgres/src/backend/utils/init/postinit.c:997
>> #4 0x104661a in PostgresMain
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:3777
>> #5 0xc19777 in main
>> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>> #6 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>> #7 0x4863d9 in _start
>> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
>>
>
> This is apparently because we call write_item() like this:
>
> /* next, do the access method specific field */
> write_item(rel->rd_options,
> (rel->rd_options ? VARSIZE(rel->rd_options) : 0),
> fp);
>
> and it then does this:
>
> static void
> write_item(const void *data, Size len, FILE *fp)
> {
> if (fwrite(&len, 1, sizeof(len), fp) != sizeof(len))
> elog(FATAL, "could not write init file");
> if (fwrite(data, 1, len, fp) != len)
> elog(FATAL, "could not write init file");
> }
>
> So the second fwrite call may do "fwrite(NULL,1,0,fp)" i.e. it writes 0
> bytes from NULL pointer. Which I guess should work fine, because it does
> not need to access the pointer at all.
>
> I don't know where does the "declared to never be null" comes from.

It comes from the GCC internal builtin that is used. Maybe we can make
it smarter so that it does not print an error if size == 0. On the other
hand, the compiler can do optimizations based on the nonnull argument,
so I would recommend guarding the call: if (size != 0) fwrite(...).
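
A rough sketch of that guard applied to the write_item() function quoted above. To keep the example self-contained it returns an error code instead of calling elog(FATAL, ...), and the name write_item_guarded is hypothetical; it is not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for write_item(): returns 0 on success and -1 on
 * a short write, where the real function would call elog(FATAL, ...).
 */
static int
write_item_guarded(const void *data, size_t len, FILE *fp)
{
    assert(fp != NULL);

    /* The length word is always written, even when it is zero. */
    if (fwrite(&len, 1, sizeof(len), fp) != sizeof(len))
        return -1;

    /*
     * Skip the payload write when there is nothing to write, so a NULL
     * data pointer never reaches fwrite().
     */
    if (len != 0 && fwrite(data, 1, len, fp) != len)
        return -1;

    return 0;
}
```

The file contents are the same as before for every input, since a zero-length fwrite writes nothing; only the call with a NULL pointer is avoided.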

>
>> pg_crc32c_sse42.c:37:18: runtime error: load of misaligned address
>> 0x7fffffffd484 for type 'const uint64', which requires 8 byte alignment
>> 0x7fffffffd484: note: pointer points here
>> c0 d4 ff ff 01 00 00 00 7f 06 00 00 09 00 00 00 b3 ee bd f7 b3 0a 02 00
>> cf 10 32 01 00 00 00 80
>> ^
>> #0 0x153f045 in pg_comp_crc32c_sse42
>> /home/marxin/Programming/postgres/src/port/pg_crc32c_sse42.c:37
>> #1 0x6ca43d in XLogRecordAssemble
>> /home/marxin/Programming/postgres/src/backend/access/transam/xloginsert.c:780
>> #2 0x6c8d6f in XLogInsert
>> /home/marxin/Programming/postgres/src/backend/access/transam/xloginsert.c:459
>> #3 0x6997bb in XactLogCommitRecord
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:5370
>> #4 0x68d3c0 in RecordTransactionCommit
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:1225
>> #5 0x68f1fd in CommitTransaction
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2037
>> #6 0x6908cd in CommitTransactionCommand
>> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2768
>> #7 0x104442d in finish_xact_command
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:2498
>> #8 0x104052a in exec_simple_query
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:1145
>> #9 0x1046bf1 in PostgresMain
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:4144
>> #10 0xc19777 in main
>> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>> #11 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>> #12 0x4863d9 in _start
>> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
>>
>
> This comes from this call in pg_comp_crc32c_sse42
>
> crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
>
> and it's explained in the comment right above it:
>
> /*
> * Process eight bytes of data at a time.
> *
> * NB: We do unaligned accesses here. The Intel architecture allows
> * that, and performance testing didn't show any performance gain
> * from aligning the begin address.
> */
>
> So, not a bug.

Agree with that!
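
For builds where the sanitizer noise matters, one common way to express an unaligned load without the pointer cast is to memcpy into a local variable. This is only a sketch: load_u64_unaligned is a hypothetical name, nothing like it is claimed to exist in pg_crc32c_sse42.c, and its performance relative to the existing code has not been measured here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read eight bytes from an arbitrarily aligned
 * address. Optimizing compilers typically turn this memcpy into a single
 * load on x86, but that is not guaranteed.
 */
static uint64_t
load_u64_unaligned(const unsigned char *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));   /* byte-wise copy, no alignment requirement */
    return v;
}
```

The value returned has the same byte representation as the eight bytes at p, which is what the cast-and-dereference reads on hardware that permits unaligned access.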

>
>>
>> arrayfuncs.c:3740:17: runtime error: member access within misaligned address
>> 0x0000028b937c for type 'struct ExpandedObjectHeader', which requires 8 byte
>> alignment
>> 0x0000028b937c: note: pointer points here
>> 6f 6f 00 00 80 02 00 00 01 00 00 00 00 00 00 00 19 00 00 00 08 00 00 00
>> 01 00 00 00 40 00 00 00
>> ^
>> #0 0x10d22b0 in array_cmp
>> /home/marxin/Programming/postgres/src/backend/utils/adt/arrayfuncs.c:3740
>> #1 0x10d208a in btarraycmp
>> /home/marxin/Programming/postgres/src/backend/utils/adt/arrayfuncs.c:3724
>> #2 0x14d7fd8 in comparison_shim
>> /home/marxin/Programming/postgres/src/backend/utils/sort/sortsupport.c:53
>> #3 0x8f6bcb in ApplySortComparator
>> ../../../src/include/utils/sortsupport.h:225
>> #4 0x9079c7 in compare_scalars
>> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:2855
>> #5 0x153d1e6 in qsort_arg
>> /home/marxin/Programming/postgres/src/port/qsort_arg.c:140
>> #6 0x904cfa in compute_scalar_stats
>> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:2412
>> #7 0x10ed240 in compute_array_stats
>> /home/marxin/Programming/postgres/src/backend/utils/adt/array_typanalyze.c:250
>> #8 0x8f990f in do_analyze_rel
>> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:579
>> #9 0x8f7c9f in analyze_rel
>> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:310
>> #10 0xa7e1bb in vacuum
>> /home/marxin/Programming/postgres/src/backend/commands/vacuum.c:357
>> #11 0xa7d925 in ExecVacuum
>> /home/marxin/Programming/postgres/src/backend/commands/vacuum.c:141
>> #12 0x104f38e in standard_ProcessUtility
>> /home/marxin/Programming/postgres/src/backend/tcop/utility.c:667
>> #13 0x104e364 in ProcessUtility
>> /home/marxin/Programming/postgres/src/backend/tcop/utility.c:358
>> #14 0x104c6d2 in PortalRunUtility
>> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:1178
>> #15 0x104cca6 in PortalRunMulti
>> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:1324
>> #16 0x104afc0 in PortalRun
>> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:799
>> #17 0x1040463 in exec_simple_query
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:1120
>> #18 0x1046bf1 in PostgresMain
>> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:4144
>> #19 0xc19777 in main
>> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>> #20 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>> #21 0x4863d9 in _start
>> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
>>
>
>
> Again, the line numbers don't really match the code I have, but I guess
> it's the same issue as for pg_comp_crc32c_sse42. This is apparently
> related to array serialization, and I guess we have a compact structure
> (intentionally, to make it smaller), and we accept the unaligned access.

Note that when building PostgreSQL with -O3, I see some array tests failing.

>
>> print.c:916:4: runtime error: null pointer passed as argument 1, which is
>> declared to never be null
>> #0 0x4904da in print_aligned_text
>> /home/marxin/Programming/postgres/src/fe_utils/print.c:916
>> #1 0x4a0ca2 in printTable
>> /home/marxin/Programming/postgres/src/fe_utils/print.c:3235
>> #2 0x4a171f in printQuery
>> /home/marxin/Programming/postgres/src/fe_utils/print.c:3347
>> #3 0x414286 in PrintQueryTuples
>> /home/marxin/Programming/postgres/src/bin/psql/common.c:890
>> #4 0x414d6f in PrintQueryResults
>> /home/marxin/Programming/postgres/src/bin/psql/common.c:1224
>> #5 0x41559d in SendQuery
>> /home/marxin/Programming/postgres/src/bin/psql/common.c:1408
>> #6 0x4356c6 in MainLoop
>> /home/marxin/Programming/postgres/src/bin/psql/mainloop.c:431
>> #7 0x40d248 in process_file
>> /home/marxin/Programming/postgres/src/bin/psql/command.c:3563
>> #8 0x44c8f8 in main
>> /home/marxin/Programming/postgres/src/bin/psql/startup.c:375
>> #9 0x7ffff5feca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>> #10 0x4048f9 in _start
>> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/psql+0x4048f9)
>>
>
> No idea, line numbers shifted again. My guess would be something like
> the fwrite() report, but this time with fputs(). Not sure which of the
> calls, though.

This one is a memset.

I hope you'll be able to locate the errors in the source files.

Martin

>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
