Re: Regression fails on Alpha True64 V5.0 for

From: "Tegge, Bernd" <tegge(at)repas-aeg(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-ports(at)postgresql(dot)org
Subject: Re: Regression fails on Alpha True64 V5.0 for
Date: 2001-11-19 16:23:39
Message-ID: 5.1.0.14.0.20011119142645.02970800@dragon.dr.repas.de
Lists: pgsql-ports

Following up on my own message. The version with full debugging support
works flawlessly. Compiling with partial debug gives:
--------------------------------------
Core file created by program "postgres"
signal Bus error at [interval_accum:1575 +0x8,0x1201d7848]
Source not available
warning: Files compiled -g3: parameter values probably wrong
(dbx) t
> 0 interval_accum(fcinfo = 0x14019b140) ["timestamp.c":1575, 0x1201d7848]
1 advance_transition_function(peraggstate = 0x14019ac60,
newVal = (unallocated - symbol optimized away),
isNull = (unallocated - symbol optimized away))
["nodeAgg.c":283, 0x12010e67c]
2 ExecAgg(node = 0x14019a2d0) ["nodeAgg.c":555, 0x12010eb6c]
3 ExecProcNode(node = 0x14019a2d0, parent = (nil))
["execProcnode.c":347, 0x12010791c]
4 ExecutePlan(estate = 0x14019a970, plan = 0x14019a2d0,
operation = (unallocated - symbol optimized away), numberTuples = 0,
direction = 708, destfunc = 0x14019b1f0)
["execMain.c":976, 0x1201058e4]
5 ExecutorRun(queryDesc = 0x14019b1f0, estate = 0x14019a970,
feature = (unallocated - symbol optimized away), count = 0)
["execMain.c":199, 0x12010491c]
6 ProcessQuery(parsetree = 0x14019a938, plan = 0x14019a970,
dest = (unallocated - symbol optimized away))
["pquery.c":293, 0x120192ba4]
7 pg_exec_query_string(query_string = 0x1401a1060 =
"select avg(f1) from interval_tbl;",
dest = (unallocated - symbol optimized away),
parse_context = 0x140109340)
["postgres.c":782, 0x120190424]
----------------------------
Looking at line 1575 of timestamp.c, I see:
/*
* XXX memcpy, instead of just extracting a pointer, to work around
* buggy array code: it won't ensure proper alignment of Interval
* objects on machines where double requires 8-byte alignment. That
* should be fixed, but in the meantime...
*/
memcpy(&sumX, DatumGetIntervalP(transdatums[0]), sizeof(Interval));
memcpy(&N, DatumGetIntervalP(transdatums[1]), sizeof(Interval));

I have no idea which array code is buggy, but for now it looks like this
workaround does not work with Compaq's compiler at -O4. I will try to find
out which lower optimization level, if any, hides the problem.
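
For illustration, here is a minimal, self-contained sketch of the idea behind
that comment: copying a struct that contains a double out of storage that is
only 4-byte aligned, instead of dereferencing a pointer to it. It is not the
PostgreSQL code. DemoInterval, load_unaligned_interval and
roundtrip_at_offset are made-up names, and the layout of the real Interval
type is only assumed. The source is handed to memcpy as a plain byte pointer,
so the pointer type itself makes no promise about alignment. Whether that
detail matters for Compaq C at -O4 is an untested guess, not something
established in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Interval: a double followed by an int. */
typedef struct
{
    double time;   /* wants 8-byte alignment on Alpha */
    int    month;
} DemoInterval;

/*
 * Copy a DemoInterval out of storage that may be misaligned.
 * The source is treated as raw bytes; the aligned local is what
 * the caller works with afterwards.
 */
static DemoInterval
load_unaligned_interval(const void *src)
{
    DemoInterval result;

    memcpy(&result, (const unsigned char *) src, sizeof(DemoInterval));
    return result;
}

/*
 * Store a value at a chosen byte offset inside an 8-byte-aligned
 * buffer and read it back through load_unaligned_interval().
 * offset must be in the range 0..8; offset 4 gives an address that
 * is 4-byte aligned but not 8-byte aligned.
 */
static DemoInterval
roundtrip_at_offset(size_t offset, double time, int month)
{
    union
    {
        double        force_alignment;
        unsigned char bytes[sizeof(DemoInterval) + 8];
    } storage;
    DemoInterval in;

    in.time = time;
    in.month = month;
    memcpy(storage.bytes + offset, &in, sizeof(DemoInterval));
    return load_unaligned_interval(storage.bytes + offset);
}
```

Calling roundtrip_at_offset(4, 1.5, 7) should hand back the same two field
values without ever loading a double from the misaligned address directly.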

Regards, Bernd
At 14:01 16.11.01 -0500, Tom Lane wrote:
>"Tegge, Bernd" <tegge(at)repas-aeg(dot)de> writes:
> > The interval test fails with the following msg:
> > --- 216,222 ----
> > -- known to change the allowed input syntax for type interval without
> > -- updating pg_aggregate.agginitval
> > select avg(f1) from interval_tbl;
> > ! server closed the connection unexpectedly
>
>This is bad :-(
>
>I tried to reproduce the problem on the Alpha available at
>SourceForge's compile farm. No luck --- regression tests run
>perfectly there, at least with vanilla configuration (I used
>"configure --enable-cassert"). So it doesn't seem hardware-
>specific, but perhaps it depends on the OS.

Did you use gcc or Compaq's Alpha compilers?

warp.dr.repas.de$ uname -a
OSF1 warp.dr.repas.de V5.0 910 alpha
warp.dr.repas.de$ cc -V
cc (cc)
Tru64 UNIX Compiler Driver 5.0
Compaq C V6.1-011 on Digital UNIX V5.0 (Rev. 910)
warp.dr.repas.de$ flex -V
flex version 2.5.4
warp.dr.repas.de$ bison -V
GNU Bison version 1.28

> > No core file and I don't know which of the many error messages in
> > postmaster.log are normal and which are not.
> > I've run the tests with debug enabled, and I see no further output
> > after
> > DEBUG: query: select avg(f1) from interval_tbl;
>
>Is there not even a report of the backend crashing? If the postmaster
>did not log a child-exit message then there's something more than a
>plain old backend crash here.

Interval normally runs in a group of multiple parallel tests. I changed
the setup so that it ran as the only test. The result:
--------------------------------
test interval ...
Unaligned access pid=339539 <postgres> va=0x14019b1bc pc=0x1201d7844
ra=0x1201d77e0 inst=0x8c000000
FAILED
DEBUG: plan: { AGG :startup_cost 22.50 :total_cost 22.50 :rows 1 :width 12
:qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1186
:restypmod -1 :resname avg :reskey 0 :reskeyop 0 :ressortgroupref 0
:resjunk false } :expr { AGGREG :aggname avg :basetype 1186 :aggtype 1186
:target { VAR :varno 0 :varattno 1 :vartype 1186 :vartypmod -1
:varlevelsup 0 :varnoold 1 :varoattno 1} :aggstar false :aggdistinct false }})
:qpqual <> :lefttree { SEQSCAN :startup_cost 0.00 :total_cost 20.00
:rows 1000 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1
:restype 1186 :restypmod -1 :resname <> :reskey 0 :reskeyop 0
:ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 1
:vartype 1186 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1}})
:qpqual <> :lefttree <> :righttree <> :extprm () :locprm () :initplan <>
:nprm 0 :scanrelid 1 } :righttree <> :extprm () :locprm () :initplan <>
:nprm 0 }
DEBUG: ProcessQuery
DEBUG: reaping dead processes
DEBUG: child process (pid 339539) was terminated by signal 10
--------------------------------------
It's been a long time since I have seen unaligned access messages, but
then I haven't done much development on OSF/1 for quite some time.
AFAIR this is only a warning, so the problem may lie elsewhere.
After some RTFM I found the core file and created a backtrace.
No source, unfortunately; I'll have to build with different compiler
options for that.

dbx version 5.0
Type 'help' for help.
Core file created by program "postgres"

signal Bus error at >*[interval_accum, 0x1201d7848] stt $f0, 8(sp)
(dbx) t
> 0 interval_accum(0x0, 0x0, 0x0, 0x100000002, 0x1401c1038) [0x1201d7848]
1 (unknown)() [0x12010e67c]
2 ExecAgg(0x1, 0x0, 0x14019abd8, 0x140175000, 0x1200aae50) [0x12010eb6c]
3 ExecProcNode(0x1200aae50, 0x4a2, 0x1201058e8, 0x14019a970,
0x14019a970) [0x12010791c]
4 (unknown)() [0x1201058e4]
5 ExecutorRun(0x14019a970, 0x2, 0x1, 0x0, 0x0) [0x12010491c]
6 ProcessQuery(0x140081ba8, 0x14019af88, 0x0, 0x140078440, 0x0)
[0x120192ba4]
7 pg_exec_query_string(0x1401a1060, 0x140109340, 0x100000002,
0x1401a1588, 0x1401a1588) [0x120190424]
8 PostgresMain(0x11fffaa08, 0x1400e1ba9, 0x100000005, 0x1400deb20, 0x0)
[0x1201922a4]
9 (unknown)() [0x12016063c]
10 (unknown)() [0x12015fb0c]
11 (unknown)() [0x12015e40c]
12 PostmasterMain(0x0, 0x0, 0x0, 0x0, 0x0) [0x12015e054]
13 main(0x1, 0x0, 0x12940, 0x400000006, 0x12004acc0) [0x12012718c]

> regards, tom lane
