Re: help to identify the reason that extension's C function returns array get segmentation fault

From: 钱新林 <qianxinlin(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: help to identify the reason that extension's C function returns array get segmentation fault
Date: 2017-03-01 02:13:17
Message-ID: CABzpzpfzme=n70E1TdyA=e74wk67MH=pjnbD+0JoHFX+q=TDnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for your clues.

The system I have used to debug the code is x86 64bit based, Ubuntu 1404
and postgres 9.3.13, I have revised the code and it looks like as following:

Datum vquery(PG_FUNCTION_ARGS) {

int array_len = PG_GETARG_INT32(0);
int64 * node_ids;
ArrayType * retarr;
Datum * vals ;

SPI_connect();

//some code to retrieve data from various tables
// node_ids are allocated and filled up

vals = SPI_palloc(array_len * sizeof(Datum));
memset (vals, 0, array_len * sizeof(Datum));

// fill the vals up
for (i = 0 ; i < array_len ; i++)
vals[i] = Int64GetDatum((node_ids[i]));

retarr = construct_array(vals, retcnt, INT8OID, sizeof(int64), true, 'd');

SPI_finish();

PG_RETURN_ARRAYTYPE_P(retarr);
}

It seems to solve the problem, I have tested the code for a while and no
more segmentation faults are reported.

I have built Postgresql with --enable-debug and --enable-cassert, but use
the binary with gdb and get no symbol file loaded. I will take further
researches and use it to facilitate debug. Thanks.

On Tue, Feb 28, 2017 at 12:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> =?UTF-8?B?6ZKx5paw5p6X?= <qianxinlin(at)gmail(dot)com> writes:
> > I have written an extension to manage openstreetmap data. There is a C
> > function to perform spatial top k query on several tables and return an
> > array of int8 type as result. The code skeleton of this function is as
> > follows:
>
> There are a remarkable lot of bugs in this code fragment. Many of them
> would not bite you as long as you are running on 64-bit Intel hardware,
> but that doesn't make them not bugs.
>
> > Datum vquery(PG_FUNCTION_ARGS) {
>
> > int array_len = PG_GETARG_INT32(0);
> > long * node_ids;
>
> > SPI_connect();
>
> > //some code to retrieve data from various tables
> > // node_ids are allocated and filled up
>
> > ArrayType * retarr;
> > Datum * vals ;
>
> > vals = palloc0(array_len * sizeof(long));
>
> Datum is not necessarily the same as "long".
>
> > // fill the vals up
> > for (i = 0 ; i < array_len ; i++)
> > vals[i] = Int64GetDatum((node_ids[i]));
>
> int64 is not necessarily the same as "long", either.
>
> > retarr = construct_array(vals, retcnt, INT8OID, sizeof(long), true, 'i');
>
> Again, INT8 is not the same size as "long", and it's not necessarily
> pass-by-val, and it's *certainly* not integer alignment.
>
> > SPI_finish();
>
> > PG_RETURN_ARRAYTYPE_P(retarr);
>
> But I think what's really biting you, probably, is that construct_array()
> made the array in CurrentMemoryContext which at that point was the SPI
> execution context; which would be deleted by SPI_finish. So you're
> returning a dangling pointer. You need to do something to either copy
> the array value out to the caller's context, or build it there in the
> first place.
>
> BTW, this failure would be a lot less intermittent if you were testing
> in a CLOBBER_FREED_MEMORY build. I would go so far as to say you should
> *never* develop or test C code for the Postgres backend without using
> the --enable-cassert configure option for your build. You're simply
> tossing away a whole lot of debug support if you don't.
>
> regards, tom lane
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-03-01 02:14:04 Re: Creation of "Future" commit fest, named 2017-07
Previous Message Michael Paquier 2017-03-01 02:08:55 Re: Backend crash on non-exclusive backup cancel