Re: Does anyone know of any issues around ARRAY UNNEST

From: Belinda Cussen <belinda(dot)cussen(at)servian(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Does anyone know of any issues around ARRAY UNNEST
Date: 2011-11-28 00:15:26
Message-ID: CA+QS4W5uTq1KbWOy+V0wmH1508emS32_eKB8Mm0d43xJtcXSWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi there,
I've managed to produce this fault consistently now.

I have a procedure which is passing in an array as a variable - rather than
using an array within a table column as in your example Pavel.
I then update a table based on the ids which are passed in in this array.

I have created a simplified script which can be called concurrently (ie we
have python jobs which call the procedure at the same time - or as in this
test, we can open different command line sessions to call the procedure).
It consistently causes a segfault. It doesn't seem to matter how big the
array is. We have created a core dump and that too points to the array
datatype.

Below is the simplified code:

CREATE TABLE foo_1 (id int primary key,media_uri TEXT);
INSERT INTO foo_1(id) SELECT i FROM generate_series(1,1000000) g(i);

CREATE OR REPLACE FUNCTION bb_crash_db_5 () RETURNS TEXT AS $$
DECLARE
v_activity_id_list INTEGER ARRAY;
BEGIN

SELECT ARRAY(SELECT id FROM foo_1 ORDER BY id LIMIT 100000) INTO
v_activity_id_list;
UPDATE foo_1
SET media_uri = 'a'
WHERE id IN (SELECT activity_id FROM UNNEST (v_activity_id_list)
activity_id)
;
return 'success';

END;
$$ LANGUAGE plpgsql;

----------------------------------------------------------
I then open 2 command lines and run:
select bb_crash_db_5();

----------------------------------------------------------
I think the issue may be to do with trying to update the same row at the
same time. (In the production system we generate the list of ids to update
outside of the procedure and are assuming that they are unique). I would
expect a lock or a deadlock rather than a seg fault though?

We're a bit nervous of arrays now. Are they safe to use as variables?

Below is the (simplified) production procedure:

CREATE OR REPLACE FUNCTION bb_fn (v_activity_source_id_list TEXT ARRAY,
v_large_media_uri_list TEXT ARRAY ) RETURNS TEXT AS $$

DECLARE

v_source_id_count INTEGER;
v_uri_count INTEGER;
v_source_key INTEGER;

BEGIN

-----------------------------------------------------------
-- SET large_media_uri
-- using generate series to match array indeces
-----------------------------------------------------------
UPDATE activity
SET media_uri = a.media_uri
,record_last_updated_tstamp = CURRENT_TIMESTAMP
FROM (SELECT col1.SOURCE_ID[gs.ser] source_id
,col2.MEDIA_URI[gs.ser] media_uri
FROM (SELECT v_activity_source_id_list) As COL1(source_id)
,(select v_large_media_uri_list) AS COL2(media_uri)
,generate_series(1,v_uri_count) AS gs(ser)) a

WHERE activity_source_id_value = a.source_id;

RETURN 'success';

END;
$$ LANGUAGE plpgsql;
--------------------------------------------------------------------------------------------

core dump:
-----------------------------------------------------------

root(at)ip-10-94-47-103:/tmp# gdb /usr/lib/postgresql/9.1/bin/postgres 13295
GNU gdb (GDB) 7.1-ubuntu
Reading symbols from /usr/lib/postgresql/9.1/bin/postgres...Reading symbols
from /usr/lib/debug/usr/lib/postgresql/9.1/bin/postgres...done.
done.
Attaching to program: /usr/lib/postgresql/9.1/bin/postgres, process 13295
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Thread debugging using libthread_db enabled]
0x00007f970f452472 in recv () from /lib/libc.so.6
(gdb) set pagination off
(gdb) set logging file debuglog.txt
(gdb) set logging on
Copying output to debuglog.txt.
(gdb) handle SIGUSR1 nostop
Signal Stop Print Pass to program Description
SIGUSR1 No Yes Yes User defined signal 1
(gdb) handle SIGUSR2 nostop
Signal Stop Print Pass to program Description
SIGUSR2 No Yes Yes User defined signal 2
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
pg_detoast_datum (datum=0xb20) at fmgr.c:2233
2233 fmgr.c: No such file or directory.
in fmgr.c
(gdb) bt
#0 pg_detoast_datum (datum=0xb20) at fmgr.c:2233
#1 0x00007f97113c7269 in EvalPlanQualFetchRowMarks
(epqstate=0x7f9713a0ebd0) at execMain.c:2100
#2 0x00007f97113c76bc in EvalPlanQual (estate=0x7f9713a0e870,
epqstate=0x7f9713a0ebd0, relation=<value optimized out>, rti=1,
tid=0x7fff72e23af0, priorXmax=<value optimized out>) at execMain.c:1721
#3 0x00007f97113e03ca in ExecUpdate (node=0x7f9713a0eb30) at
nodeModifyTable.c:587
#4 ExecModifyTable (node=0x7f9713a0eb30) at nodeModifyTable.c:838
#5 0x00007f97113c98e8 in ExecProcNode (node=0x7f9713a0eb30) at
execProcnode.c:371
#6 0x00007f97113c8712 in ExecutePlan (queryDesc=0x7f97139ea7a0,
direction=24, count=0) at execMain.c:1439
#7 standard_ExecutorRun (queryDesc=0x7f97139ea7a0, direction=24, count=0)
at execMain.c:313
#8 0x00007f97113eae49 in _SPI_pquery (plan=<value optimized out>,
paramLI=<value optimized out>, snapshot=<value optimized out>,
crosscheck_snapshot=<value optimized out>, read_only=0 '\000',
fire_triggers=<value optimized out>, tcount=<value optimized out>) at
spi.c:2110
#9 _SPI_execute_plan (plan=<value optimized out>, paramLI=<value optimized
out>, snapshot=<value optimized out>, crosscheck_snapshot=<value optimized
out>, read_only=0 '\000', fire_triggers=<value optimized out>,
tcount=<value optimized out>) at spi.c:1922
#10 0x00007f97113eb20c in SPI_execute_plan_with_paramlist
(plan=0x7f97139fe030, params=<value optimized out>, read_only=0 '\000',
tcount=<value optimized out>) at spi.c:423
#11 0x00007f94fe41c48d in exec_stmt_execsql (estate=0x7fff72e23fb0,
stmt=0x7f97139e3e08) at pl_exec.c:3036
#12 0x00007f94fe41e75e in exec_stmt (estate=0x7fff72e23fb0, stmts=<value
optimized out>) at pl_exec.c:1342
#13 exec_stmts (estate=0x7fff72e23fb0, stmts=<value optimized out>) at
pl_exec.c:1241
#14 0x00007f94fe41fdff in exec_stmt_block (estate=<value optimized out>,
block=<value optimized out>) at pl_exec.c:1179
#15 0x00007f94fe420972 in plpgsql_exec_function (func=0x7f97139b32c8,
fcinfo=0x7f97139cebb0) at pl_exec.c:319
#16 0x00007f94fe4159e3 in plpgsql_call_handler (fcinfo=<value optimized
out>) at pl_handler.c:122
#17 0x00007f97113cf09e in ExecMakeFunctionResult (fcache=0x7f97139ceb40,
econtext=<value optimized out>, isNull=0x7f97139cf4c8 "", isDone=<value
optimized out>) at execQual.c:1824
#18 0x00007f97113ca33e in ExecTargetList (projInfo=<value optimized out>,
isDone=<value optimized out>) at execQual.c:5104
#19 ExecProject (projInfo=<value optimized out>, isDone=<value optimized
out>) at execQual.c:5319
#20 0x00007f97113e154b in ExecResult (node=0x7f97139ce840) at
nodeResult.c:155
#21 0x00007f97113c98f8 in ExecProcNode (node=0x7f97139ce840) at
execProcnode.c:367
#22 0x00007f97113c8712 in ExecutePlan (queryDesc=0x7f9713990110,
direction=24, count=0) at execMain.c:1439
#23 standard_ExecutorRun (queryDesc=0x7f9713990110, direction=24, count=0)
at execMain.c:313
#24 0x00007f971149d837 in PortalRunSelect (portal=0x7f971398e100,
forward=<value optimized out>, count=0, dest=0x7f97138b3db0) at pquery.c:943
#25 0x00007f971149eca0 in PortalRun (portal=<value optimized out>,
count=<value optimized out>, isTopLevel=<value optimized out>, dest=<value
optimized out>, altdest=<value optimized out>, completionTag=<value
optimized out>) at pquery.c:787
#26 0x00007f971149af23 in exec_simple_query (query_string=<value optimized
out>) at postgres.c:1018
#27 0x00007f971149bff5 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>) at
postgres.c:3926
#28 0x00007f971145ae03 in BackendRun () at postmaster.c:3600
#29 BackendStartup () at postmaster.c:3285
#30 ServerLoop () at postmaster.c:1454
#31 0x00007f971145d87c in PostmasterMain (argc=327879232,
argv=0x7f97138a5360) at postmaster.c:1115
#32 0x00007f97113fc693 in main (argc=7, argv=0x7f97138a5330) at main.c:199

On 28 November 2011 04:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Belinda Cussen <belinda(dot)cussen(at)servian(dot)com(dot)au> writes:
> > The code works ok on my database too when I call the procedure only once
> > sequentially. I hit the segmentation fault consistently when I try to
> call
> > the proc concurrently.
>
> What do you mean by "concurrently"? There is no multithreading within
> Postgres backends, and it's really hard to believe that unnest would be
> affected by what's happening in other server processes.
>
> regards, tom lane
>

--
[image: Servian Logo] *Belinda Cussen* | Servian Pty
Ltd<http://www.servian.com.au/> |
*m:* 0466 309 169 | *t:* 02 9376 0700 | f*:* 02 9376 0730

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2011-11-28 03:30:17 Re: Does anyone know of any issues around ARRAY UNNEST
Previous Message Tom Lane 2011-11-27 17:16:57 Re: Does anyone know of any issues around ARRAY UNNEST