Re: Weird performance issue with custom function with a for loop.

From: Nicos Panayides <nicos(at)magneta(dot)com(dot)cy>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Weird performance issue with custom function with a for loop.
Date: 2011-02-01 10:09:28
Message-ID: 4D47DBD8.6050501@magneta.com.cy
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 01/31/2011 08:48 PM, Tom Lane wrote:
> Nicos Panayides<nicos(at)magneta(dot)com(dot)cy> writes:
>
>> the following function takes forever to execute as is. I see 'Inserting
>> original actions in temporary table' and nothing after that. If i
>> replace orig_user_id in the FOR loop with 1811 (the same orig_user_id
>> passed as the function parameter) it returns immediately correctly (the
>> table has indices so it's very fast).
>>
> It seems likely that you're getting a different plan for the generic
> case because that user id isn't representative of the overall average
> for the column. You could investigate by explaining a parameterized
> query:
>
> PREPARE foo (bigint) AS
> SELECT ... WHERE game_round_actions.user_id = $1 ... ;
> EXPLAIN EXECUTE foo(1811);
>
> (To really see exactly what's happening, you'd probably need to
> parameterize for each of the plpgsql variables used in the query;
> I'm suspicious that the BETWEEN might be contributing to the
> issue as well.)
>
> Possibly increasing the stats target for the user id column would help,
> but it's hard to be sure without knowing what its distribution is like.
>
> regards, tom lane
>
I tried the prepared statement with both $1 and 1811 for user_id and
here's the plans I got:

"Sort (cost=51704688.71..51704689.50 rows=314 width=57)"
" Sort Key: game_round_actions.action_time, game_round_actions.action_id"
" -> Nested Loop (cost=0.00..51704675.69 rows=314 width=57)"
" -> Seq Scan on game_round_actions (cost=0.00..51702078.26
rows=314 width=53)"
" Filter: ((action_time >= $2) AND (action_time <= $3) AND
(sub_action_id = 0) AND (user_id = $1))"
" -> Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
" Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
" Filter: (game_tables.game_type_id <> ANY ($4))"

"Sort (cost=226660.58..226661.33 rows=300 width=57)"
" Sort Key: game_round_actions.action_time, game_round_actions.action_id"
" -> Nested Loop (cost=0.00..226648.24 rows=300 width=57)"
" -> Index Scan using i_session on game_round_actions
(cost=0.00..224166.97 rows=300 width=53)"
" Index Cond: ((action_time >= $2) AND (action_time <= $3))"
" Filter: (user_id = 1811)"
" -> Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
" Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
" Filter: (game_tables.game_type_id <> ANY ($4))"

Here's the table definition:

CREATE TABLE game_round_actions
(
table_id bigint NOT NULL,
round_id integer NOT NULL,
action_id integer NOT NULL,
seat_id integer NOT NULL,
action_desc character varying(20) NOT NULL,
action_area character varying(100),
amount numeric(16,6),
action_value character varying(100),
action_time timestamp without time zone NOT NULL DEFAULT
CURRENT_TIMESTAMP AT TIME ZONE 'UTC',
user_id bigint,
sub_action_id integer NOT NULL, -- Sub action id is 0 for the root
actions. >0 for generated actions.

CONSTRAINT "PK_game_round_actions" PRIMARY KEY (table_id, round_id,
action_id, sub_action_id),
CONSTRAINT fk_game_round_actions_round FOREIGN KEY (table_id, round_id)
REFERENCES game_rounds (table_id, round_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_game_round_actions_user FOREIGN KEY (table_id, user_id)
REFERENCES game_table_users (table_id, user_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);

CREATE INDEX i_session
ON game_round_actions
USING btree
(action_time)
WHERE user_id <> 0 AND sub_action_id = 0;

The table contains 1 655 528 000 rows (estimated) and there are about
10000 unique user_ids. The data spans about 2 years.
Shouldn't postgres realise that in both cases user_id is compared
against a constant value and chose the same plan?

How do I increase the stats target for the column?

--
Regards,

Nicos Panayides
IT Manager

Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Yngve Nysaeter Pettersen 2011-02-01 12:32:39 Select for update with offset interferes with concurrent transactions
Previous Message Pavel Stehule 2011-02-01 07:13:40 Re: Serial Vs Sequence