Seqscan slowness and stored procedures

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Seqscan slowness and stored procedures
Date: 2012-05-26 21:38:50
Message-ID: jprihk$ou3$1@dough.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

I have a SQL function (which I've pasted below) and while testing its
code directly (outside a function), this is the "normal", default plan:

http://explain.depesz.com/s/vfP (67 ms)

and this is the plain with enable_seqscan turned off:

http://explain.depesz.com/s/EFP (27 ms)

Disabling seqscan results in almost 2.5x faster execution.

However, when this code is wrapped in a function, the execution time is
closer to the second case (which is great, I'm not complaining):

edem=> explain analyze select * from document_content_top_voted(36);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Function Scan on document_content_top_voted (cost=0.25..10.25
rows=1000 width=188) (actual time=20.644..20.821 rows=167 loops=1)
Total runtime: 21.236 ms
(2 rows)

I assume that the difference between the function execution time and the
direct plan with seqscan disabled is due to SQL parsing and planning.

Since the plan is compiled-in for stored procedures, is the planner in
that case already running under the assumption that seqscans must be
disabled (or something to that effect)?

Would tweaking enable_seqscan and other planner functions during the
CREATE FUNCTION have an effect on the stored plan?

Do the functions need to be re-created when the database is fully
populated, to adjust their stored plans with regards to new selectivity
situation on the indexes?

----

The SQL function is:

-- Retrieves document chunks of a specified document which have the most
votes

DROP FUNCTION IF EXISTS document_content_top_voted(INTEGER);
CREATE OR REPLACE FUNCTION document_content_top_voted(document_id INTEGER)
RETURNS TABLE
(chunk_id INTEGER, seq INTEGER, content TEXT, ctime INTEGER, log
TEXT,
nr_chunk_upvotes INTEGER, nr_chunk_downvotes INTEGER,
nr_seq_changes INTEGER, nr_seq_comments INTEGER,
user_login VARCHAR, user_public_name VARCHAR, user_email VARCHAR)
AS $$
WITH documents_top_chunks AS (
SELECT
(SELECT
chunk_id
FROM
documents_chunks_votes_total
WHERE
documents_id=$1 AND
documents_chunks_votes_total.seq=documents_seqs.seq AND votes=
(SELECT
max(votes)
FROM
documents_chunks_votes_total
WHERE
documents_id=$1 AND
documents_chunks_votes_total.seq=documents_seqs.seq)
ORDER BY
chunk_id DESC
LIMIT 1) AS chunk_id, seq AS doc_seq
FROM
documents_seqs
WHERE
documents_id = $1
ORDER BY seq
) SELECT
chunk_id, doc_seq, content, documents_chunks.ctime,
documents_chunks.log,
COALESCE((SELECT SUM(vote) FROM documents_chunks_votes WHERE
documents_chunks_id=chunk_id AND vote=1)::integer, 0) AS nr_chunk_upvotes,
COALESCE((SELECT SUM(vote) FROM documents_chunks_votes WHERE
documents_chunks_id=chunk_id AND vote=-1)::integer, 0) AS
nr_chunk_downvotes,
(SELECT COUNT(*) FROM documents_chunks WHERE documents_id=$1 AND
seq=doc_seq)::integer AS nr_seq_changes,
(SELECT COUNT(*) FROM documents_seq_comments WHERE
documents_seq_comments.documents_id=$1 AND seq=doc_seq)::integer AS
nr_seq_comments,
users.login, users.public_name, users.email
FROM
documents_chunks
JOIN documents_top_chunks ON documents_chunks.id =
documents_top_chunks.chunk_id
JOIN users ON users.id=creator_uid
ORDER BY doc_seq
$$ LANGUAGE SQL;

(comments on improving the efficiency of the SQL code are also appreciated)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2012-05-27 03:28:32 Re: Seqscan slowness and stored procedures
Previous Message Tatsuo Ishii 2012-05-26 01:18:40 Re: pg_dump and thousands of schemas