Explaining an EXPLAIN.

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Explaining an EXPLAIN.
Date: 2009-06-10 14:43:28
Message-ID: alpine.DEB.2.00.0906101541500.4337@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Alright. I have finally worked out why I was being censored. Majordomo
doesn't like subject lines beginning with the word "help". It by default
sends your message off to the moderators and doesn't tell you. Now follows
my original mail:

Hi. I thought by now I would be fairly good at understanding EXPLAIN
ANALYSE results, but I can't quite figure this one out. Perhaps someone
could help me.

EXPLAIN ANALYSE SELECT *
FROM GeneGoAnnotation a1, GOAnnotation a2, OntologyTermRelations a3
WHERE a1.GoAnnotation = a2.id AND a2.ontologyTermId = a3.OntologyTerm;

QUERY PLAN
---------------------------------------------------------------------------------------
Nested Loop (cost=0.00..673587.67 rows=330437962 width=95)
(actual time=0.056..1924645.797 rows=344491124 loops=1)
-> Merge Join (cost=0.00..28369.58 rows=361427 width=87)
(actual time=0.039..4620.912 rows=361427 loops=1)
Merge Cond: (a1.goannotation = a2.id)
-> Index Scan using genegoannotation__goannotation on genegoannotation a1
(cost=0.00..9710.32 rows=361427 width=8)
(actual time=0.015..840.547 rows=361427 loops=1)
-> Index Scan using goannotation_pkey on goannotation a2
(cost=0.00..13133.12 rows=403323 width=79)
(actual time=0.014..1427.179 rows=403323 loops=1)
-> Index Scan using ontologytermrelations__ontologyterm on ontologytermrelations a3
(cost=0.00..1.20 rows=47 width=8)
(actual time=0.022..1.908 rows=953 loops=361427)
Index Cond: (a3.ontologyterm = a2.ontologytermid)
Total runtime: 2524647.064 ms
(8 rows)

If I look at the actual results of the outer-most join, the nested loop,
then I can take the number rows=344491124 and divide it by loops=361427 to
get rows=953. Clearly this means that on average each index scan on a3
returned 953 rows.

However, if I apply the same logic to the estimated results, it all falls
apart. The total estimated number of rows is remarkably accurate, as is
the estimated number of loops (results from the merge join). However the
average number of rows expected to be returned from the index scan is only
47. I don't know how the planner is getting its accurate final estimate of
rows=330437962, because it is not from multiplying rows=361427 by rows=47.
That would only give 16987069 rows.

Any ideas/explanations?

Matthew

--
To be or not to be -- Shakespeare
To do is to be -- Nietzsche
To be is to do -- Sartre
Do be do be do -- Sinatra

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2009-06-10 14:43:43 EXPLAIN understanding? (restarted from Censorship)
Previous Message Tom Lane 2009-06-10 14:02:59 Re: Censorship