Skip site navigation (1) Skip section navigation (2)

Statistics with temporary tables, optimizer question

From: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
To: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Statistics with temporary tables, optimizer question
Date: 2010-09-23 13:33:28
Message-ID: 4C9B5728.8090207@vmsinfo.com (view raw or flat)
Thread:
Lists: pgsql-novice
I have a table with 45 million rows, partitioned on a date field. Each of
the partitions has a primary key. There are 3 partitions so far, 15
million records each.

I retrieve up to 200k document id's (primary key for each partition) from
the application and put them into a temporary table which I then join to
the partitioned monster and some other tables. I discovered a strange
thing: the optimizer chooses hash join, with a full table scan of all
underlying tables unless I create a primary key on the temporary table,
in which case the appropriate nested loops join is chosen.

What makes optimizer do that and why? I can get by the problem by
disabling hash join in the postgresql.conf but I don't like that
solution. Not even lowering random page cost to the same cost as
sequential page cost helps.

My question is how does the optimizer calculate stats for the temporary
tables? I am probably not expected to do a vacuum analyze on the
temporary table, after finishing the inserts? How exactly does the 
optimizer deal
with the temporary tables?

Postgresql is 8.4.4 on 64 bit Red Hat 5.5

-- 
Mladen Gogala 
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com 


Responses

pgsql-novice by date

Next:From: Josh KupershmidtDate: 2010-09-23 14:26:28
Subject: Re: Statistics with temporary tables, optimizer question
Previous:From: Rob RichardsonDate: 2010-09-23 12:49:08
Subject: Re: Table transfer

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group