Re: Index on two columns not used

From: Péter Kovács <peter(dot)kovacs(at)chemaxon(dot)hu>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Arnaud Lesauvage <thewild(at)freesurf(dot)fr>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Index on two columns not used
Date: 2006-10-21 10:22:25
Message-ID: 4539F4E1.7000003@chemaxon.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry for the amateurish question, but what are "heap tuples"?

Also, my understanding is that the following statement applies only for
composite indexes: "PostgreSQL can't use the values stored in the index
to check the join condition". I assume that PostgreSQL will be able to
use single-column-indexes for join conditions. Is this correct?

Thank you,
Peter

Heikki Linnakangas wrote:
> Arnaud Lesauvage wrote:
>> I have two table with a 2-column index on both of them.
>> In the first table, the first colum of the index is the primary key,
>> the second one is an integer field.
>> In the second table, the two columns are the primary key.
>> When I join these two tables, the 2-column index of the first table
>> is not used.
>> Why does the query planner think that this plan is better ?
>>
>> ALTER TABLE geo.subcities_names
>> ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
>> language_id);
>>
>> CREATE INDEX subcities_gid_language_id
>> ON geo.subcities
>> USING btree
>> (gid, official_language_id);
>>
>> EXPLAIN ANALYZE
>> SELECT * FROM geo.subcities sc, geo.subcities_names scn
>> WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
>> scn.language_id;
>
> My theory:
>
> There's no additional restrictions besides the join condition, so the
> system has to scan both tables completely. It chooses to use a full
> index scan instead of a seq scan to be able to do a merge join.
> Because it's going to have to scan the indexes completely anyway, it
> chooses the smallest index which is subcities_pkey.
>
> You'd think that the system could do the merge using just the indexes,
> and only fetch the heap tuples for matches. If that were the case,
> using the 2-column index would indeed be a good idea. However,
> PostgreSQL can't use the values stored in the index to check the join
> condition, so all the heap tuples are fetched anyway. There was just
> recently discussion about this on this list:
> http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Magnus Hagander 2006-10-21 12:40:45 Re: pgBench on Windows
Previous Message Tom Lane 2006-10-21 05:02:06 Re: VACUUM Performance