Index Analysis: Filters

From: Thomas F(dot)O'Connell <tfo(at)sitening(dot)com>
To: "PgSQL - General" <pgsql-general(at)postgresql(dot)org>
Subject: Index Analysis: Filters
Date: 2004-09-29 21:45:47
Message-ID: EC781F6A-1260-11D9-8537-000D93AE0944@sitening.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm interested to know a little bit more about the postgres
implementation of indexes. I'm specifically wondering what it means in
the output of EXPLAIN when a filter is applied.

I'm trying to decide whether it makes sense to use indexes on
expressions rather than relying on a left-anchored LIKE for date
filtering.

Here's what I've got:

WHERE some_date LIKE '<year>-<month>%' *

And what I'm wondering is whether it would be faster to add indexes on
expressions for something like:

WHERE EXTRACT( year from some_date ) = '<year>'
AND EXTRACT( month from some_date ) = '<month>'

In practice, the point in the implementation has other parameters, so
it ends up looking something like:

SELECT COUNT( * )
FROM some_table AS st, some_other_table AS sot
WHERE st.id = '<id>'
AND st.some_date LIKE '<year>-<month>%'
AND st.other_id = sot.other_id

Here's what I get from an EXPLAIN:

QUERY
PLAN
------------------------------------------------------------------------
----------------------------------------------------------------
Aggregate (cost=4749.19..4749.19 rows=1 width=0)
-> Nested Loop (cost=0.00..4749.17 rows=9 width=0)
-> Index Scan using st_id_idx on emma_mailings em
(cost=0.00..147.56 rows=1 width=8)
Index Cond: (emma_account_id = 1::bigint)
Filter: ((some_date)::text ~~ '2004-06%'::text)
-> Index Scan using sot_other_id_idx on some_other_table sot
(cost=0.00..3164.42 rows=114975 width=8)
Index Cond: ("outer".other_id = sot.other_id)

Then I try:

SELECT COUNT( * )
FROM some_table AS st, some_other_table AS sot
WHERE st.id = '<id>'
AND EXTRACT( year FROM st.some_date ) = '<year>'
AND EXTRACT( month FROM st.some_date ) = '<month>'
AND st.other_id = sot.other_id

When I first added indexes on the EXTRACT expressions on
some_table.some_date, I basically (the numbers are fudged because now
the new indexes are working) saw:

QUERY
PLAN
------------------------------------------------------------------------
----------------------------------------------------------------
Aggregate (cost=4749.19..4749.19 rows=1 width=0)
-> Nested Loop (cost=0.00..4749.17 rows=9 width=0)
-> Index Scan using st_id_idx on emma_mailings em
(cost=0.00..147.56 rows=1 width=8)
Index Cond: (emma_account_id = 1::bigint)
Filter: ((date_part('year'::text, emma_mailing_start_ts)
= 2004::double precision) AND (date_part('year'::text,
emma_mailing_start_ts) = 6::double precision))
-> Index Scan using sot_other_id_idx on some_other_table sot
(cost=0.00..3164.42 rows=114975 width=8)
Index Cond: ("outer".other_id = sot.other_id)

Now, though, it seems to be using the expression indexes. I guess my
main question is what it means to apply a Filter to an Index Scan, and
whether the LIKE filter would be slower than the EXTRACT filter. In the
queries I've tested, EXPLAIN ANALYZE resulted in almost identical
runtimes until the indexes on expressions kicked in. Once that
happened, the new indexes were much faster.

A related question might be what might've happened between the times I
tested when the indexes weren't working and when they were. I've been
testing on a low-traffic development server, and I ANALYZED after
adding the indexes. It was in a new session, though, that the indexes
on EXTRACT actually kicked in.

-tfo

* <> = generic pseudocode placeholder for variable/constant data

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2004-09-29 21:46:17 Re: dangling permission on tables after drop user.
Previous Message Alvaro Herrera 2004-09-29 21:35:58 Re: dangling permission on tables after drop user.