Re: Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

From: Denis Papathanasiou <denis(dot)papathanasiou(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?
Date: 2010-08-25 21:02:28
Message-ID: 4C7584E4.6000901@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> we need examples of your explain analyze. I don't want to waste my time
> reading theoretical reasoning :)

Here's an actual 'explain analyze' example:

alerts=> CREATE INDEX node_val_tsv_idx ON node USING
gin(to_tsvector('english', val));
CREATE INDEX
alerts=> explain analyze select item_pk from node where
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited
Partnership');
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on node (cost=204.26..5792.92 rows=4 width=16)
(actual time=2.952..131.868 rows=953 loops=1)
Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
-> Bitmap Index Scan on node_tag_idx (cost=0.00..204.26 rows=3712
width=0) (actual time=1.628..1.628 rows=3631 loops=1)
Index Cond: (tag = 'primaryIssuer.entityType'::text)
Total runtime: 133.345 ms
(6 rows)

alerts=> DROP INDEX node_val_tsv_idx;
DROP INDEX
alerts=> explain analyze select item_pk from node where
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited
Partnership');
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on node (cost=204.26..5792.92 rows=4 width=16)
(actual time=2.938..93.239 rows=953 loops=1)
Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
-> Bitmap Index Scan on node_tag_idx (cost=0.00..204.26 rows=3712
width=0) (actual time=1.614..1.614 rows=3631 loops=1)
Index Cond: (tag = 'primaryIssuer.entityType'::text)
Total runtime: 94.696 ms
(6 rows)

The table this is run against is defined like this:

CREATE TABLE node (
pk uuid primary key,
item_pk uuid not null references item (pk),
tag text not null,
val text
);

In addition to the gin/ts_vector index on node.val shown above, there
are two other explicit indices on this table:

CREATE INDEX node_tag_idx ON node (tag);
CREATE INDEX node_val_idx ON node (val);

The reason for the node_val_idx index is that there will be cases where
the query phrase is known exactly, so the where clause in the select
statement will be just "val = 'Limited Partnership'".

> btw, Be sure you use the same search configuration as in create index or
> index will not be used at all.

Is this indeed the problem here?

The explain output references "val @@ plainto_tsquery()" but as a
filter, whereas the tag portion of the statement mentions node_tag_idx
as the index it used.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Oleg Bartunov 2010-08-25 21:17:34 Re: Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?
Previous Message Bill Christensen 2010-08-25 20:51:16 Re: Problem with dumps