ERROR: GIN indexes do not support whole-index scans

From: "Kevin Flanagan" <kevin-f(at)linkprior(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: ERROR: GIN indexes do not support whole-index scans
Date: 2010-05-20 20:30:03
Message-ID: 021d01caf85b$3a567f20$af037d60$@com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Could anyone advise as to how to avoid this error? I'll describe the table
and query below.

The database contains a table 'tinytm_segments', which has two text columns,
'source_text' and 'target_text'. These are used to store sentences and their
translations. The language of the text is specified with typical
two-character identifiers ('en', 'fr' etc.) stored in two further columns,
'source_lang_code' and 'target_lang_code'. Translation in either direction
can be stored, so for a given row, source_text may contain English and
target_text French (with the corresponding values in source_lang_code and
target_lang_code), or the other way round.

The application needs to search for (say) French sentences containing a
given substring and retrieve any English translation found (or whatever
other language combination and direction). To perform better with large
datasets, full text indices are defined, such as these:

-- Index English text

CREATE INDEX tu_target_text_en_idx ON tinytm_segments USING
gin(to_tsvector('english', target_text)) where target_lang_code = 'en';

CREATE INDEX tu_source_text_en_idx ON tinytm_segments USING
gin(to_tsvector('english', source_text)) where source_lang_code = 'en';

-- Index French text

CREATE INDEX tu_source_text_fr_idx ON tinytm_segments USING
gin(to_tsvector('french', source_text)) where source_lang_code = 'fr';

CREATE INDEX tu_target_text_fr_idx ON tinytm_segments USING
gin(to_tsvector('french', target_text)) where target_lang_code = 'fr';

To retrieve (say) sentences that have been translated from French, where the
French contains a given substring, a query like this can then be issued:

SELECT * FROM tinytm_segments WHERE

source_lang_code='fr' AND

to_tsvector('french', source_text) @@ plainto_tsquery('french', 'rien du
tout') AND lower(source_text) LIKE '%rien du tout%'

However, that will return sentences translated into whatever language. The
error occurs when trying to retrieve only sentences translated from French
into English, using a query like this:

SELECT * FROM tinytm_segments WHERE

source_lang_code='fr' AND

to_tsvector('french', source_text) @@ plainto_tsquery('french', 'rien du
tout') AND lower(source_text) LIKE '%rien du tout%'

AND target_lang_code='en'

Why would adding "target_lang_code='en'" cause this error?

Environment: PostgreSQL 8.4 on Windows (installed with one-click installer),
default text search config used.

Thanks in advance for any information.

Kevin.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2010-05-20 20:39:26 Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Previous Message Andres Freund 2010-05-20 20:27:48 [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up