Re: Fulltext index

From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Fulltext index
Date: 2008-11-10 11:26:13
Message-ID: 20081110112613.GZ2459@frubble.xen.chris-lamb.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Nov 10, 2008 at 09:14:21AM +0100, Andreas Kraftl wrote:
> Am Sat, 08 Nov 2008 09:44:17 +0100 schrieb Andreas Kraftl:
> > How can I create a full text index over b?
>
> thanks for the answers. But nothing matches my problem.

I'm not sure what's wrong with Oleg's suggestion--he's the guy who wrote
most of the code so his suggestions should be reasonable! I'm just
learning about this stuff myself, so it may be somewhat sub-optimal.
That said, I got things working when doing the following:

CREATE TABLE test (
lang TEXT,
text TEXT
);

INSERT INTO test VALUES
('german', 'hallo welt'),
('english', 'hello world');

CREATE INDEX idx ON test USING gin(tsvector_concat(
to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END),
to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END)));

"text" as a column name gets a bit confusing to read, but I'm trying
to follow your names. Also my version of PG didn't seem to know that
the '||' operator knows how to concat tsvectors, so I had to spell out
tsvector_concat in full. Querying is a bit awkward, but works:

SELECT *
FROM test
WHERE tsvector_concat(
to_tsvector('english', CASE lang WHEN 'english' THEN text ELSE '' END),
to_tsvector('german', CASE lang WHEN 'german' THEN text ELSE '' END))
@@ to_tsquery('english', 'hello');

Putting most of the above into a query would work, as would having PG
automatically maintaining a column of type TSVECTOR.

> I read the manual again and decide me for an other way.
> I change my table that it looks like:
>
> lang | text
> ----------------------
> german | hallo welt
> english | hello world

What types do these columns have? if "lang" is of type REGCONFIG all
works for me:

CREATE TABLE test (
lang REGCONFIG,
text TEXT
);

INSERT INTO test VALUES
('german', 'hallo welt'),
('english', 'hello world');

CREATE INDEX idx ON test USING gin(to_tsvector(lang, text));

SELECT *
FROM test
WHERE to_tsvector(lang, text) @@ to_tsquery('english', 'hello');

This all seems much easier than having "lang" as a TEXT column.

> Now I have no idea. My experience with databases and postgresql are too
> less to decide if this is a bug, or myself is the bug ;-).

I think the awkward thing is that text/strings are visually indistin-
guishable from arbitrary literals in SQL. The 'english' that's going
into the to_tsquery() call above is actually of type REGCONFIG, but it
looks like a string literal. I think that could be why you were getting
confused before.

Hope that all makes sense and helps a bit!

Sam

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andrus 2008-11-10 11:35:35 Optimizing IN queries
Previous Message Christian Schröder 2008-11-10 11:22:47 Database recovery