Term positions in GIN fulltext index

From: Yoann Moreau <yoann(dot)moreau(at)univ-avignon(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Term positions in GIN fulltext index
Date: 2011-11-03 15:52:23
Message-ID: 4EB2B8B7.1060806@univ-avignon.fr
Lists: pgsql-hackers

Hello,
I'm using a GIN index on a text column of a big table. I use it to rank
the rows, but I also need to get the positions of one or more terms in each
document of a subset of documents. I suppose these positions are stored
in the index, since to_tsvector shows them as 'lexeme':positions (e.g. 'get':2,6).

I've searched and asked on the general PostgreSQL mailing list, and as far as
I can tell there is no simple way to get these term positions.

For example, take 2 rows of a 'docs' table with a text column 'text' (indexed with GIN):
'I get lexemes and I get term positions.'
'Did you get the positions ?'

I'd need a function like this:
select id_doc, term_positions(text, 'get') as positions from docs;
 id_doc | positions
--------+-----------
      1 | {2,6}
      2 | {3}
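
To make the expected result concrete, here is a minimal sketch in Python of the semantics I have in mind. It is only an illustration under simplifying assumptions: the tokenizer is a plain regular expression rather than PostgreSQL's text search parser, there is no stemming or stop-word handling, and it does not read the GIN index at all. The `docs` dictionary simply holds the two example rows above.

```python
import re
from typing import List


def term_positions(text: str, term: str) -> List[int]:
    """Return the 1-based word positions of `term` in `text`."""
    # Assumption: a "word" is any run of word characters; punctuation is
    # dropped and does not consume a position. Matching is case-insensitive.
    words = re.findall(r"\w+", text.lower())
    wanted = term.lower()
    return [pos for pos, word in enumerate(words, start=1) if word == wanted]


# The two example rows, keyed by id_doc.
docs = {
    1: "I get lexemes and I get term positions.",
    2: "Did you get the positions ?",
}

for id_doc, text in docs.items():
    print(id_doc, term_positions(text, "get"))
```

What I am looking for is the same result, but obtained from what PostgreSQL has already computed instead of re-parsing the text.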

I'd like to add this function to my database, for experimental purposes.
I had a look at the source code but didn't find any example code that uses the GIN index;
I cannot figure out where the GIN index is read as a tsvector,
or where the '@@' operator gets the matching tsvectors for the terms of the tsquery.

Any pointers on where to start reading would be very welcome :)

Regards,
Yoann Moreau
