Full text indexing on documents outside the DB

From: "Leo" <fleovey(at)jus(dot)gov(dot)ar>
To: <pgsql-novice(at)postgresql(dot)org>
Subject: Full text indexing on documents outside the DB
Date: 2008-02-14 14:02:23
Message-ID: 000601c86f12$3ac45920$232401c8@leo
Lists: pgsql-novice

Hello, I have installed the new 8.3 Postgres and all works fine for me.
It took some time to reload the old tsearch2 tables, but they are OK now.

My question is: how can I create the tsvector data for large text files WITHOUT loading the text into a column of the DB?
So far I have done it with two tables: a Perl program inserts the text into one, I copy the resulting tsvectors into the other, and then I drop the table holding the text. That works, but it makes adding new data awkward.
The text files are "frozen", so I don't need a trigger to keep the tsvector up to date; all I store is the URL of the text.
What I am looking for is a way to point to a file and create the tsvector from it.
Maybe somebody can help me?
BTW the dictionary does a very nice job ignoring all the HTML junk in the files.
Thanks in advance
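
One possible approach, sketched here in Python rather than the Perl mentioned above: read each file on the client and pass its text as a query parameter to to_tsvector(), so that only the URL and the resulting tsvector are ever stored. The table "documents", its columns "url" and "fts", and the helper name index_file are hypothetical names chosen for illustration; the cursor is assumed to be a DB-API cursor that uses %s placeholders (as psycopg2 does), and 'english' is assumed to be the desired text search configuration.

```python
from pathlib import Path

# Hypothetical table: documents(url text, fts tsvector).
# The file text is only a query parameter; it is never stored in a column.
INSERT_SQL = (
    "INSERT INTO documents (url, fts) "
    "VALUES (%s, to_tsvector(%s::regconfig, %s))"
)


def index_file(cursor, path, url, config="english", encoding="utf-8"):
    """Build the tsvector for one file server-side and store it with its URL.

    `cursor` is any DB-API cursor accepting %s placeholders (an assumption).
    Returns the number of characters sent to the server.
    """
    # Undecodable bytes are replaced instead of aborting the whole load.
    text = Path(path).read_text(encoding=encoding, errors="replace")
    cursor.execute(INSERT_SQL, (url, config, text))
    return len(text)
```

Called once per file inside a transaction, this would replace the two-table copy step. Whether very large files fit within the server's tsvector size limit is something to check against your own data.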
