Re: Replacement for Oracle Text

From: Stephen Davies <sdavies(at)sdc(dot)com(dot)au>
To: Bruce Momjian <bruce(at)momjian(dot)us>, s d <daku(dot)sandor(at)gmail(dot)com>
Cc: Postgresql General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Replacement for Oracle Text
Date: 2016-02-20 00:10:43
Message-ID: 56C7AF03.30701@sdc.com.au
Lists: pgsql-general

On 20/02/16 00:24, Bruce Momjian wrote:
> On Fri, Feb 19, 2016 at 02:49:16PM +0100, s d wrote:
>> On 19 February 2016 at 14:19, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> > Ah, no. That's not possible
>> >
>> >
>> > ...not possible, Yet.
>> >
>> > PostgreSQL grows by adding the features people need, and it's changing
>> > rapidly.
>>
>> I wonder if PL/Perl could be used to extract the words from a PDF
>> document and create a tsvector column from it.
>>
>> I don't know about PL/Perl (I'm pretty sure it could be used for this purpose,
>> though). On the other hand, I've written code for this in Python, which should
>> be easy to adapt for PL/Python if necessary.
>
> Right, so you would write a PL/Perl or PL/Python trigger function that
> would populate the tsvector column on every INSERT or UPDATE.
>
FWIW, I just use pdftotext in my CGI.
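
For anyone wanting to try that route, here is a rough sketch in Python of
what such a step might look like. It is not the code from my CGI. The table
"docs", the columns "id" and "body_tsv", the 'english' configuration and the
%s placeholder style (as used by psycopg2) are all assumptions made up for
illustration. It also assumes the pdftotext binary from poppler or xpdf is
on the PATH.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build the argv for pdftotext; a text-file of '-' sends output to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def pdf_to_text(pdf_path):
    """Run pdftotext on one file and return the extracted text as a str."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")


def tsvector_update(doc_id, text):
    """Return a parameterized UPDATE for the hypothetical 'docs' table.

    Table and column names are illustrative assumptions only.
    """
    sql = "UPDATE docs SET body_tsv = to_tsvector('english', %s) WHERE id = %s"
    return sql, (text, doc_id)
```

The application would call pdf_to_text() when the file is uploaded and pass
the result of tsvector_update() to its database cursor, so the extraction
happens outside the server rather than in a trigger.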

--
=============================================================================
Stephen Davies Consulting P/L Phone: 08-8177 1595
Adelaide, South Australia. Mobile: 040 304 0583
