Re: tsearch2 dictionary that indexes substrings?

From: Tilmann Singer <tils-pgsql(at)tils(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: tsearch2 dictionary that indexes substrings?
Date: 2007-04-23 17:16:59
Message-ID: 20070423171659.GB27485@tils.net
Lists: pgsql-general

* Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> [20070420 11:32]:
> >If I understand it correctly such a dictionary would require to write
> >a custom C component - is that correct? Or could I get away with
> >writing a plpgsql function that does the above and hooking that
> >somehow into the tsearch2 config?
>
> You need to write C-function, see example in
> http://www.sai.msu.su/~megera/postgres/fts/doc/fts-intdict-xmp.html

Thanks.

My colleague, who speaks more C than I do, came up with the code below,
which works fine for us. Will the memory allocated for each lexeme be
freed by the caller?

Til

/*
 * Dictionary for partials of a word, i.e. foo => {f,fo,foo}
 *
 * Based on the tsearch2/gendict/config.sh generator
 *
 * Author: Sean Treadway
 *
 * This code is released under the terms of the PostgreSQL License.
 */
#include "postgres.h"

#include "dict.h"
#include "common.h"

#include "subinclude.h"
#include "ts_locale.h"

#define is_utf8_continuation(c) ((unsigned char)(c) >= 0x80 && (unsigned char)(c) <= 0xBF)

PG_FUNCTION_INFO_V1(dlexize_partial);
Datum dlexize_partial(PG_FUNCTION_ARGS);

Datum
dlexize_partial(PG_FUNCTION_ARGS)
{
    char *in = (char *) PG_GETARG_POINTER(1);

    char *utxt = pnstrdup(in, PG_GETARG_INT32(2)); /* palloc */
    char *txt = lowerstr(utxt); /* palloc */
    int txt_len = strlen(txt);

    int results = 0;
    int i = 0;

    /*
     * May overallocate, that's ok. palloc0 rather than palloc so that the
     * TSLexeme fields other than lexeme start out zeroed.
     */
    TSLexeme *res = palloc0(sizeof(TSLexeme) * (txt_len + 1));

    for (i = 1; i <= txt_len; i++) {
        /*
         * Emit a prefix only where it ends on a character boundary, i.e.
         * where the next byte is not a UTF-8 continuation byte. txt[txt_len]
         * is the terminating NUL, so the whole word is always emitted.
         */
        if (!is_utf8_continuation(txt[i])) {
            res[results++].lexeme = pnstrdup(txt, i);
        }
    }

    /* a NULL lexeme terminates the array */
    res[results].lexeme = NULL;

    pfree(utxt);
    pfree(txt);

    /* Receiver must free res memory and res[].lexeme */
    PG_RETURN_POINTER(res);
}
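
For anyone who wants to try the prefix logic outside the backend, here is a minimal standalone sketch of the same loop. It is an illustration under assumptions, not part of the dictionary: utf8_prefixes and free_prefixes are hypothetical helper names, plain malloc/free stand in for palloc/pfree, and the lowercasing step is left out.

```c
#include <stdlib.h>
#include <string.h>

/* True for UTF-8 continuation bytes (bit pattern 10xxxxxx, 0x80..0xBF). */
static int is_continuation(char c)
{
    return ((unsigned char) c & 0xC0) == 0x80;
}

/* Hypothetical helper: frees a NULL-terminated array from utf8_prefixes. */
static void free_prefixes(char **prefixes)
{
    size_t i;

    if (prefixes == NULL)
        return;
    for (i = 0; prefixes[i] != NULL; i++)
        free(prefixes[i]);
    free(prefixes);
}

/*
 * Hypothetical helper, not part of tsearch2: returns a NULL-terminated,
 * malloc'd array holding every prefix of word that ends on a character
 * boundary, shortest first. Returns NULL if an allocation fails.
 */
static char **utf8_prefixes(const char *word)
{
    size_t len = strlen(word);
    size_t n = 0;
    size_t i;
    /* at most len prefixes plus the terminator; may overallocate */
    char **out = malloc((len + 1) * sizeof(char *));

    if (out == NULL)
        return NULL;

    for (i = 1; i <= len; i++) {
        /* word[len] is the terminating NUL, so the full word always qualifies */
        if (is_continuation(word[i]))
            continue;

        out[n] = malloc(i + 1);
        if (out[n] == NULL) {
            free_prefixes(out);     /* out[n] is NULL, so the array is terminated */
            return NULL;
        }
        memcpy(out[n], word, i);
        out[n][i] = '\0';
        n++;
    }

    out[n] = NULL;
    return out;
}
```

Calling utf8_prefixes("foo") yields f, fo and foo. For a word starting with a two-byte character such as "üb", the first entry is the whole "ü" rather than half of it, because the prefix of length 1 would end in the middle of the character and is skipped.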
