Re: Tsvector editing functions

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tsvector editing functions
Date: 2015-11-27 11:38:26
Message-ID: 565840B2.3080100@sigaev.ru
Lists: pgsql-hackers


1 Please make the patch compile against current master. duplicate_oids reports:
cd ../../../src/include/catalog && '/usr/local/bin/perl' ./duplicate_oids
3315
3316

2 lexin = TextDatumGetCString(PG_GETARG_DATUM(1));
lexin_len = strlen(lexin);

Why do you use a C string instead of just text? I suppose this would be much
better:
t = PG_GETARG_TEXT_P(1);
lexin = VARDATA(t);
lexin_len = VARSIZE_ANY_EXHDR(t);

3 Why do you use a linear search in the tsvector instead of a binary search?
It could hurt performance.
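
A minimal standalone sketch of what I mean, not code from the patch. The
Lexeme struct and the names lexeme_cmp() and lexeme_find() are hypothetical
stand-ins for the real tsvector entries, and the sketch assumes the lexemes
are stored sorted bytewise, with the shorter string first when one is a
prefix of the other:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tsvector entry: bytes plus length, no NUL. */
typedef struct
{
    const char *data;
    size_t      len;
} Lexeme;

/* Bytewise comparison; on a common prefix the shorter string sorts first. */
int
lexeme_cmp(const char *a, size_t lena, const char *b, size_t lenb)
{
    size_t  minlen = (lena < lenb) ? lena : lenb;
    int     cmp = (minlen > 0) ? memcmp(a, b, minlen) : 0;

    if (cmp != 0)
        return cmp;
    if (lena == lenb)
        return 0;
    return (lena < lenb) ? -1 : 1;
}

/* Binary search over a sorted array; returns the index found, or -1. */
int
lexeme_find(const Lexeme *lexemes, int n, const char *key, size_t keylen)
{
    int     lo = 0;
    int     hi = n;         /* search the half-open range [lo, hi) */

    assert(n == 0 || lexemes != NULL);

    while (lo < hi)
    {
        int     mid = lo + (hi - lo) / 2;
        int     cmp = lexeme_cmp(lexemes[mid].data, lexemes[mid].len,
                                 key, keylen);

        if (cmp == 0)
            return mid;
        if (cmp < 0)
            lo = mid + 1;   /* key sorts after the middle entry */
        else
            hi = mid;       /* key sorts before the middle entry */
    }
    return -1;
}
```

This takes O(log n) comparisons per lookup instead of O(n), which matters
for documents with many distinct lexemes.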

4 Again, calling BuildTupleFromCStrings() is not optimal: it round-trips every
value through a C string.

5 Printing weights as numbers is inconsistent with how weights are used
elsewhere in FTS. A lexeme's weight is given as one of A, B, C, D, and the
default weight is D.
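
A small standalone sketch of the conversion, not code from the patch. The
function name weight_to_letter() is hypothetical, and it assumes the numeric
weight is stored as 0..3 with 3 meaning A and 0 meaning D, which is my
understanding of how setweight() encodes it:

```c
#include <assert.h>

/*
 * Map a numeric lexeme weight to its letter.
 * Assumed encoding: 3 = 'A', 2 = 'B', 1 = 'C', 0 = 'D' (the default).
 */
char
weight_to_letter(int weight)
{
    assert(weight >= 0 && weight <= 3);
    return (char) ('D' - weight);
}
```

With that, the output would show 'A' where it now shows 3, and 'D' where
it now shows 0.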

Teodor Sigaev wrote:
>>> There is patch that adds some editing routines for tsvector type.
> ...
>> When submitting a patch, it's a good idea to explain why someone would
>> want the feature you are adding. Maybe that's obvious to you, but it
>> isn't clear to me why we'd want this.
>>
>
> Some examples:
> tsvector delete(tsvector, text)
> remove a wrongly indexed word (maybe a stop word)
> text[] to_array(tsvector)
> In my practice, I needed it to work with smlar module.
> tsvector to_tsvector(text[])
> Converts a list of tags to a tsvector, because search in a tsvector is more
> flexible and faster than the array equivalents
> set unnest(tsvector)
> Count some complicated statistics.
>
> Those functions are mostly needed in utility processing rather than in the
> regular workflow.
>
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
