Re: json(b)_to_tsvector with numeric values

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json(b)_to_tsvector with numeric values
Date: 2018-04-06 14:25:17
Message-ID: 3b4bdd17-75d6-1687-bbf8-8a934abc0b25@sigaev.ru
Lists: pgsql-hackers

1) I don't like jsonb_all_to_tsvector either. What if we later want to accept a
new kind of value to index? Let me suggest:

tsvector jsonb_to_tsvector([regconfig,] jsonb, text[])

where the text[] argument is actually a set of flags: the array contains any
combination of the literals 'numeric', 'string', 'boolean' (and even 'key', to
index keys) to indicate which types should be indexed. Moreover, maybe it should
be of jsonb type, to leave room for possible improvements in the future. For
now, it should be a jsonb array with string elements as described above, for
example:

select jsonb_to_tsvector('{"a": "aaa in bbb ddd ccc", "b":123}',
'["numeric", "boolean"]');

The form jsonb_to_tsvector('...', '["string"]') is effectively the same as the
current to_tsvector(jsonb).
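
To make the intended meaning of the flags concrete, here is a rough model in
Python (not PostgreSQL code). It only shows which JSON values each flag would
select, before any parsing into lexemes or stemming. The names select_values
and ALLOWED_FLAGS are hypothetical and exist only for this illustration; the
flag names are the ones suggested above.

```python
import json

# Hypothetical helper: models which JSON values the proposed flags select.
# It does not tokenize, stem, or assign positions like a real tsvector.
ALLOWED_FLAGS = {"string", "numeric", "boolean", "key"}


def select_values(document, flags):
    """Return the selected values as text, in document order."""
    flag_set = set(json.loads(flags))
    unknown = flag_set - ALLOWED_FLAGS
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")

    selected = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if "key" in flag_set:
                    selected.append(key)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        # bool is checked before numbers because bool is a subclass of int
        elif isinstance(node, bool):
            if "boolean" in flag_set:
                selected.append("true" if node else "false")
        elif isinstance(node, (int, float)):
            if "numeric" in flag_set:
                selected.append(str(node))
        elif isinstance(node, str):
            if "string" in flag_set:
                selected.append(node)
        # JSON null (None) is never selected in this model

    walk(json.loads(document))
    return selected


doc = '{"a": "aaa in bbb ddd ccc", "b":123}'
print(select_values(doc, '["numeric", "boolean"]'))
print(select_values(doc, '["string"]'))
```

With '["string"]' only the string value is selected, which is the sense in which
that form would match the current to_tsvector(jsonb).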

2) Now the regression test fails, and I see something strange in the resulting
tsvector: 'true':9,13 and 'fals':9,13. I don't see any boolean values in the
input json.

% more /home/teodor/pgsql/src/test/regress/regression.diffs
*** /home/teodor/pgsql/src/test/regress/expected/jsonb.out 2018-04-06 16:34:59.424481000 +0300
--- /home/teodor/pgsql/src/test/regress/results/jsonb.out 2018-04-06 16:36:48.095411000 +0300
***************
*** 4132,4138 ****
select jsonb_all_to_tsvector('english', '{"a": "aaa in bbb ddd ccc", "b": 123, "c": 456}'::jsonb);
jsonb_all_to_tsvector
--------------------------------------------------------------
! '123':7 '456':11 'aaa':1 'bbb':3 'ccc':5 'ddd':4 'true':9,13
(1 row)

-- ts_vector corner cases
--- 4132,4138 ----
select jsonb_all_to_tsvector('english', '{"a": "aaa in bbb ddd ccc", "b": 123, "c": 456}'::jsonb);
jsonb_all_to_tsvector
--------------------------------------------------------------
! '123':7 '456':11 'aaa':1 'bbb':3 'ccc':5 'ddd':4 'fals':9,13
(1 row)

-- ts_vector corner cases

Dmitry Dolgov wrote:
>> On 4 April 2018 at 16:09, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>>
>>>> Hm, seems, it's useful feature, but I suggest to make separate function
>>>> jsonb_any_to_tsvector and add support for boolean too (if you know better
>>>> name for function, do not hide it). Changing behavior of existing
>>>> function
>>>> is not obvious for users and, seems, should not backpatched.
>>>
>>>
>>> What do you think about having not a separate function, but a flag
>>> argument to
>>> the existing one (like `create` in `jsonb_set`), that will have false as
>>> default value? The result would be the same, but without an extra function
>>> with
>>> almost the same implementation.
>>
>>
>> tsvector jsonb_to_tsvector(jsonb[, bool]) ?
>> Agreed. Second arg should be optional.
>
> Unfortunately, this idea with a flag argument can't be implemented easily
> (related discussion is here [1]). So I've modified the patch accordingly to
> your original suggestion about having separate functions
> `json(b)_all_to_tsvector`.
>
> 1: https://www.postgresql.org/message-id/flat/CA%2Bq6zcVJ%2BWx%2B-%3DkkN5UC0T-LtsJWnx0g9S0xSnn3jUWkriufDA%40mail.gmail.com
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
