Re: [PATCH] Opclass parameters

From: Nikolay Shaplov <dhyan(at)nataraj(dot)su>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, obartunov(at)gmail(dot)com
Subject: Re: [PATCH] Opclass parameters
Date: 2018-11-20 11:15:45
Message-ID: 12473406.MrkCg5crFA@x200m
Lists: pgsql-hackers

On 15 November 2018 at 18:26:43, Nikita Glukhov wrote:

> Attached 2nd version of the patches. Nothing has changed since March,
> this is just a rebased version.
>
> CREATE INDEX syntax and parameters storage method still need discussion.
I've played around a bit with your patch and come to some conclusions that I'd
like to share. They are almost the same as before, but now there are more
details.

Again, some issues with storing opclass options in pg_index:

1. Having both an indoption and an indoptions column in pg_index will make
someone's brain explode for sure. If not, it will cause trouble when people
start confusing them.

2. Now I have found out how you store option values for each opclass: the
text[] of indoptions in pg_index is not the same as the text[] of reloptions
in pg_class (and that brings more confusion). In reloptions each member of the
array is a single option.

reloptions | {fillfactor=90,autovacuum_enabled=false}

In indoptions, each array item is a whole string holding all options for one
indexed attribute. This string needs further parsing, which differs from
reloptions parsing.

indoptions | {"{numranges=150}","{numranges=160}"}
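To illustrate the difference, here is a minimal standalone sketch (not code
from the patch or from PostgreSQL). OptPair, parse_single_option and
parse_attribute_options are hypothetical names invented for this example. A
reloptions-style item is parsed in one step, while an indoptions-style item
first needs its braces stripped and its content split on commas.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 64

/* Hypothetical holder for one parsed "name=value" pair. */
typedef struct
{
    char        name[MAX_LEN];
    char        value[MAX_LEN];
} OptPair;

/*
 * reloptions style: the array item is already one "name=value" option.
 * Returns 1 on success, 0 if the item is malformed or too long.
 */
int
parse_single_option(const char *item, size_t len, OptPair *out)
{
    const char *eq = memchr(item, '=', len);
    size_t      name_len;
    size_t      value_len;

    if (eq == NULL)
        return 0;
    name_len = (size_t) (eq - item);
    value_len = len - name_len - 1;
    if (name_len == 0 || name_len >= MAX_LEN || value_len >= MAX_LEN)
        return 0;
    memcpy(out->name, item, name_len);
    out->name[name_len] = '\0';
    memcpy(out->value, eq + 1, value_len);
    out->value[value_len] = '\0';
    return 1;
}

/*
 * indoptions style: the array item is "{name=value,...}" for one indexed
 * attribute, so a second pass is needed: strip the braces, split on commas,
 * then parse each piece as a single option.
 * Returns the number of options parsed, or -1 on error.
 */
int
parse_attribute_options(const char *item, OptPair *out, int max_out)
{
    size_t      len = strlen(item);
    const char *p;
    const char *end;
    int         n = 0;

    assert(out != NULL && max_out > 0);
    if (len < 2 || item[0] != '{' || item[len - 1] != '}')
        return -1;
    p = item + 1;               /* first character after '{' */
    end = item + len - 1;       /* points at the closing '}' */
    while (p < end)
    {
        const char *comma = memchr(p, ',', (size_t) (end - p));
        const char *stop = comma ? comma : end;

        if (n >= max_out ||
            !parse_single_option(p, (size_t) (stop - p), &out[n]))
            return -1;
        n++;
        p = comma ? comma + 1 : end;
    }
    return n;
}
```

With these helpers, "fillfactor=90" goes straight through
parse_single_option, while "{numranges=150}" has to go through
parse_attribute_options first. That extra layer is the second code path
discussed in 2b.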

This brings us to the following issues:

2a. pg_index stores properties of the index in general. Properties of each
indexed attribute are stored in the pg_attribute table. If we follow this
philosophy, it is wrong to add any kind of per-attribute array values to
pg_index. These values should be added to pg_attribute, one per pg_attribute
entry.

2b. Since you've chosen a storage method that differs from the one used in
reloptions, it will lead to two versions of the code that processes the
attributes. From now on, if we accept this, we would have to support both of
them and keep them in sync. (I see that you tried to reuse as much code as
possible, but you still added some more that duplicates current reloptions
functionality.)

I know that the reloptions code is not really suitable for reuse. This was the
reason why I started solving the opclass options task by rewriting the
reloptions code to make it 100% reusable for any kind of options. So I would
again offer you to join me as a reviewer of that code. The opclass code will
be simpler and more sensible if my options code is used...

3. Speaking of sensible code

Datum
g_int_options(PG_FUNCTION_ARGS)
{
    Datum       raw_options = PG_GETARG_DATUM(0);
    bool        validate = PG_GETARG_BOOL(1);
    relopt_int  siglen =
        { {"numranges", "number of ranges for compression", 0, 0, 9,
            RELOPT_TYPE_INT },
            G_INT_NUMRANGES_DEFAULT, 1, G_INT_NUMRANGES_MAX };
    relopt_gen *optgen[] = { &siglen.gen };
    int         offsets[] = { offsetof(IntArrayOptions, num_ranges) };
    IntArrayOptions *options =
        parseAndFillLocalRelOptions(raw_options, optgen, offsets, 1,
                                    sizeof(IntArrayOptions), validate);

    PG_RETURN_POINTER(options);
}

It does not seem to be a very nice hack.
What would you do if you wanted int, real and boolean options all in one
opclass? I do not know how to implement that using this code.
We have only int opclass options for now, but more will come and we should be
ready for them.
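To make the concern concrete, here is a minimal standalone sketch of a
type-tagged option table that handles int, real and boolean options through
one code path. It is only an illustration: it is neither the patch's approach
nor the existing reloptions API. All Demo* names are invented, and the "ratio"
and "lossy" options are hypothetical. Only "numranges", with the 1 and 1000
limits mentioned in 4b, comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum { DEMO_OPT_BOOL, DEMO_OPT_INT, DEMO_OPT_REAL } DemoOptType;

/* One table entry describes one option, whatever its type. */
typedef struct
{
    const char *name;
    DemoOptType type;
    size_t      offset;         /* where the value lives in DemoOptions */
    double      min;            /* ignored for booleans */
    double      max;            /* ignored for booleans */
} DemoOptDef;

typedef struct
{
    int         num_ranges;
    double      ratio;          /* hypothetical real option */
    bool        lossy;          /* hypothetical boolean option */
} DemoOptions;

static const DemoOptDef demo_defs[] = {
    {"numranges", DEMO_OPT_INT, offsetof(DemoOptions, num_ranges), 1, 1000},
    {"ratio", DEMO_OPT_REAL, offsetof(DemoOptions, ratio), 0.0, 1.0},
    {"lossy", DEMO_OPT_BOOL, offsetof(DemoOptions, lossy), 0, 0},
};

/*
 * Look up "name" in the table, validate "value" according to the option's
 * type and range, and store it into *opts.
 * Returns 1 on success, 0 on unknown name or invalid value.
 */
int
demo_set_option(DemoOptions *opts, const char *name, const char *value)
{
    size_t      ndefs = sizeof(demo_defs) / sizeof(demo_defs[0]);
    size_t      i;

    assert(opts != NULL && name != NULL && value != NULL);
    for (i = 0; i < ndefs; i++)
    {
        const DemoOptDef *def = &demo_defs[i];
        char       *dst = (char *) opts + def->offset;
        char       *end;

        if (strcmp(def->name, name) != 0)
            continue;

        switch (def->type)
        {
            case DEMO_OPT_INT:
                {
                    long        v = strtol(value, &end, 10);

                    if (end == value || *end != '\0' ||
                        v < def->min || v > def->max)
                        return 0;
                    *(int *) dst = (int) v;
                    return 1;
                }
            case DEMO_OPT_REAL:
                {
                    double      v = strtod(value, &end);

                    if (end == value || *end != '\0' ||
                        v < def->min || v > def->max)
                        return 0;
                    *(double *) dst = v;
                    return 1;
                }
            case DEMO_OPT_BOOL:
                if (strcmp(value, "true") == 0)
                    *(bool *) dst = true;
                else if (strcmp(value, "false") == 0)
                    *(bool *) dst = false;
                else
                    return 0;
                return 1;
        }
        return 0;               /* unknown type tag */
    }
    return 0;                   /* unknown option name */
}
```

In a design like this, adding an option of another type means adding one table
row, and the validation logic stays in a single place instead of being
repeated in every opclass options function.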

4. Now getting back to not adding opclass options wherever we can, just
because we can:

4a. No opclass options tests were added for intarray. I am sure there should
be at least one, just to make sure it does not segfault when you try to set an
option. In some cases more tests may be needed. To add and review them, one
should be familiar with the internals of the opclass in question. So it is
good when different people do this for different opclasses.

4b. When you add opclass options in place of hardcoded values, you have to
set minimum and maximum values. Why did you choose 1000 as the maximum for
num_ranges in gist_int_ops in intarray? Why 1 as the minimum? All these
decisions need careful consideration and can't be made for a bunch of
opclasses in just one review.

4c. A patch usually takes a long path from prototype to final commit. Do you
really want to update the code of all these opclasses each time some change
is made in the main opclass options code? ;-)

So I would suggest working only with intarray for now and adding other
opclasses later.

5. You've been asking about SQL grammar

> CREATE INDEX idx ON tab USING am (
> {expr {opclass | DEFAULT} ({name=value} [,...])} [,...]
> );

As for me, I do not really care about it. All of the solutions are acceptable
to me. But looking at it, I came to one observation:

I've never seen the DEFAULT keyword used in this way before. There is logic
in such usage, but I do not remember any practical case of it.
If there are such usages (I could easily have missed them) or if it is somehow
recommended in the SQL standard, let it be. But if neither is the case, I
would suggest using the WITH keyword instead, as it is already used for
reloptions. As far as I remember, in my prototype I used "WITH OPTIONS", but
I did it just because I could not find my way through yacc with a single
"WITH". So the ideal way for me would be

create index ON test USING GIST (i WITH (sig_len_int=22));

But as I said, it is not a matter of importance for me. Just an observation.

--
Do code for fun.
