Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

From: Nikolay Shaplov <dhyan(at)nataraj(dot)su>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM
Date: 2017-03-17 10:22:25
Message-ID: 3123308.RiobK7ytJP@x200m
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

В письме от 12 марта 2017 22:29:30 пользователь Nikolay Shaplov написал:

Another rebase.

> I've rebased the patch again.
>
> Alvaro, I am waiting for you! ;-)
>
> > I've rebased the patch, so it can be applied to current master. See
> > attachment.
> >
> > Alvaro in private letter told that he will review the patch some time
> > later. So I am waiting.
> >
> > Also added this patch to commitfest:
> > https://commitfest.postgresql.org/13/992/
> >
> > > (Please see Attribution notice at the bottom of the letter)
> > >
> > > Hi! Some time ago I've send a proposal, about removing all am-related
> > > code
> > > from src/backend/access/common/reloptions.c and move it into somewhere
> > > inside am code. It was in
> > > https://www.postgresql.org/message-id/4636505.Bu9AKW1Kzc%40nataraj-amd64
> > > thread.
> > >
> > > Now I have a patch that is ready
> > >
> > > Before explaining what have been done, I need to define some concepts,
> > > so
> > > it would be easy to speak about them in the further explanation.
> > >
> > > Reloptions values in postgres actually exists in four representations:
> > > DefList*, TextArray[], Bytea and so called Values.
> > >
> > > - DefList representation is a list of DefElem, it is the representation
> > > in
> > > which reloptions comes from syntax analyzer, when relation is created or
> > > altered, and also this representation that is needed for pg_dump to dump
> > > a
> > > create statement that creates this relation.
> > >
> > > - TextArray[] representation is a way, how reloptions are stored in
> > > pg_catalog. It is TEXT[] attribute in the pg_catalog table and from the
> > > poinf o view of postgres code it is Datum array of TEXT Datums, when
> > > read
> > > into the memory
> > >
> > > - Bytea (or I will also call it Binary) representation, is a
> > > representation with which relation code really works. It has C-structure
> > > inside in which fixed-length options are sored, some space is reserved
> > > for storing varlen values after the structure, and certainly it has
> > > BYTEA
> > > header in front of all of it. It is cached, it is fast to access. It is
> > > a
> > > good stuff.
> > >
> > > - Values (or in some cases RawValues) -- internal representation of
> > > option
> > > parser. Technically it is List of option_value structures (old name for
> > > it
> > > is relopt_value). This representation is used for parsing and validating
> > > options. Actually all the convertations between representations listed
> > > above are dove through Values representation. Values are called
> > > RawValues
> > > when Values List were already created, but it was not parsed yet, and
> > > all
> > > option values are in raw state: represented as a text.
> > >
> > >
> > > DefList and TextArray representations are converted in both ways(always
> > > through Values representation): we need DefList -> TextArray to create
> > > or
> > > change entry in the pg_catalog, and need TextArray -> DefList for
> > > pg_dump
> > >
> > > Bytea representation is converted from TextArray representation (also
> > > through Values representation). When relation code, tries to get cached
> > > Bytea representation for certain relation, and none is actually cached,
> > > then cache code calls extractRelOptions function that fetches entry from
> > > pg_catalog for that relation, gets reloptions TextArray[], converts it
> > > into Bytea and cache it.
> > >
> > >
> > > Each reloption has it's own definition. This definitions tells how this
> > > option should be parsed, validated and converted. Before my patch this
> > > information were stored in relopt_gen and relopt_parse_elt structures.
> > > In
> > > my patch all this information were moved into option_definition_basic
> > > structure (that is actually old relopt_gen + relopt_parse_elt)
> > >
> > > The set of options definition I would call a options catalog. Before my
> > > patch there was one global options catalog (actually it was divided into
> > > parts by data types and has another part for dynamically added options,
> > > but
> > > nevertheless, it was global). After my patch there would be a separate
> > > options catalog for each relation type. For AM-relation this catalog
> > > would
> > > be available via AM-handler. for other relation types it would be still
> > > kept in reloption.c
> > >
> > >
> > >
> > > Now I am ready to tell what actually have been done in this patch:
> > >
> > > 1. I moved options definitions from global lists from reloptions.c
> > > [type name]RelOpts to lists that are kept inside AM code. One list per
> > > AM.
> > > heap options, toast options and options for all other relation types
> > > that
> > > are not accessible through AccessMethod, all were kept inside
> > > reloptions.c, but now it is one list for each relation type, not a
> > > global
> > > one
> > >
> > > 2. Each AccessMethod had amoptions function that was responsible for
> > > converting reloptions from TextArray[] representation into Bytea
> > > representaions. This function used functions from reloptions.c and also
> > > had
> > > an array of relopt_parse_elt that defines how parsed reloption data
> > > should
> > > be packed inside Bytea chunk. I've moved data from relopt_parse_elt
> > > array
> > > into option definitions that are stored in the catalog (since catalog
> > > are
> > > now defined inside the AccessMethod we have access to a structure of
> > > Bytea
> > > representation at the place where we define option catalog)
> > >
> > > 3. Speaking of amoptions function, I've completely removed it, since we
> > > do
> > > not need it. For developers of Access Methods who still want to do some
> > > custom action with just parsed data, I've added postprocess_fun to
> > > option
> > > catalog. If this function is defined, it is called by convertation
> > > function
> > > right after Bytea representation were created. In postprocess_fun
> > > developer
> > > can do some custom validation, or change just parsed data. This feature
> > > is
> > > used in bloom index.
> > >
> > > 4. Instead of amoptions function I've added amrelopt_catalog function to
> > > the Access Method. This function returns option definition catalog for
> > > an
> > > Access Method. The catalog is used for processing options.
> > >
> > > 5. As before, relation cache calls extractRelOptions when it needs
> > > options
> > > that were not previously cached.
> > > Before my patch, it created Bytea representations for non-AM relations,
> > > and
> > > called amoptions AM function to get Bytea representation for AM
> > > relation.
> > > In by patch, it now gets option catalog, (using local functions for
> > > non-AM
> > > relations, and using amrelopt_catalog function for AM-relations), and
> > > then
> > > use this catalog to convert options from TextArray into Bytea
> > > representation.
> > >
> > > 6. I've added some more featues:
> > >
> > > - Now you can set OPTION_DEFINITION_FLAG_FORBID_ALTER flag in definition
> > > of
> > > the option, and then postgres will refuse to change option value using
> > > ALTER command. In many indexes, some options are used only for index
> > > creation. After this it's value is saved in MetaPage, and used from
> > > there.
> > > Nevertheless user is allowed to change option value, though it affects
> > > nothing, one need to recreate index to really change such value. Now
> > > using
> > > this flag we can prevent user from getting an illusion that he
> > > successfully
> > > changed option value.
> > >
> > > - After applying my patch if you try to set toast. option for table that
> > > does not have toast, you will get an error. Before postgres will accept
> > > toast option for a table without a toast, but this value will be lost.
> > > This
> > > is bad behavior, so now postgres will throw an error in this case.
> > >
> > > - I've noticed that all string relation options in postgres are
> > > technically
> > > enum options. So I've added enum option type. This will save some bytes
> > > of
> > > the memory and some tacts of the CPU. I also left string option type
> > > (through it is not used now). A. Korotkov said that it might be needed
> > > later for storing patterns, xml/json paths or something like that.
> > >
> > > - Added some more tests that will trigger more options code. Though I am
> > > dreaming about more advanced test system, that will allow to check that
> > > certain value is received somewhere deep in the code.
> > >
> > > 7. And now the most controversial change. I've got rid of StdRdOptions
> > > structure, in which options for heap, toast, nbtree, hash and spgist
> > > were
> > > stored in Bytea representation. In indexes only fillfactor value were
> > > actually used from the whole big structure. So I've created a separate
> > > structure for each relation type. HeapOptions and ToastOptions are very
> > > similar to each other. So common part of creating these catalogs were
> > > moved
> > > to
> > > add_autovacuum_options function that are called while creating each
> > > catalog.
> > >
> > > Now to the controversial part: in src/include/utils/rel.h had a lot of
> > > Relation* macroses that used StdRdOptions structure. I've changed them
> > > into
> > > View* Heap* and Toast* analogues, and changed the code to use these
> > > macroses instead of Relation* one. But this part of code I least sure
> > > of.
> > > I'd like to ask experienced developers to double check it.
> > >
> > > 8. I've moved all options-abstract code into options.c and options.h
> > > file,
> > > and left in reloptions.c and reloptions.h only the code that concerns
> > > relation options. Actually the main idea of this patch was to get this
> > > abstract code in order to use it for custom attribute options later. All
> > > the rest just a side effects :-)
> > >
> > > So, before adding this to commitfest I want somebody to give a general
> > > look
> > > at it, if the code is OK.
> > >
> > > Alvaro Herrera, you once said that you can review the patch...
> > >
> > > The patch is available in the attachment. It can be applied to current
> > > master
> > >
> > > You can also see latest version on my github
> > > https://github.com/dhyannataraj/postgres/tree/reloption_fix
> > > at the reloption_fix branch.
> > >
> > >
> > > ATTRIBUTION NOTICE: I wrote this patch, when I was an employee in
> > > Postgres
> > > Professional company. So this patch should be attributed as patch from
> > > Postgres Pro (or something like that). For some reasons I did not manage
> > > to
> > > commit this patch while I left that job. But I got a permission to
> > > commit
> > > it even if I left the company.
> > > So my contribution to this patch as a independent developer was final
> > > code
> > > cleanup, and writing some more comments. All the rest was a work of
> > > Postgres Pro employee Nikolay Shaplov.

--
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.

Attachment Content-Type Size
reloptions6c.diff text/x-patch 217.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-03-17 10:28:18 Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message Michael Banck 2017-03-17 10:18:07 Re: [patch] reorder tablespaces in basebackup tar stream for backup_label