Proposal: Global Index

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Geoghengan <pg(at)bowt(dot)ie>, heikki(dot)linnakangas(at)iki(dot)fi
Subject: Proposal: Global Index
Date: 2019-10-30 08:38:29
Message-ID: CALtqXTcurqy1PKXzP9XO=ofLLA5wBSo77BnUnYVEZpmcA3V0ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

A global index by very definition is a single index on the parent table
that maps to many
underlying table partitions. The parent table itself does not have any
underlying storage,
so it must, therefore, retrieve the data satisfying index constraints from
the underlying tables.
In very crude terms, it is an accumulation of data from table partitions so
that data spanning
across multiple partitions are accessed in one go as opposed to
individually querying each
partition.

For the initial version of this work, we are only considering to build
b-tree global indexes.

- Partitioned Index (Index Partitioning)
When global indexes become too large, then those are partitioned to keep
the performance
and maintenance overhead manageable. These are not within the scope of this
work.

- Local Index
A local index is an index that is local to a specific table partition; i.e.
it doesn’t span across
multiple partitions. So, when we create an index on a parent table, it will
create a separate
index for all its partitions. Unfortunately, PostgreSQL uses the
terminology of “partitioned index”
when it refers to local indexes. This work with fix this terminology for
PostgreSQL so that the
nomenclature remains consistent with other DBMS.

- Why We Need Global Index?
A global index is expected to give two very important upgrades to the
partitioning feature set in
PostgreSQL. It is expected to give a significant improvement in
read-performance for queries
targeting multiple local indexes of partitions. It also adds a unique
constraint across partitions.

- Unique Constraint
Data uniqueness is a critical requirement for building an index. For global
indexes that span across
multiple partitions, uniqueness will have to be enforced on index
column(s). This effectively translates
into a unique constraint.

- Performance
Currently, the pseudo index created on the parent table of partitions does
not contain any
data. Rather, it dereferences to the local indexes when an index search is
required. This
means that multiple indexes will have to be evaluated and data to be
combined thereafter.
However, with the global indexes, data will reside with global index
declared on the parent
table. This avoids the need for multi-level index lookups. So read
performance is expected
to be significantly higher in cases. There will however be a negative
performance impact
during write (insert/update) of data. This is discussed in more detail
later on.

- Creating a GLOBAL Index - Syntax
A global index may be created with the addition of a “GLOBAL” keyword to
the index statement.
Alternatively, one could specify the “LOCAL” keyword to create local
indexes on partitions.
We are suggesting to call this set of keywords: “partition_index_type”. By
default,
partition_index_type will be set as LOCAL. Here is a sample of the create
index syntax.

CREATE Index idx parent (columns) [GLOBAL | LOCAL];

Note: There is no shift/reduced by adding these options.

- Pointing Index to Tuple
Currently, CTID carries a page and offset information for a known heap
(table name). However,
in the context of global indexes, this information within an index is
insufficient. Since the index is
expected to carry tuples from multiple partitions (heaps), CTID alone will
not be able to link an index
node to a tuple. This requires carrying additional data for the heap name
to be stored with each
index node.

How this should be implemented is a point to be discussed. A few
possibilities are listed below:

-- Expand CTID to include a relfilenode id. In PostgreSQL-Conf Asia, Bruce
suggested having the OID
instead of relfilenode as relfilenode can be duplicated across tablespaces.
-- Using OID introduces another complication where we would need to
query catalog for OID to
heap mapping.

-- The second option is to have a variable-length CTID. We can reserve some
top-level bit for segregation of
Global CTID or Standard CTID. Robert Haas suggested in PostgreSQL-EU to
discuss this with Peter Geoghegan.
-- I discussed it with Peter and he believes that it is a very
invasive approach that requires a whole lot of
the effort to get committed.

-- Heikki pointed me to include heap specific information using the INCLUDE
keyword so that heap information
is stored with each index node as data.
-- We (Peter and I) also discussed that option and this looks a more
easy and non-invasive option.

- Optimizer
The challenge with optimizer is a selection between local and global
indexes when both are present.
How do we:

-- Evaluate the cost of scanning a global index?
-- When should the LOCAL index be preferred over the GLOBAL index and vice
versa?
-- Should we hit a GLOBAL index when the query is targeting a
couple of partitions only?
-- We need to consider the sizes of those partitions being hit and
the sizes of partitions not being hit.
-- Bruce suggested that we prioritize a GLOBAL index in the first version
so that in every case,
the GLOBAL index is utilized.

- Write Performance and Vacuum
There will be some write performance degradation because every change in
partition tables must
propagate upwards to the GLOBAL index on the parent table. This can be
thought of as another index
on a table, however, the [slight] performance degradation will be due to
the fact that the GLOBAL
index may carry a much bigger dataset with data from multiple partitions
resulting in a higher tree
traversal and update time. This applies to both write and vacuum processes.

It is still an open question though on how this will be handled within the
code and how we can better
optimize this process.

I have a POC patch and working on finalizing the patch, Hamid Akhtar is
also working with me on this
work.

--
Ibrar Ahmed

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-10-30 08:43:04 Re: Problem with synchronous replication
Previous Message Fujii Masao 2019-10-30 08:21:17 Re: Problem with synchronous replication