Re: Proposal: Adjacent B-Tree index

From: Dilshod Urazov <urazofficial(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: Adjacent B-Tree index
Date: 2024-02-20 19:19:21
Message-ID: CAHc0=pj-wH+zkrHsgGSA7-x6Of_NQWvae02oqXKzJPw6RaCa9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> only 1 index lookup is needed.
Sorry, must be "only lookups of 1 index are needed".

--
Dilshod Urazov

вт, 20 февр. 2024 г. в 21:09, Dilshod Urazov <urazofficial(at)gmail(dot)com>:

> > I'm not sure why are two indexes not sufficient here?
>
> Did I write that they are not sufficient? The whole point is that in
> relational DBMSs which are widely used
> to store graphs we can optimize storage in such cases. Also we can
> optimize traversals e.g. if we want to
> get all nodes that are adjacent to a given node with id = X in an oriented
> graph
>
> SELECT id, label
> FROM Nodes
> JOIN Edges ON Nodes.id = Edges.target
> WHERE Edges.source = X;
>
> only 1 index lookup is needed.
>
> > The entry could've been removed because (e.g.)
> > test's b column was updated thus inserting a new index entry for the
> > new HOT-chain's TID.
>
> If test'b column was updated and HOT optimization took place no new index
> entry is created. Index tuple
> pointing to old heap tuple is valid since now it is pointing to HOT-chain.
>
> --
> Dilshod Urazov
>
> пн, 19 февр. 2024 г. в 22:32, Matthias van de Meent <
> boekewurm+postgres(at)gmail(dot)com>:
>
>> On Mon, 19 Feb 2024 at 18:48, Dilshod Urazov <urazofficial(at)gmail(dot)com>
>> wrote:
>> >
>> > - Motivation
>> >
>> > A regular B-tree index provides efficient mapping of key values to
>> tuples within a table. However, if you have two tables connected in some
>> way, a regular B-tree index may not be efficient enough. In this case, you
>> would need to create an index for each table. The purpose will become
>> clearer if we consider a simple example which is the main use-case as I see
>> it.
>>
>> I'm not sure why are two indexes not sufficient here? PostgreSQL can
>> already do merge joins, which would have the same effect of hitting
>> the same location in the index at the same time between all tables,
>> without the additional overhead of having to scan two table's worth of
>> indexes in VACUUM.
>>
>> > During the vacuum of A an index tuple pointing to a dead tuple in A
>> should be cleaned as well as all index tuples for the same key.
>>
>> This is definitely not correct. If I have this schema
>>
>> table test (id int primary key, b text unique)
>> table test_ref(test_id int references test(id))
>>
>> and if an index would contain entries for both test and test_ref, it
>> can't just remove all test_ref entries because an index entry with the
>> same id was removed: The entry could've been removed because (e.g.)
>> test's b column was updated thus inserting a new index entry for the
>> new HOT-chain's TID.
>>
>> > would suffice for this new semantics.
>>
>> With the provided explanation I don't think this is a great idea.
>>
>> Kind regards,
>>
>> Matthias van de Meent.
>>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2024-02-20 19:29:29 Re: Schema variables - new implementation for Postgres 15
Previous Message Tomas Vondra 2024-02-20 18:56:09 Re: Missing Group Key in grouped aggregate