Re: When creating index, why pointing to old version of tuple

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Tender Wang <tndrwang(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: When creating index, why pointing to old version of tuple
Date: 2025-08-01 06:47:12
Message-ID: CAEoWx2=ZtL24NDdiKidaRoqkeLBhp-d88QimEwNj5WVvRMz8Zw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> If the index points to the newest version of the tuple, how do old
transactions read the old version of the tuple using an index scan for old
transactions?

Say there is a long transaction x1, it is on-going.

And transaction x2 started later than x1 started, and x2 created an index.
Should x1 be visible to the new index?

My understanding is, the old transaction x1 cannot use the new index to
scan for the old version of the tuple. If you read my example, in the
index, the key is the new age value (99), while the old age value is 4,
thus using the old value will not hit the index entry.

Chao Li (Evan)
------------------------------
HighGo Software Inc.
https://www.highgo.com/

Tender Wang <tndrwang(at)gmail(dot)com> 于2025年8月1日周五 14:37写道:

>
>
> Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> 于2025年8月1日周五 14:16写道:
>
>> Hi Community,
>>
>> Let me use a small example to demonstrate my observation.
>>
>> Step 1: create a simple table, insert a tuple and update it.
>>
>> create table ta (id int, name varchar(32), age int);
>> insert into ta values(1, 'aa', 4);
>> update ta set age=99 where id=1;
>>
>> Step 2: with pageinspect, we can the 2 version of the tuple:
>>
>> SELECT * FROM heap_page_items(get_raw_page('ta', 0));
>> lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid |
>> t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
>>
>> ----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+----------------------------
>> 1 | 8152 | 1 | 36 | 765 | 765 | 0 | (0,2) |
>> 16387 | 34 | 24 | | |
>> \x010000000761610004000000
>> 2 | 8112 | 1 | 36 | 765 | 0 | 2 | (0,2) |
>> 32771 | 10242 | 24 | | |
>> \x010000000761610063000000
>> (2 rows)
>>
>> The old version's ctid now points to (0,2) which is expected.
>>
>> Step 3: create a index on the table
>>
>> create index idx_ta_age on ta(age);
>>
>> Step 4: view the index page
>>
>> evantest=# SELECT * FROM bt_page_items('idx_ta_age', 1);
>> itemoffset | ctid | itemlen | nulls | vars | data |
>> dead | htid | tids
>>
>> ------------+-------+---------+-------+------+-------------------------+------+-------+------
>> 1 | (0,1) | 16 | f | f | 63 00 00 00 00 00 00 00 |
>> f | (0,1) |
>> (1 row)
>>
>> Here comes my question, why the index entry's ctid points to the old
>> version tuple?
>>
>> I understand that, for updated tuples, old version's ctid points to new
>> version, that builds a chain of all versions. But my confusion is that,
>> when an index is created, older transactions and in-progress transactions
>> won't see the newly created index. So, it should be ok for the index to
>> point to the newest version of tuple version that is visible to the index.
>>
>> Can anyone please explain me about that?
>>
>>
> If the index points to the newest version of the tuple, how do old
> transactions read the old version of the tuple using an index scan for old
> transactions?
> Pointing to the old version is friendly if the table is often updated.
> This way, we don't need to update the index tuple.
>
>
> --
> Thanks,
> Tender Wang
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2025-08-01 06:56:33 Trivial patch to fix a typo
Previous Message Tender Wang 2025-08-01 06:36:51 Re: When creating index, why pointing to old version of tuple