Re: Why to index a "Recently DEAD" tuple when creating index

From: Alex <zhihui(dot)fan1213(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why to index a "Recently DEAD" tuple when creating index
Date: 2019-06-10 11:00:20
Message-ID: CAKU4AWpF_jc+WLbM=QTH+4k7NMAD0ffrfwsFVFNRO-437izv-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks! Appreciate it for your time!

On Mon, Jun 10, 2019 at 5:34 PM Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
wrote:

> On Mon, Jun 10, 2019 at 2:12 PM Alex <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> > On Mon, Jun 10, 2019 at 4:10 PM Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
> wrote:
> >> I think what I'm trying to say is different.
> >>
> >> For my case, the sequence is as following:
> >> 1. Transaction A has deleted a tuple, say t1 and got committed.
> >> 2. Index A has been created successfully.
> >> 3. Now, transaction B starts and use the index A to fetch the tuple
> >> t1. While doing visibility check, transaction B gets to know that t1
> >> has been deleted by a committed transaction A, so it can't see the
> >> tuple. But, it creates a dependency edge that transaction A precedes
> >> transaction B. This edge is required to detect a serializable conflict
> >> failure.
> >>
> >> If you don't create the index entry, it'll not be able to create that
> edge.
> >
> >
> > Thanks, I got the difference now, but still not get the necessity of it.
> > 1. Assume we don't index it, in which situation we can get a wrong
> result?
>
> Consider the following sequence of three different transactions X,A and B:
>
> 1. Transaction X reads a tuple t2.
> 2. Transaction A updates the tuple t2, deletes a tuple t1 and gets
> committed. So, there transaction X precedes transaction A, i.e., X <-
> A.
> 3. Index A is created successfully.
> 4. Transaction B starts and use the index A to fetch tuple t1. But,
> it's already deleted by the committed transaction A. So, transaction A
> precedes transaction B, i.e., A<-B.
> 5. At this point you've a dangerous structure X<-A<-B (definition of
> dangerous structure src/backend/storage/lmgr/README-SSI) in the graph
> which can produce an anomaly. For example now, if X tries to update
> another tuple previously read by B, you'll have a dependency B<-X.
> But, you already have X<-B which leads to serializable conflict.
> Postgres tries to resolve this anomaly by rolling back one of the
> transaction.
>
> In your case, it'll be difficult to detect.
>
> > 2. If we only support "Read Committed" isolation level, is there a
> safe way to not index such data?
> >
> I can't think of a case where the RECENTLY_DELETED tuple needs to be
> indexed in "Read Committed" case. So, your suggestion likely to work
> logically in "Read committed" isolation level. But, I'm not sure
> whether you'll encounter any assertion failures in vacuum path or
> concurrent index paths.
>
>
> --
> Thanks & Regards,
> Kuntal Ghosh
> EnterpriseDB: http://www.enterprisedb.com
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kuntal Ghosh 2019-06-10 11:07:52 Re: [pg_rewind] cp: cannot stat ‘pg_wal/RECOVERYHISTORY’: No such file or directory
Previous Message Matsumura, Ryo 2019-06-10 09:52:10 RE: [PATCH] memory leak in ecpglib