From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Heap WARM Tuples - Design Draft |
Date: | 2016-08-04 17:05:53 |
Message-ID: | 20160804170553.GL1702@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Aug 4, 2016 at 04:29:09PM +0530, Pavan Deolasee wrote:
> Write Amplification Reduction Method (WARM)
> ====================================
>
> A few years back, we developed HOT to address the problem associated with MVCC
> and frequent updates and it has served us very well. But in the light of Uber's
> recent technical blog highlighting some of the problems that are still
> remaining, especially for workloads where HOT does not help to the full extent,
> Simon, myself and others at 2ndQuadrant discussed several ideas and what I
> present below is an outcome of that. This is not to take away credit from
> anybody else. Others may have thought about similar ideas, but I haven’t seen
> any concrete proposal so far.
HOT was a huge win for Postgres and I am glad you are reviewing
improvements.
> This method succeeds in reducing the write amplification, but causes other
> issues which also need to be solved. WARM breaks the invariant that all tuples
> in a HOT chain have the same index values and so an IndexScan would need to
> re-check the index scan conditions against the visible tuple returned from
> heap_hot_search(). We must still check visibility, so we only need to re-check
> scan conditions on that one tuple version.
>
> We don’t want to force a recheck for every index fetch because that will slow
> everything down. So we would like a simple and efficient way of knowing about
> the existence of a WARM tuple in a chain and do a recheck in only those cases,
> ideally O(1). Having a HOT chain contain a WARM tuple is discussed below as
> being a “WARM chain”, implying it needs re-check.
In summary, we are already doing visibility checks on the HOT chain, so
a recheck if the heap tuple matches the index value is only done at most
on the one visible tuple in the chain, not ever tuple in the chain.
> 2. Mark the root line pointer (or the root tuple) with a special
> HEAP_RECHECK_REQUIRED flag to tell us about the presence of a WARM tuple in the
> chain. Since all indexes point to the root line pointer, it should be enough to
> just mark the root line pointer (or tuple) with this flag.
Yes, I think #2 is the easiest. Also, if we modify the index page, we
would have to WAL the change and possibly WAL log the full page write
of the index page. :-(
> Approach 2 seems more reasonable and simple.
>
> There are only 2 bits for lp_flags and all combinations are already used. But
> for LP_REDIRECT root line pointer, we could use the lp_len field to store this
> special flag, which is not used for LP_REDIRECT line pointers. So we are able
> to mark the root line pointer.
Uh, as I understand it, we only use LP_REDIRECT when we have _removed_
the tuple that the ctid was pointing to, but it seems you would need to
set HEAP_RECHECK_REQUIRED earlier than that.
Also, what is currently in the lp_len field for LP_REDIRECT? Zeros, or
random data? I am asking for pg_upgrade purposes.
> One idea is to somehow do this as part of the vacuum where we collect root line
> pointers of WARM chains during the first phase of heap scan, check the indexes
> for all such tuples (may be combined with index vacuum) and then clear the heap
> flags during the second pass, unless new tuples are added to the WARM chain. We
> can detect that by checking that all tuples in the WARM chain still have XID
> less than the OldestXmin that VACUUM is using.
Yes, it seems natural to clear the ctid HEAP_RECHECK_REQUIRED flag where
you are adjusting the HOT chain anyway.
> It’s important to ensure that the flag is set when it is absolutely necessary,
> while having false positives is not a problem. We might do a little wasteful
> work if the flag is incorrectly set. Since flag will be set only during
> heap_update() and the operation is already WAL logged, this can be piggybacked
> with the heap_update WAL record. Similarly, when a root tuple is pruned to a
> redirect line pointer, the operation is already WAL logged and we can piggyback
> setting of line pointer flag with that WAL record.
>
> Flag clearing need not be WAL logged, unless we can piggyback that to some
> existing WAL logging.
Agreed, good point.
Very nice!
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2016-08-04 17:11:17 | Re: Heap WARM Tuples - Design Draft |
Previous Message | Simon Riggs | 2016-08-04 17:03:34 | Re: Lossy Index Tuple Enhancement (LITE) |