## Re: Not HOT enough

From: Bruce Momjian Simon Riggs Tom Lane ,Alvaro Herrera ,Robert Haas ,Pg Hackers Re: Not HOT enough 2012-08-17 00:59:16 20120817005916.GJ30286@momjian.us (view raw, whole thread or download thread mbox) 2011-11-22 20:23:34 from Simon Riggs  2011-11-22 21:21:50 from Robert Haas   2011-11-22 23:40:29 from Simon Riggs    2011-11-23 00:25:34 from Simon Riggs     2011-11-23 14:00:12 from Robert Haas      2011-11-23 14:25:10 from Simon Riggs       2011-11-23 15:15:55 from Robert Haas        2011-11-23 15:20:44 from Alvaro Herrera         2011-11-23 15:28:38 from Robert Haas          2011-11-23 15:35:47 from Alvaro Herrera           2011-11-23 16:01:18 from Robert Haas         2011-11-23 16:14:04 from Simon Riggs          2011-11-23 17:00:55 from Alvaro Herrera          2011-11-23 17:01:40 from Tom Lane           2011-11-23 17:58:25 from Simon Riggs            2011-11-23 18:15:58 from Tom Lane             2011-11-23 19:33:18 from Simon Riggs              2012-08-17 00:59:16 from Bruce Momjian               2012-08-17 17:21:47 from Robert Haas          2011-11-23 17:07:11 from Pavan Deolasee     2011-11-23 18:30:18 from Tom Lane      2011-11-23 19:57:06 from Robert Haas       2011-11-23 20:14:45 from Simon Riggs       2011-11-23 20:15:36 from Tom Lane        2011-11-23 20:32:07 from Simon Riggs        2011-11-23 20:45:16 from Robert Haas         2011-11-23 21:36:46 from Simon Riggs          2011-11-23 21:55:29 from Tom Lane           2011-11-23 22:43:15 from Simon Riggs            2011-11-23 22:47:39 from Robert Haas             2011-11-23 23:00:37 from Simon Riggs pgsql-hackers
```Did we want to apply this?

---------------------------------------------------------------------------

On Wed, Nov 23, 2011 at 07:33:18PM +0000, Simon Riggs wrote:
> On Wed, Nov 23, 2011 at 6:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> >> The real question is do we favour HOT cleanup on those small 8 tables,
> >> or do we favour HOT cleanup of every other table?
> >
> > No, the real question is why not think a little harder and see if we can
> > come up with a solution that doesn't involve making some cases worse to
> > make others better.
>
> Slightly modified patch attached.
>
> When we access a shared relation and the page is prunable we re-derive
> the cutoff value using GetOldestXmin.
>
> That code path is rarely taken. In particular, pg_shdepend is only
> accessed during object creation/alter/drop, so this isn't a path we
> can't spare a small amount little extra on, just like the trade-off
> we've taken to make locking faster for DML while making DDL a little
> slower.
>
> If this is still unacceptable, then I'll have to look at reducing
> impact on pg_shdepend from temp tables - which is still a plan.
>
> --
>  PostgreSQL Development, 24x7 Support, Training & Services

> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
> index 61f2ce4..5feaedc 100644
> --- a/src/backend/access/heap/pruneheap.c
> +++ b/src/backend/access/heap/pruneheap.c
> @@ -16,13 +16,13 @@
>
>  #include "access/heapam.h"
>  #include "access/transam.h"
> +#include "catalog/catalog.h"
>  #include "pgstat.h"
>  #include "storage/bufmgr.h"
>  #include "utils/rel.h"
>  #include "utils/tqual.h"
>
> -
>  /* Working data for heap_page_prune and subroutines */
>  typedef struct
>  {
> @@ -72,6 +72,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
>  {
>  	Page		page = BufferGetPage(buffer);
>  	Size		minfree;
> +	Transactionid	CutoffXmin = OldestXmin;
>
>  	/*
>  	 * Let's see if we really need pruning.
> @@ -91,6 +92,18 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
>  		return;
>
>  	/*
> +	 * We choose to set RecentGlobalXmin only for the current database, which
> +	 * means we cannot use it to prune shared relations when reading them with
> +	 * MVCC snapshots. By making that choice it allows our snapshots to be
> +	 * smaller and faster, and the RecentGlobalXmin will be further forward
> +	 * and offer better pruning of non-shared relations. So when accessing a
> +	 * shared relation and we see the page is prunable (above) we get an
> +	 * OldestXmin across all databases.
> +	 */
> +	if (IsSharedRelation(relation->rd_id))
> +		CutoffXmin = GetOldestXmin(true, true);
> +
> +	/*
>  	 * We prune when a previous UPDATE failed to find enough space on the page
>  	 * for a new tuple version, or when free space falls below the relation's
>  	 * fill-factor target (but not less than 10%).
> @@ -124,7 +137,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
>  																 * needed */
>
>  			/* OK to prune */
> -			(void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
> +			(void) heap_page_prune(relation, buffer, CutoffXmin, true, &ignore);
>  		}
>
>  		/* And release buffer lock */
> diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
> index 1a48485..60153f4 100644
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> @@ -55,6 +55,7 @@
>  #include "storage/spin.h"
>  #include "utils/builtins.h"
>  #include "utils/snapmgr.h"
> +#include "utils/tqual.h"
>
>
>  /* Our shared memory area */
> @@ -1185,6 +1186,8 @@ GetMaxSnapshotSubxidCount(void)
>   *		RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
>   *			running transactions, except those running LAZY VACUUM).  This is
>   *			the same computation done by GetOldestXmin(true, true).
> + *			For MVCC snapshots we examine transactions running only in our
> + *			database, ignoring longer running transactions in other databases.
>   *
>   * Note: this function should probably not be called with an argument that's
>   * not statically allocated (see xip allocation below).
> @@ -1200,9 +1203,12 @@ GetSnapshotData(Snapshot snapshot)
>  	int			count = 0;
>  	int			subcount = 0;
>  	bool		suboverflowed = false;
> +	bool		allDbs = false;
>
>  	Assert(snapshot != NULL);
>
> +	allDbs = !IsMVCCSnapshot(snapshot);
> +
>  	/*
>  	 * Allocating space for maxProcs xids is usually overkill; numProcs would
>  	 * be sufficient.  But it seems better to do the malloc while not holding
> @@ -1278,6 +1284,12 @@ GetSnapshotData(Snapshot snapshot)
>  			if (proc->vacuumFlags & PROC_IN_VACUUM)
>  				continue;
>
> +			/* MVCC snapshots ignore other databases */
> +			if (!allDbs &&
> +				proc->databaseId != MyDatabaseId &&
> +				proc->databaseId != 0)		/* always include WalSender */
> +				continue;
> +
>  			/* Update globalxmin to be the smallest valid xmin */
>  			xid = proc->xmin;	/* fetch just once */
>  			if (TransactionIdIsNormal(xid) &&

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ It's impossible for everything to be true. +

```

### pgsql-hackers by date

 Next: From: Bruce Momjian Date: 2012-08-17 01:02:10 Subject: Re: Avoiding repeated snapshot computation Previous: From: Bruce Momjian Date: 2012-08-17 00:46:56 Subject: Re: RFC: list API / memory allocations