Re: A little COPY speedup

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: A little COPY speedup
Date: 2007-03-02 00:53:21
Message-ID: 27963.1172796801@sss.pgh.pa.us
Lists: pgsql-patches

I wrote:
> Barring objections, I'll tweak this as above and apply.

I've applied the attached modified version of this patch. It seemed
better to me to centralize the handling of this flag bit in PageAddItem
and PageRepairFragmentation, instead of having it in the callers as you
did. This means that the bit applies to all pages, not only heap pages,
but at least for the moment that has no downside. I note that GIN
indexes do use PageAddItem with offsetNumber = InvalidOffsetNumber,
so they will get some performance benefit too.

regards, tom lane
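
The hint-bit logic described above can be pictured with a small standalone sketch. This is not the patch itself: ToyPage, toy_add_item, toy_free_item and the probes counter are hypothetical names made up for illustration, and the toy sets the hint directly when a slot is freed, whereas the patch sets it in PageRepairFragmentation. It only aims to show why an insert with no offset given can skip the line pointer scan when the hint is clear, and how a stale hint gets reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_SLOTS 8
#define TOY_HAS_FREE_LINES 0x0001   /* hypothetical stand-in for PD_HAS_FREE_LINES */

typedef struct ToyPage
{
    uint16_t flags;                 /* hint bits, playing the role of pd_flags */
    int      nslots;                /* number of slots handed out so far */
    bool     used[TOY_MAX_SLOTS];   /* true if the slot currently holds an item */
    int      probes;                /* slots examined so far, for illustration only */
} ToyPage;

/* Returns the 0-based slot chosen for a new item, or -1 if the page is full. */
static int
toy_add_item(ToyPage *page)
{
    int slot = page->nslots;        /* default: first never-used slot */

    if (page->flags & TOY_HAS_FREE_LINES)
    {
        /* hint says a recyclable slot may exist, so search for one */
        for (slot = 0; slot < page->nslots; slot++)
        {
            page->probes++;
            if (!page->used[slot])
                break;
        }
        if (slot >= page->nslots)
        {
            /* the hint was wrong, so reset it */
            page->flags &= (uint16_t) ~TOY_HAS_FREE_LINES;
        }
    }
    /* else: hint says there is no free slot, so no search is done */

    if (slot >= TOY_MAX_SLOTS)
        return -1;
    if (slot == page->nslots)
        page->nslots++;
    page->used[slot] = true;
    return slot;
}

/* Toy simplification: freeing a slot sets the hint immediately. */
static void
toy_free_item(ToyPage *page, int slot)
{
    assert(slot >= 0 && slot < page->nslots);
    page->used[slot] = false;
    page->flags |= TOY_HAS_FREE_LINES;
}
```

In this sketch a run of inserts into a page that never had a deletion never enters the search loop, which is the COPY-style case the flag is meant to help. After a free, the next insert scans and reuses the hole; the insert after that scans once more, finds nothing, and clears the hint.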

Index: doc/src/sgml/storage.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/storage.sgml,v
retrieving revision 1.14
diff -c -r1.14 storage.sgml
*** doc/src/sgml/storage.sgml 31 Jan 2007 20:56:19 -0000 1.14
--- doc/src/sgml/storage.sgml 2 Mar 2007 00:48:25 -0000
***************
*** 427,434 ****
The first 20 bytes of each page consists of a page header
(PageHeaderData). Its format is detailed in <xref
linkend="pageheaderdata-table">. The first two fields track the most
! recent WAL entry related to this page. They are followed by three 2-byte
! integer fields
(<structfield>pd_lower</structfield>, <structfield>pd_upper</structfield>,
and <structfield>pd_special</structfield>). These contain byte offsets
from the page start to the start
--- 427,434 ----
The first 20 bytes of each page consists of a page header
(PageHeaderData). Its format is detailed in <xref
linkend="pageheaderdata-table">. The first two fields track the most
! recent WAL entry related to this page. Next is a 2-byte field
! containing flag bits. This is followed by three 2-byte integer fields
(<structfield>pd_lower</structfield>, <structfield>pd_upper</structfield>,
and <structfield>pd_special</structfield>). These contain byte offsets
from the page start to the start
***************
*** 437,448 ****
The last 2 bytes of the page header,
<structfield>pd_pagesize_version</structfield>, store both the page size
and a version indicator. Beginning with
! <productname>PostgreSQL</productname> 8.1 the version number is 3;
<productname>PostgreSQL</productname> 8.0 used version number 2;
<productname>PostgreSQL</productname> 7.3 and 7.4 used version number 1;
prior releases used version number 0.
! (The basic page layout and header format has not changed in these versions,
! but the layout of heap row headers has.) The page size
is basically only present as a cross-check; there is no support for having
more than one page size in an installation.

--- 437,449 ----
The last 2 bytes of the page header,
<structfield>pd_pagesize_version</structfield>, store both the page size
and a version indicator. Beginning with
! <productname>PostgreSQL</productname> 8.3 the version number is 4;
! <productname>PostgreSQL</productname> 8.1 and 8.2 used version number 3;
<productname>PostgreSQL</productname> 8.0 used version number 2;
<productname>PostgreSQL</productname> 7.3 and 7.4 used version number 1;
prior releases used version number 0.
! (The basic page layout and header format has not changed in most of these
! versions, but the layout of heap row headers has.) The page size
is basically only present as a cross-check; there is no support for having
more than one page size in an installation.

***************
*** 470,478 ****
</row>
<row>
<entry>pd_tli</entry>
! <entry>TimeLineID</entry>
! <entry>4 bytes</entry>
! <entry>TLI of last change</entry>
</row>
<row>
<entry>pd_lower</entry>
--- 471,485 ----
</row>
<row>
<entry>pd_tli</entry>
! <entry>uint16</entry>
! <entry>2 bytes</entry>
! <entry>TimeLineID of last change (only its lowest 16 bits)</entry>
! </row>
! <row>
! <entry>pd_flags</entry>
! <entry>uint16</entry>
! <entry>2 bytes</entry>
! <entry>Flag bits</entry>
</row>
<row>
<entry>pd_lower</entry>
Index: src/backend/storage/page/bufpage.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/page/bufpage.c,v
retrieving revision 1.71
diff -c -r1.71 bufpage.c
*** src/backend/storage/page/bufpage.c 21 Feb 2007 20:02:17 -0000 1.71
--- src/backend/storage/page/bufpage.c 2 Mar 2007 00:48:26 -0000
***************
*** 39,44 ****
--- 39,45 ----
/* Make sure all fields of page are zero, as well as unused space */
MemSet(p, 0, pageSize);

+ /* p->pd_flags = 0; done by above MemSet */
p->pd_lower = SizeOfPageHeaderData;
p->pd_upper = pageSize - specialSize;
p->pd_special = pageSize - specialSize;
***************
*** 73,78 ****
--- 74,80 ----
/* Check normal case */
if (PageGetPageSize(page) == BLCKSZ &&
PageGetPageLayoutVersion(page) == PG_PAGE_LAYOUT_VERSION &&
+ (page->pd_flags & ~PD_VALID_FLAG_BITS) == 0 &&
page->pd_lower >= SizeOfPageHeaderData &&
page->pd_lower <= page->pd_upper &&
page->pd_upper <= page->pd_special &&
***************
*** 165,178 ****
else
{
/* offsetNumber was not passed in, so find a free slot */
! /* look for "recyclable" (unused & deallocated) ItemId */
! for (offsetNumber = 1; offsetNumber < limit; offsetNumber++)
{
! itemId = PageGetItemId(phdr, offsetNumber);
! if (!ItemIdIsUsed(itemId) && ItemIdGetLength(itemId) == 0)
! break;
}
- /* if no free slot, we'll put it at limit (1st open slot) */
}

if (offsetNumber > limit)
--- 167,193 ----
else
{
/* offsetNumber was not passed in, so find a free slot */
! /* if no free slot, we'll put it at limit (1st open slot) */
! if (PageHasFreeLinePointers(phdr))
! {
! /* look for "recyclable" (unused & deallocated) ItemId */
! for (offsetNumber = 1; offsetNumber < limit; offsetNumber++)
! {
! itemId = PageGetItemId(phdr, offsetNumber);
! if (!ItemIdIsUsed(itemId) && ItemIdGetLength(itemId) == 0)
! break;
! }
! if (offsetNumber >= limit)
! {
! /* the hint is wrong, so reset it */
! PageClearHasFreeLinePointers(phdr);
! }
! }
! else
{
! /* don't bother searching if hint says there's no free slot */
! offsetNumber = limit;
}
}

if (offsetNumber > limit)
***************
*** 413,425 ****
pfree(itemidbase);
}

return (nline - nused);
}

/*
* PageGetFreeSpace
* Returns the size of the free (allocatable) space on a page,
! * deducted by the space needed for a new line pointer.
*/
Size
PageGetFreeSpace(Page page)
--- 428,446 ----
pfree(itemidbase);
}

+ /* Set hint bit for PageAddItem */
+ if (nused < nline)
+ PageSetHasFreeLinePointers(page);
+ else
+ PageClearHasFreeLinePointers(page);
+
return (nline - nused);
}

/*
* PageGetFreeSpace
* Returns the size of the free (allocatable) space on a page,
! * reduced by the space needed for a new line pointer.
*/
Size
PageGetFreeSpace(Page page)
Index: src/include/catalog/catversion.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/catalog/catversion.h,v
retrieving revision 1.388
diff -c -r1.388 catversion.h
*** src/include/catalog/catversion.h 20 Feb 2007 17:32:17 -0000 1.388
--- src/include/catalog/catversion.h 2 Mar 2007 00:48:26 -0000
***************
*** 53,58 ****
*/

/* yyyymmddN */
! #define CATALOG_VERSION_NO 200702202

#endif
--- 53,58 ----
*/

/* yyyymmddN */
! #define CATALOG_VERSION_NO 200703011

#endif
Index: src/include/storage/bufpage.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/storage/bufpage.h,v
retrieving revision 1.71
diff -c -r1.71 bufpage.h
*** src/include/storage/bufpage.h 21 Feb 2007 20:02:17 -0000 1.71
--- src/include/storage/bufpage.h 2 Mar 2007 00:48:26 -0000
***************
*** 90,95 ****
--- 90,96 ----
*
* pd_lsn - identifies xlog record for last change to this page.
* pd_tli - ditto.
+ * pd_flags - flag bits.
* pd_lower - offset to start of free space.
* pd_upper - offset to end of free space.
* pd_special - offset to start of special space.
***************
*** 98,105 ****
* The LSN is used by the buffer manager to enforce the basic rule of WAL:
* "thou shalt write xlog before data". A dirty buffer cannot be dumped
* to disk until xlog has been flushed at least as far as the page's LSN.
! * We also store the TLI for identification purposes (it is not clear that
! * this is actually necessary, but it seems like a good idea).
*
* The page version number and page size are packed together into a single
* uint16 field. This is for historical reasons: before PostgreSQL 7.3,
--- 99,107 ----
* The LSN is used by the buffer manager to enforce the basic rule of WAL:
* "thou shalt write xlog before data". A dirty buffer cannot be dumped
* to disk until xlog has been flushed at least as far as the page's LSN.
! * We also store the 16 least significant bits of the TLI for identification
! * purposes (it is not clear that this is actually necessary, but it seems
! * like a good idea).
*
* The page version number and page size are packed together into a single
* uint16 field. This is for historical reasons: before PostgreSQL 7.3,
***************
*** 119,125 ****
/* XXX LSN is member of *any* block, not only page-organized ones */
XLogRecPtr pd_lsn; /* LSN: next byte after last byte of xlog
* record for last change to this page */
! TimeLineID pd_tli; /* TLI of last change */
LocationIndex pd_lower; /* offset to start of free space */
LocationIndex pd_upper; /* offset to end of free space */
LocationIndex pd_special; /* offset to start of special space */
--- 121,129 ----
/* XXX LSN is member of *any* block, not only page-organized ones */
XLogRecPtr pd_lsn; /* LSN: next byte after last byte of xlog
* record for last change to this page */
! uint16 pd_tli; /* least significant bits of the TimeLineID
! * containing the LSN */
! uint16 pd_flags; /* flag bits, see below */
LocationIndex pd_lower; /* offset to start of free space */
LocationIndex pd_upper; /* offset to end of free space */
LocationIndex pd_special; /* offset to start of special space */
***************
*** 130,140 ****
typedef PageHeaderData *PageHeader;

/*
* Page layout version number 0 is for pre-7.3 Postgres releases.
* Releases 7.3 and 7.4 use 1, denoting a new HeapTupleHeader layout.
* Release 8.0 uses 2; it changed the HeapTupleHeader layout again.
* Release 8.1 uses 3; it redefined HeapTupleHeader infomask bits.
! * Release 8.3 uses 4; it changed the HeapTupleHeader layout again.
*/
#define PG_PAGE_LAYOUT_VERSION 4

--- 134,157 ----
typedef PageHeaderData *PageHeader;

/*
+ * pd_flags contains the following flag bits. Undefined bits are initialized
+ * to zero and may be used in the future.
+ *
+ * PD_HAS_FREE_LINES is set if there are any not-LP_USED line pointers before
+ * pd_lower. This should be considered a hint rather than the truth, since
+ * changes to it are not WAL-logged.
+ */
+ #define PD_HAS_FREE_LINES 0x0001 /* are there any unused line pointers? */
+
+ #define PD_VALID_FLAG_BITS 0x0001 /* OR of all valid pd_flags bits */
+
+ /*
* Page layout version number 0 is for pre-7.3 Postgres releases.
* Releases 7.3 and 7.4 use 1, denoting a new HeapTupleHeader layout.
* Release 8.0 uses 2; it changed the HeapTupleHeader layout again.
* Release 8.1 uses 3; it redefined HeapTupleHeader infomask bits.
! * Release 8.3 uses 4; it changed the HeapTupleHeader layout again, and
! * added the pd_flags field (by stealing some bits from pd_tli).
*/
#define PG_PAGE_LAYOUT_VERSION 4

***************
*** 299,313 ****
((((PageHeader) (page))->pd_lower - SizeOfPageHeaderData) \
/ sizeof(ItemIdData)))

#define PageGetLSN(page) \
(((PageHeader) (page))->pd_lsn)
#define PageSetLSN(page, lsn) \
(((PageHeader) (page))->pd_lsn = (lsn))

#define PageGetTLI(page) \
(((PageHeader) (page))->pd_tli)
#define PageSetTLI(page, tli) \
! (((PageHeader) (page))->pd_tli = (tli))

/* ----------------------------------------------------------------
* extern declarations
--- 316,342 ----
((((PageHeader) (page))->pd_lower - SizeOfPageHeaderData) \
/ sizeof(ItemIdData)))

+ /*
+ * Additional macros for access to page headers
+ */
#define PageGetLSN(page) \
(((PageHeader) (page))->pd_lsn)
#define PageSetLSN(page, lsn) \
(((PageHeader) (page))->pd_lsn = (lsn))

+ /* NOTE: only the 16 least significant bits are stored */
#define PageGetTLI(page) \
(((PageHeader) (page))->pd_tli)
#define PageSetTLI(page, tli) \
! (((PageHeader) (page))->pd_tli = (uint16) (tli))
!
! #define PageHasFreeLinePointers(page) \
! (((PageHeader) (page))->pd_flags & PD_HAS_FREE_LINES)
! #define PageSetHasFreeLinePointers(page) \
! (((PageHeader) (page))->pd_flags |= PD_HAS_FREE_LINES)
! #define PageClearHasFreeLinePointers(page) \
! (((PageHeader) (page))->pd_flags &= ~PD_HAS_FREE_LINES)
!

/* ----------------------------------------------------------------
* extern declarations
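
Two smaller pieces of the patch are the header sanity check on pd_flags and the narrowing of pd_tli to 16 bits. The following is a minimal sketch of both under stated assumptions: the toy_ names are hypothetical, the two constants simply copy the values the patch defines, and the timeline ID is assumed to be a 32-bit unsigned value, in line with the 4-byte size shown for the old pd_tli field.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_HAS_FREE_LINES  0x0001  /* same value as PD_HAS_FREE_LINES in the patch */
#define TOY_VALID_FLAG_BITS 0x0001  /* same value as PD_VALID_FLAG_BITS in the patch */

/* True if no bit outside the defined set is on, like the added header check. */
static bool
toy_flags_are_valid(uint16_t flags)
{
    return (flags & ~TOY_VALID_FLAG_BITS) == 0;
}

/* Keeps only the 16 least significant bits of a 32-bit timeline ID. */
static uint16_t
toy_truncate_tli(uint32_t tli)
{
    return (uint16_t) tli;
}
```

With these definitions, a flags value with any undefined bit set fails the check, and two timeline IDs that differ only above bit 15 are stored identically, which is why the patch comments say that only the low bits are kept.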
