From: | PFC <lists(at)peufeu(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Faster Updates |
Date: | 2006-06-03 15:12:14 |
Message-ID: | op.taksiop4cigqcu@apollo13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
Sometimes people complain that UPDATE is slow in postgres. UPDATE...
- generates dead tuples which must be vacuumed.
- needs to hit all indexes even if only one column was modified.
From what I know UPDATE creates a new copy of the old row with the
relevant C/TID's, then indexes it. On COMMIT the old version becomes dead
but stays in the table and indexes until VACUUM.
I propose a simple idea, which may be idiotic, but who knows.
When a row is UPDATED, instead of storing a new copy of the entire row,
only a differential is stored. The old row stays in the page anyway, so we
might as well only store the binary encoded equivalent of "Use the row
version number X and change column A to value Y".
This is possible only if the differential fits in the free space on the
page.
In this case, a lot less dead space is generated. VACUUM would
consolidate the differentials for commited transactions into a new base
value for this row.
While reading the page looking for a specific version of a row, all
differences would need to be consolidated. This adds overhead, but it
might be a win.
With this method, it could be possible to avoid updating the indexes for
unmodified columns. This is a big win.
What do you think ?
From | Date | Subject | |
---|---|---|---|
Next Message | Tino Wildenhain | 2006-06-03 15:26:55 | Re: COPY (query) TO file |
Previous Message | PFC | 2006-06-03 14:59:29 | Re: COPY (query) TO file |