From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Hannu Krosing <hannu(at)krosing(dot)net> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, jd(at)commandprompt(dot)com, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hot Standby (v9d) |
Date: | 2009-02-03 14:55:45 |
Message-ID: | 49885AF1.6070209@anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 02/03/2009 02:26 PM, Hannu Krosing wrote:
>> I don't see any way around the fact that when a tuple is removed, it's
>> gone and can't be accessed by queries. Either you don't remove it, or
>> you kill the query.
> Actually we came up with a solution to this - use filesystem level
> snapshots (like LVM2+XFS or ZFS), and redirect backends with
> long-running queries to use fs snapshot mounted to a different
> mountpoint.
Isn't that really, really expensive?
A single write on the master logical volume yields writes of PE size
for _every_ single snapshot (the first time the block is touched).
Considering that there could be quite a few such snapshots, I don't think
that this is really feasible - I/O might quite possibly be saturated.
The default PE size is 4MB - but on most bigger systems it is set to a
bigger size, so it's just getting worse for bigger systems.
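To put rough numbers on that, here is a back-of-envelope sketch of the cost model described above: one PE-sized copy per snapshot, paid the first time a block is touched. The function name and the 8 kB database page size used for comparison are assumptions for illustration only, and nothing here is measured on a real LVM setup.

```python
MiB = 1024 * 1024
KiB = 1024


def cow_bytes_copied(num_snapshots, extent_bytes, first_touch=True):
    """Extra bytes copied for ONE write to the origin volume.

    Model taken from the paragraph above (an assumption, not a measurement):
    every snapshot receives its own extent-sized copy, but only the first
    time that extent is touched after the snapshot was taken.
    """
    if not first_touch:
        return 0
    return num_snapshots * extent_bytes


if __name__ == "__main__":
    page_bytes = 8 * KiB      # assumed size of the page being written
    extent_bytes = 4 * MiB    # PE size quoted above
    for snapshots in (1, 4, 16):
        extra = cow_bytes_copied(snapshots, extent_bytes)
        factor = extra // page_bytes
        print(f"{snapshots:2d} snapshot(s): {extra // MiB} MiB copied "
              f"for one {page_bytes // KiB} KiB write ({factor}x)")
```

Under this model the cost grows linearly with both the number of snapshots and the PE size, which is why a bigger PE setting makes it worse.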
Sure, one might say that this is an LVM deficiency - but I do not know
of any snapshot-able block layer doing it that way.
Andres