From: | Craig James <cjames(at)emolecules(dot)com> |
---|---|
To: | David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Re: pg_basebackup bug: base backup is double the size of the database |
Date: | 2015-01-22 15:46:18 |
Message-ID: | CAFwQ8re4_9x4i3UW9q3wd1Pq0LPgbPN4jyBuUBiQN5RHNSSJQQ@mail.gmail.com |
Lists: | pgsql-admin |
On Wed, Jan 21, 2015 at 10:02 PM, David G Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> Craig James-2 wrote
> > We've encountered a serious bug with pg_basebackup. It seems to be
> > following hard links and duplicating all files in the tablespaces rather
> > than preserving links.
>
> This entire sentence doesn't make sense to me. How does one "follow" a
> hard-link? A soft-link yes but a hard-link is an alias to actual data. I'm
> not sure directory hard-linking is even allowed or used so following in that
> sense don't compute...
>
See the man page for rsync, the -H option, which explains it better:
    -H, --hard-links
        This tells rsync to look for hard-linked files in the transfer
        and link together the corresponding files on the receiving side.
        Without this option, hard-linked files in the transfer are
        treated as though they were separate files.
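To make the rsync wording concrete, here is a minimal Python sketch of what "treated as though they were separate files" means. It uses only the standard library; the file names `a` and `b` and the function name are made up for illustration, and it says nothing about how pg_basebackup is actually implemented.

```python
import os
import shutil
import tempfile


def demo_hard_link_copy():
    """Return (same_inode_in_source, link_count, same_inode_in_copy)."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.mkdir(src)
        os.mkdir(dst)

        # One file with two names: "b" is a hard link to "a".
        a = os.path.join(src, "a")
        b = os.path.join(src, "b")
        with open(a, "wb") as f:
            f.write(b"x" * 4096)
        os.link(a, b)

        same_in_source = os.stat(a).st_ino == os.stat(b).st_ino
        link_count = os.stat(a).st_nlink

        # A naive copy handles each name on its own, so the data is
        # written twice and the link relationship is lost.
        for name in ("a", "b"):
            shutil.copy2(os.path.join(src, name), os.path.join(dst, name))

        same_in_copy = (
            os.stat(os.path.join(dst, "a")).st_ino
            == os.stat(os.path.join(dst, "b")).st_ino
        )
        return same_in_source, link_count, same_in_copy
```

In the source directory the two names share one inode with a link count of 2; after the naive copy they are two independent files, which is the doubling I am describing.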
> > My guess is that pg_basebackup is using (or doing the equivalent of)
> > rsync(1) without the --hard-links option, and that these hard links were
> > created by pg_upgrade when we went from 8.4.17 to 9.3.5.
>
> And how, exactly, did you perform the pg_upgrade. As mentioned down-thread
> pg_upgrade does use hard links; specifically to avoid duplication of data
> (in exchange you lose the ability to easily fall back to the old database
> version). I'm doubtful that it, by itself, is contributing to this problem
> but again my experience in this area is limited. But what you have shown us
> to this point is far from conclusive.
>
I'm pretty sure I understand how this happened, but it's speculation.
This database lives in /data/postgres-9.3, but PGDATA points to /postgres,
which is a symbolic link to /data/postgres, which is a symbolic link to
postgres-9.3. The tablespaces are all in /data/postgres-9.3/tablespaces, but
the entries in the pg_tblspc directory are symbolic links to
/postgres/tablespaces (which do in fact resolve correctly), for example:
# ls -l /data/postgres-9.3/main/pg_tblspc/16747
lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
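A quick way to confirm that two differently spelled paths end up in the same place is to resolve every symbolic link and compare the results. This is only a sketch: the helper name `same_location` is hypothetical, and the paths in the `__main__` part are the ones quoted in this message, so they will only mean something on this server.

```python
import os


def same_location(path_a, path_b):
    """True if both paths name the same location once all symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


if __name__ == "__main__":
    # Paths taken from this message; on any other machine they will
    # simply not exist and realpath returns them unchanged.
    print(os.path.realpath("/postgres/tablespaces"))
    print(same_location("/postgres/tablespaces",
                        "/data/postgres-9.3/tablespaces"))
```

If the layout is what I described, the second line printed on this server should be `True`.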
Normally when pg_upgrade runs, you end up with two parallel directory
hierarchies, and $PGDATA points to the new one when you're done. But
because of the way our symbolic links work, both the new and the old
directories are in the /data/postgres-9.3/tablespaces directory. You can't
simply delete the old $PGDATA directory, because that would erase the
entire database.
I'll have to dig around to prove to myself that this is the case.
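One way to do that digging, sketched in Python under the assumption that the old and new cluster files sit under a single directory tree: walk the tree, group regular files by (device, inode), and compare the size a link-unaware copy would produce with the size actually stored on disk. The function name `scan` is hypothetical, and the root path would have to be supplied by whoever runs it.

```python
import os
import stat
from collections import defaultdict


def scan(root):
    """Return (apparent_bytes, unique_bytes, linked_groups) for a directory tree.

    apparent_bytes counts every file name separately, as a copy that
    ignores hard links would; unique_bytes counts each inode once.
    """
    paths_by_inode = defaultdict(list)
    size_by_inode = {}
    for dirpath, _dirnames, filenames in os.walk(root):  # does not follow symlinks
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and so on
            key = (st.st_dev, st.st_ino)
            paths_by_inode[key].append(path)
            size_by_inode[key] = st.st_size

    apparent = sum(size_by_inode[k] * len(p) for k, p in paths_by_inode.items())
    unique = sum(size_by_inode.values())
    linked = [sorted(p) for p in paths_by_inode.values() if len(p) > 1]
    return apparent, unique, linked
```

If the apparent size comes out at roughly twice the unique size and the linked groups pair up old and new files, that would support the explanation above; if not, the cause is something else.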
Craig