| From: | Bruce Momjian <bruce(at)momjian(dot)us> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: RelationCreateStorage can orphan files |
| Date: | 2010-09-16 22:41:50 |
| Message-ID: | 201009162241.o8GMfo719644@momjian.us |
| Lists: | pgsql-hackers |

Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I notice that RelationCreateStorage() creates the main fork on disk
> > before writing (let alone flushing) WAL. So if PG gets killed at that
> > point, we end up with an orphaned file on disk. I think that we could
> > even extend the relation a few times before WAL gets written, so I
> > don't even think it's necessarily a zero-size file. We could perhaps
> > avoid this by writing and flushing a WAL record that includes the
> > creating XID before touching the disk; when we replay the record, we
> > create the file but then delete it if the XID fails to commit before
> > recovery ends. But I guess maybe our feeling is that it's just not
> > worth taking a performance hit for this?
>
> That design is intentional. If the file create fails, and you've
> already written a WAL record that says you created it, you are flat
> out screwed. You can't even PANIC --- if you do, then the replay of
> the WAL record will likely fail and PANIC again, leaving the database
> dead in the water.
>
> Orphaned files, in contrast, are completely non-dangerous --- the worst
> they can do is waste a little bit of disk space. That's a cheap price
> to pay for not having an unrecoverable database after a create failure.
>
> This is essentially the same reason why CREATE DATABASE and related
> commands xlog directory copy operations only after completing them.
> That potentially wastes much more than a few blocks; but it's still
> non-dangerous, and far safer than the alternative.

Is this documented in a C comment somewhere? Obviously not in a place
Robert found.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
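
The ordering argument quoted above can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not PostgreSQL code, the "WAL" is a plain Python list, and every name in it (`create_then_log`, `log_then_create`, `replay`, `orphans`, `CreateFailed`) is hypothetical and invented for illustration. It compares the two orderings: creating the file before logging can leave an orphan after a crash, while logging before creating can leave a record that replay cannot redo.

```python
import os
import tempfile


class CreateFailed(Exception):
    """Raised when the toy storage layer cannot create a file."""


def create_file(directory, name, fail=False):
    """Create an empty file, or simulate a failed create when fail=True."""
    if fail:
        raise CreateFailed(name)
    path = os.path.join(directory, name)
    with open(path, "x"):
        pass
    return path


def create_then_log(directory, wal, name, fail_create=False, crash_before_log=False):
    """Ordering defended in the thread: touch the disk first, log afterwards."""
    create_file(directory, name, fail=fail_create)
    if crash_before_log:
        return  # simulated kill: the file exists but the log never mentions it
    wal.append(("create", name))


def log_then_create(directory, wal, name, fail_create=False):
    """Alternative ordering: log first, then touch the disk."""
    wal.append(("create", name))
    create_file(directory, name, fail=fail_create)


def replay(directory, wal, fail_create=False):
    """Redo every logged create, skipping files that already exist."""
    for op, name in wal:
        if op == "create" and not os.path.exists(os.path.join(directory, name)):
            create_file(directory, name, fail=fail_create)


def orphans(directory, wal):
    """Files on disk that no log record refers to."""
    logged = {name for op, name in wal if op == "create"}
    return sorted(set(os.listdir(directory)) - logged)


def demo():
    # Case 1: create, then crash before logging. Replay has nothing to redo,
    # so recovery succeeds and the only cost is one orphaned file.
    with tempfile.TemporaryDirectory() as d:
        wal = []
        create_then_log(d, wal, "rel_1", crash_before_log=True)
        replay(d, wal)
        leftover = orphans(d, wal)

    # Case 2: log, then the create fails. The log now claims a file that
    # cannot be created, so replay hits the same failure again.
    with tempfile.TemporaryDirectory() as d:
        wal = []
        try:
            log_then_create(d, wal, "rel_2", fail_create=True)
        except CreateFailed:
            pass
        try:
            replay(d, wal, fail_create=True)
            recovered = True
        except CreateFailed:
            recovered = False

    return leftover, recovered
```

Calling `demo()` returns the orphan list from the first case and whether replay succeeded in the second. The model deliberately ignores everything else a real server does (fsync, transaction status, cleanup of failed transactions), so it only shows why the two failure modes differ in severity.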