Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "John Smith" <sodgodofall(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables
Date: 2008-02-29 10:02:12
Message-ID: 47C7D824.5000707@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

John Smith wrote:
> Architecture: Intel Core 2 Duo
> OS: linux-2.6.20-gentoo-r8
> Filesystem: ext3
> Postgres v8.2.3 compiled with gcc 4.1.1-r3
> RAM - 2GB
> Shared buffers - 24MB
> [All other Postgres configuration parameters are default values]
>
> Problem description:
> COPY into temp table fails using a specific combination of
> create/insert on temp tables, prepare/commit in subsequent
> transactions. The "could not open relation" error occurs reliably.
>
> Steps to reproduce:
>
> Existing schema (scripts to create and populate these tables are
> uploaded to http://upload2.net/page/download/rfsLfnuVlYjEcCJ/basetables.tgz.html
> ):

I can't get that link to work. Can you please email me the files
offlist? Or upload somewhere else if they're too big for email.

> Observations:
> 1. The size of the data seems to matters. If the amount of data being
> inserted is dropped to just one or two records per table, the error
> doesn't happen.
> 2. The order of columns for the select into temp2 matters. Changing
> the order can cause the error to go away.
> 3. If the prepare/commit is replaced with a "commit;" the error goes away.
> 4. Removing "temp3" or "temp4" from the transaction causes one run of
> the above statements to succeed, but if the sequence is issued in the
> same PSQL session, the second one will fail.
> 5. Given the current dataset, the error always occurs on line 926 of
> the COPY (even if the values at line 926 are changed).
> 6. <tablespace>/<database>/<oid> typically always corresponds to that
> of temp2 on my system.

I think I see what's happening here. We have restricted two-phase commit
so that you're not supposed to be able to PREPARE TRANSACTION if the
transaction has touched any temporary tables. That's because the 2nd
phase commit can be performed from another backend, and another backend
can't mess with another backend's temporary tables.

However in this case, where you CREATE and DROP the temporary table in
the same transaction, we don't detect that, and let the PREPARE
TRANSACTION to finish. The detection relies on the lock manager, but
we're not holding any locks on the dropped relation.

I think we could in fact allow CREATE+DROP in same transaction, and
remove the table immediately at PREPARE TRANSACTION, but what happens
right now is that we store the relfilenode of the temp table to the
two-phase state file in pg_twophase, for deletion at COMMIT/ROLLBACK
PREPARED. But we don't store the fact that it's a temporary table, and
therefore we try to unlink it like a normal table, and fail to purge the
temp buffers of that table which causes problems later.

Attached is a simple patch to fix that by disallowing
CREATE+DROP+PREPARE TRANSACTION more reliably. It'd still be nice to
debug the full test case of yours to verify that that's what's
happening, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
2pc-temp-table-fix-1.patch text/x-diff 5.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2008-02-29 11:28:49 Re: CREATE TABLE, load and freezing
Previous Message Tom Lane 2008-02-29 06:37:48 Re: Batch update of indexes on data loading