Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables

From: "Gurjeet Singh" <singh(dot)gurjeet(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "John Smith" <sodgodofall(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables
Date: 2008-02-29 13:10:07
Message-ID: 65937bea0802290510p36c5e789sd985523711ee34a2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Plausible theory, and nice explanation....

Try the following link (I had to wait for 50 sec for the link to appear, but
I guess the trade-off of getting knowledge in return is worth it :) )

http://www5.upload2.net/download/77fa86e16a02e52fd5439c76e148d231/47c7fdce/rfsLfnuVlYjEcCJ/basetables.tgz

Not sending attachment in this mail; that may cause the mail to be not
accepted by the list. I will try to send the attachment in the next mail, to
retain it in the mailing list for historica purposes.

Thanks and best regards,
--
gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune *
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device

On Fri, Feb 29, 2008 at 3:32 PM, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
wrote:

> John Smith wrote:
> > Architecture: Intel Core 2 Duo
> > OS: linux-2.6.20-gentoo-r8
> > Filesystem: ext3
> > Postgres v8.2.3 compiled with gcc 4.1.1-r3
> > RAM - 2GB
> > Shared buffers - 24MB
> > [All other Postgres configuration parameters are default values]
> >
> > Problem description:
> > COPY into temp table fails using a specific combination of
> > create/insert on temp tables, prepare/commit in subsequent
> > transactions. The "could not open relation" error occurs reliably.
> >
> > Steps to reproduce:
> >
> > Existing schema (scripts to create and populate these tables are
> > uploaded to
> http://upload2.net/page/download/rfsLfnuVlYjEcCJ/basetables.tgz.html
> > ):
>
> I can't get that link to work. Can you please email me the files
> offlist? Or upload somewhere else if they're too big for email.
>
> > Observations:
> > 1. The size of the data seems to matters. If the amount of data being
> > inserted is dropped to just one or two records per table, the error
> > doesn't happen.
> > 2. The order of columns for the select into temp2 matters. Changing
> > the order can cause the error to go away.
> > 3. If the prepare/commit is replaced with a "commit;" the error goes
> away.
> > 4. Removing "temp3" or "temp4" from the transaction causes one run of
> > the above statements to succeed, but if the sequence is issued in the
> > same PSQL session, the second one will fail.
> > 5. Given the current dataset, the error always occurs on line 926 of
> > the COPY (even if the values at line 926 are changed).
> > 6. <tablespace>/<database>/<oid> typically always corresponds to that
> > of temp2 on my system.
>
> I think I see what's happening here. We have restricted two-phase commit
> so that you're not supposed to be able to PREPARE TRANSACTION if the
> transaction has touched any temporary tables. That's because the 2nd
> phase commit can be performed from another backend, and another backend
> can't mess with another backend's temporary tables.
>
> However in this case, where you CREATE and DROP the temporary table in
> the same transaction, we don't detect that, and let the PREPARE
> TRANSACTION to finish. The detection relies on the lock manager, but
> we're not holding any locks on the dropped relation.
>
> I think we could in fact allow CREATE+DROP in same transaction, and
> remove the table immediately at PREPARE TRANSACTION, but what happens
> right now is that we store the relfilenode of the temp table to the
> two-phase state file in pg_twophase, for deletion at COMMIT/ROLLBACK
> PREPARED. But we don't store the fact that it's a temporary table, and
> therefore we try to unlink it like a normal table, and fail to purge the
> temp buffers of that table which causes problems later.
>
> Attached is a simple patch to fix that by disallowing
> CREATE+DROP+PREPARE TRANSACTION more reliably. It'd still be nice to
> debug the full test case of yours to verify that that's what's
> happening, though.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zdenek Kotala 2008-02-29 13:55:36 Why we panic in pglz_decompress
Previous Message Heikki Linnakangas 2008-02-29 13:07:11 Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables