Re: including PID or backend ID in relpath of temp rels

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: including PID or backend ID in relpath of temp rels
Date: 2010-05-04 19:03:04
Message-ID: 20100504190304.GD3565@alvh.no-ip.org
Lists: pgsql-hackers

Robert Haas wrote:
> On Tue, May 4, 2010 at 2:06 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
> > Robert Haas wrote:
>
> Hey, thanks for writing back! I just spent the last few hours
> thinking about this and beating my head against the wall.

:-)

> >> [smgr.c,inval.c] Do we need to call CacheInvalidSmgr for temporary
> >> relations?  I think the only backend that can have an smgr reference
> >> to a temprel other than the owning backend is bgwriter, and AFAICS
> >> bgwriter will only have such a reference if it's responding to a
> >> request by the owning backend to unlink the associated files, in which
> >> case (I think) the owning backend will have no reference.
> >
> > Hmm, wasn't there a proposal to have the owning backend delete the files
> > instead of asking the bgwriter to?
>
> I did propose that upthread; it may have been proposed previously
> also. This might be worth doing independently of the rest of the patch
> (which I'm starting to fear is doomed, cue ominous soundtrack) since
> it would reduce the chance of orphaning data files and possibly
> simplify the logic also.

+1 for doing it separately, but hopefully that doesn't mean the rest of
this patch is doomed ...

> >> [dbsize.c] As with relcache.c, there's a problem if we're asked for
> >> the size of a temporary relation that is not our own: we can't call
> >> relpath() without knowing the ID of the owning backend, and there's no
> >> way to acquire that information from pg_class.  I guess we could just
> >> refuse to answer the question in that case, but that doesn't seem real
> >> cool.  Or we could physically scan the directory for files that match
> >> a suitably constructed wildcard, I suppose.
> >
> > I don't very much like the wildcard idea; but I don't think it's
> > unreasonable to refuse to provide a file size.  If the owning backend
> > has still got part of the table in local buffers, you'll get a
> > misleading answer, so perhaps it's best to not give an answer at all.
> >
> > Maybe this problem could be solved if we could somehow force that
> > backend to write out its local buffers, in which case it'd be nice to
> > have a solution to the dbsize problem.
>
> I'm sure we could add some kind of signaling mechanism that would tell
> all backends to flush their local buffers, but I'm not too sure it
> would help this case very much, because you likely wouldn't want to
> wait for all the backends to complete that process before reporting
> results.

Hmm, I was thinking of the pg_relation_size function -- given this new
mechanism, you could get an accurate size for other backends' temp
tables. I wasn't thinking of the pg_database_size function, and
perhaps it's better *not* to include temp tables in that report at all.
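
To make the wildcard alternative concrete, here is a minimal sketch of
the name matching a directory scan would need. The file name layout
"t<backend>_<relfilenode>" with an optional ".<segment>" suffix is an
assumption made up for illustration, not something the patch defines,
and match_temp_relfile() is a hypothetical helper. Fork suffixes are
not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: decide whether a directory entry names a temp
 * relation file for the given relfilenode, assuming the made-up layout
 * "t<backend>_<relfilenode>" optionally followed by ".<segment>".
 *
 * Returns 1 and stores the owning backend ID on a match, 0 otherwise.
 */
int
match_temp_relfile(const char *name, unsigned long relfilenode,
                   long *backend_id)
{
    char       *end;
    long        backend;
    unsigned long node;

    assert(name != NULL && backend_id != NULL);

    /* Must start with 't' followed by at least one digit. */
    if (name[0] != 't' || !isdigit((unsigned char) name[1]))
        return 0;
    backend = strtol(name + 1, &end, 10);

    /* Backend ID must be followed by '_' and at least one digit. */
    if (*end != '_' || !isdigit((unsigned char) end[1]))
        return 0;
    node = strtoul(end + 1, &end, 10);

    /* Optional ".<segment>" suffix; the segment must be all digits. */
    if (*end == '.')
    {
        if (!isdigit((unsigned char) end[1]))
            return 0;
        (void) strtoul(end + 1, &end, 10);
    }

    /* Anything left over means this is some other kind of file. */
    if (*end != '\0')
        return 0;
    if (node != relfilenode)
        return 0;

    *backend_id = backend;
    return 1;
}
```

A scan would call this on each entry in the database directory and add
up the sizes of the matching files. It would still miss whatever the
owning backend holds only in local buffers, which is the reason for
preferring not to answer at all.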

> >> [syncscan.c] It seems we pursue this optimization even for temprels; I
> >> can't think of why that would be useful in practice.  If it's useless
> >> overhead, should we skip it?  This is really independent of this
> >> project; just a side thought.
> >
> > Maybe recently used buffers are more likely to be in the OS page cache,
> > so perhaps it's not good to disable it.
>
> I don't get it. If the whole relation fits in the page cache, it
> doesn't much matter where you start a seqscan. If it doesn't,
> starting where the last one ended is anti-optimal.

Err, I was thinking that a syncscan started a bunch of pages earlier
than the point where the previous scan ended, but yeah, that's a bit
silly. Maybe we should just ignore syncscan for temp tables altogether,
as you propose.
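
Roughly, ignoring syncscan for temp tables would look like the sketch
below. This is not the actual heapam.c or syncscan.c code: ScanRel,
choose_start_block() and the size threshold parameter are all made-up
names, used only to show where the temp check would sit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the relation being scanned. */
typedef struct ScanRel
{
    int         is_temp;    /* nonzero for a temporary relation */
    unsigned    nblocks;    /* current length of the relation in blocks */
} ScanRel;

/*
 * Pick the block at which a sequential scan starts.
 *
 * shared_hint is the position last reported by another scan of the
 * same relation; threshold is the size below which synchronizing is
 * not worth the trouble.  Both are assumptions of this sketch.
 */
unsigned
choose_start_block(const ScanRel *rel, unsigned shared_hint,
                   unsigned threshold)
{
    assert(rel != NULL);

    /* Only the owning backend scans a temp rel: nothing to sync with. */
    if (rel->is_temp)
        return 0;

    /* Small relations are simply scanned from the beginning. */
    if (rel->nblocks <= threshold)
        return 0;

    /* Ignore a stale hint that points past the end of the relation. */
    if (shared_hint >= rel->nblocks)
        return 0;

    return shared_hint;
}
```

With the temp check first, a temp table never consults or depends on a
shared scan position, so it always starts at block 0.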

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
