Re: Sequential scans

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential scans
Date: 2007-05-03 19:04:47
Message-ID: 1178219087.28383.286.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Thu, 2007-05-03 at 19:27 +0100, Heikki Linnakangas wrote:
> I understand that the data structure keeps track of relations being
> scanned, with one entry per such relation. I think that's very good
> design, and I'm not suggesting to change that.
>
> But what's the right size for that? We don't know how many large tables
> there are in the database, so we can't use that. A hard-coded constant
> maybe? What would it be? I figured we could use NBackends, because there
> can't realistically be more than one large seq scan per backend going on
> at any point in time. That's an upper bound, in any real world
> application there's far less than that.

Ok, I understand you now. We'll set the maximum size of the table/list
to max_connections at startup.
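
For illustration, here's a minimal sketch of the sizing idea (the names
ScanLocEntry and ScanLocShmemSize are made up for this example, not
from the patch, and the typedefs just stand in for the real ones):

#include <stddef.h>

/* Hypothetical sketch, not the actual patch: a fixed-size table of
 * scan positions, sized to max_connections at startup. */
typedef unsigned int Oid;
typedef unsigned int BlockNumber;

typedef struct ScanLocEntry
{
    Oid         relid;   /* relation being scanned */
    BlockNumber block;   /* most recently reported block */
} ScanLocEntry;

/* One entry per backend is an upper bound, since a backend runs at
 * most one large seq scan at a time. */
size_t
ScanLocShmemSize(int max_connections)
{
    return (size_t) max_connections * sizeof(ScanLocEntry);
}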

> > Part of the reason I did this is because, with a Sync Scan, it seems
> > like there may be possible reasons to do multiple seq scans concurrently
> > in the same backend. This is completely contrived, but consider:
> >
> > SELECT * FROM bigtable UNION ALL SELECT * FROM bigtable;
> >
> > I'm not trying to argue for such a plan to exist right now, but if there
> > is a possibility that we'd want to create such plans in the future, I
> > think it's better to track by relation rather than by backend.
>
> Those two seqscans would be executed one after the other, not in
> parallel, so you would still have just one seq scan at any given point
> in time.

Right, I was just throwing out a hypothetical (albeit highly contrived)
situation in which it might help to do multiple scans at once. I think
it can be safely ignored for now, since the planner would never generate
such a plan and I can't think of a reasonable use case.

> >>> 100 is of course arbitrary, and that could grow or shrink automatically
> >>> at runtime.
> >> Except that it can't shrink, because we don't have any convenient moment
> >> to remove an entry from the list, other than dropping the least recently
> >> used entry out of the list when inserting a new one. And since it's in
> >> shared mem, the allocated size is fixed at boot up, so we can't grow it
> >> either.
> >>
> >
> > It can shrink when a new scan looks through the list and passes by
> > entries that don't need to be in the list.
>
> How do you identify an entry that doesn't need to be there?
>

Some type of clock-like algorithm might work, where a counter is
decremented when an element is passed over and incremented when it's
chosen. It's probably easier to just use a hash table, though.
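
As a rough sketch of what I mean (illustrative only, reusing the
made-up types from above; MAX_USAGE and clock_lookup are invented
names, not anything in the patch):

#define MAX_USAGE 5

typedef struct ClockEntry
{
    Oid relid;   /* relation being scanned */
    int usage;   /* 0 means the slot can be reclaimed */
} ClockEntry;

/* Return the index of relid's entry, bumping its counter; decay the
 * counters of entries passed over.  If relid isn't present, return a
 * reclaimable slot (usage == 0), or -1 if every slot is live. */
static int
clock_lookup(ClockEntry *entries, int n, Oid relid)
{
    int i;
    int free_slot = -1;

    for (i = 0; i < n; i++)
    {
        if (entries[i].relid == relid)
        {
            if (entries[i].usage < MAX_USAGE)
                entries[i].usage++;   /* chosen: bump its counter */
            return i;
        }
        if (entries[i].usage > 0)
            entries[i].usage--;       /* passed over: decay */
        if (entries[i].usage == 0)
            free_slot = i;            /* remember a reclaimable slot */
    }
    return free_slot;
}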

> > I don't want to over-complicate things, so if you think having a hash
> > table is a better, simpler, or more readable design, I'll go with that
> > instead of the list-only approach. The only reason I suggested the list-
> > only approach was to see if the hash table was really gaining anything
> > for us. In practice, it will be quite rare that more than 10 different
> > big relations are being scanned at once, and I don't know how much of a
> > gain a hash table is when there are (in all but exceptional cases) only
> > a few entries.
>
> How about implementing the list-only approach first, since you're going
> to need the LRU list anyway, and we can consider adding the hash
> table later?
>

Sounds good. Will do.
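
For reference, the list-only version might look roughly like this
(a sketch under the same assumptions as above; scan_locs and
scan_loc_get are invented names, and a real shared-memory version
would of course need locking, omitted here for brevity):

#include <string.h>

#define SCAN_LOC_CAPACITY 100   /* would really be max_connections */

static ScanLocEntry scan_locs[SCAN_LOC_CAPACITY];
static int num_scan_locs = 0;

/* Look up relid, keeping the array in LRU order by moving the found
 * entry to the front.  If absent, insert at the front, dropping the
 * least recently used entry when the list is full. */
static BlockNumber
scan_loc_get(Oid relid, BlockNumber start_block)
{
    int i;

    for (i = 0; i < num_scan_locs; i++)
    {
        if (scan_locs[i].relid == relid)
        {
            ScanLocEntry found = scan_locs[i];

            memmove(&scan_locs[1], &scan_locs[0],
                    i * sizeof(ScanLocEntry));
            scan_locs[0] = found;
            return found.block;
        }
    }

    if (num_scan_locs < SCAN_LOC_CAPACITY)
        num_scan_locs++;
    memmove(&scan_locs[1], &scan_locs[0],
            (num_scan_locs - 1) * sizeof(ScanLocEntry));
    scan_locs[0].relid = relid;
    scan_locs[0].block = start_block;
    return start_block;
}

With only a handful of live entries, the linear scan should be cheap,
which is the point of skipping the hash table for now.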

Regards,
Jeff Davis
