Re: INHERITS and planning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Edmund Dengler <edmundd(at)eSentire(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: INHERITS and planning
Date: 2005-06-10 06:10:14
Message-ID: 18614.1118383814@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Edmund Dengler <edmundd(at)eSentire(dot)com> writes:
> Is there an issue when a large number of INHERITS tables exist for
> planning?

Well, there are a number of issues whenever a single query references
a whole lot of tables in any fashion. It's only with Neil Conway's
rewrite of the List package in 8.0 that we had any hope of less than
O(N^2) behavior for practically any measure of query complexity N.
I have been spending some time over the past week or so attacking
other O(N^2) behaviors, but it's not a finished project yet.

I tried to reproduce your issue by doing

create table p1 (f1 int, f2 bigint);

create table c0() inherits (p1);
create table c1() inherits (p1);
create table c2() inherits (p1);
...
create table c2298() inherits (p1);
create table c2299() inherits (p1);

and then profiling

select * from p1;

With no data in the tables, of course this is just measuring planning
time and executor startup/shutdown overhead. But I suppose that you
don't have a whole lot of data in the tables either, because the data
fetching stage is surely pretty linear and you'd not be complaining
about overhead if there were much data to be read.

What I see in the profile is

% cumulative self self total
time seconds seconds calls s/call s/call name
42.04 15.58 15.58 9214 0.00 0.00 list_nth_cell
20.29 23.10 7.52 34524302 0.00 0.00 SHMQueueNext
8.34 26.19 3.09 29939 0.00 0.00 LockCountMyLocks
5.64 28.28 2.09 2960617 0.00 0.00 AllocSetAlloc
2.37 29.16 0.88 2354 0.00 0.00 AllocSetCheck
2.29 30.01 0.85 302960 0.00 0.00 hash_search
2.13 30.80 0.79 2902873 0.00 0.00 MemoryContextAlloc

The list_nth operations are all coming from rt_fetch() macros, so we
could probably fix that by replacing rangetable Lists by arrays. This
seems doable, but also tedious and bug-prone; there are too many places
that figure they can randomly add onto rtable lists.

What I'm more interested in at the moment are the next two entries,
SHMQueueNext and LockCountMyLocks --- it turns out that almost all the
SHMQueueNext calls are coming from LockCountMyLocks, which is invoked
during LockAcquire. This is another O(N^2) loop, and it's really a
whole lot nastier than the rangetable ones, because it executes with the
LockMgrLock held.

I spent a little time trying to see if we could avoid doing
LockCountMyLocks altogether, but it didn't look very promising. What
I am thinking though is that we could implement LockCountMyLocks as
either a scan through all the proclocks attached to the target proc
(the current way) or as a scan through all the proclocks attached to
the target lock (proclocks are threaded both ways). There is no hard
upper bound on the number of locks a proc holds, whereas there is a
bound of MaxBackends on the number of procs linked to a lock. (Well,
theoretically it could be 2*MaxBackends considering the possibility
of session locks, but that could only happen if all the backends are
trying to vacuum the same relation.) So it seems like it might be a win
to scan over the per-lock list instead. But I'm very unsure about
what the *average* case is, instead of the worst case.

I'm also thinking that the shared memory lock structure may be
overdesigned now that we've introduced the backend-private LocalLock
table --- in particular, it's not clear why we still include transaction
IDs in PROCLOCKTAG rather than leaving the backend to track all the
different reasons why it wants to hold a lock. If we could get rid of
that then LockCountMyLocks reduces to a single PROCLOCK hashtable
lookup.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message frederic.germaneau 2005-06-10 06:18:31 IMPORTANT NOTIFICATION
Previous Message Ilja Golshtein 2005-06-10 05:52:53 Re: CREATE TEMP TABLE AS SELECT/ GET DIAGNOSTICS ROW_COUNT

Browse pgsql-hackers by date

  From Date Subject
Next Message Jamie Deppeler 2005-06-10 06:49:52 Make Problems
Previous Message Bruce Momjian 2005-06-10 04:02:00 Re: [HACKERS] PGPASSWORD and client tools