Skip site navigation (1) Skip section navigation (2)

Re: add_path optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: add_path optimization
Date: 2009-02-28 02:35:34
Message-ID: 1268.1235788534@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
[ This patch is on the 2009-First commitfest list, but there was earlier
discussion to the effect that it might be a good idea to include in 8.4,
so I made some time to look at it. ]

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path().  I've been able
> to find two optimizations that seem to help significantly:

I got around to testing this patch a bit today, and I'm afraid it
doesn't look very good from here.  The test case I used was a psql
script with 10000 repetitions of

explain select * from information_schema.table_constraints
  join information_schema.referential_constraints
  using (constraint_catalog,constraint_schema,constraint_name);

which isn't terribly creative, but those two views incorporate multiple
joins so I figured this was a reasonably middle-of-the-road planner
exercise.

I first tried just the compare_fuzzy_path_costs() change and really
couldn't measure any reliable difference.  oprofile output for CVS HEAD
looks like

PU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) 
with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name               symbol name
149294    9.7519  postgres                 AllocSetAlloc
85437     5.5808  postgres                 SearchCatCache
45520     2.9734  postgres                 copyObject
39525     2.5818  postgres                 add_path
35878     2.3436  postgres                 expression_tree_walker
29733     1.9422  libc-2.8.so              memcpy
25560     1.6696  postgres                 MemoryContextAllocZeroAligned
20879     1.3638  libc-2.8.so              strlen
20521     1.3404  postgres                 add_paths_to_joinrel
20054     1.3099  postgres                 AllocSetFree
20023     1.3079  postgres                 cost_mergejoin
19363     1.2648  postgres                 nocachegetattr
17689     1.1554  libc-2.8.so              vfprintf
17335     1.1323  postgres                 generate_join_implied_equalities
16085     1.0507  postgres                 MemoryContextAlloc

and with the compare_fuzzy_path_costs() change

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) 
with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name               symbol name
144882    9.6951  postgres                 AllocSetAlloc
84205     5.6348  postgres                 SearchCatCache
43956     2.9414  postgres                 copyObject
37418     2.5039  postgres                 add_path
34865     2.3331  postgres                 expression_tree_walker
28718     1.9217  libc-2.8.so              memcpy
23643     1.5821  postgres                 MemoryContextAllocZeroAligned
20835     1.3942  libc-2.8.so              strlen
20065     1.3427  postgres                 cost_mergejoin
19672     1.3164  postgres                 AllocSetFree
18992     1.2709  postgres                 add_paths_to_joinrel
18802     1.2582  postgres                 nocachegetattr
17109     1.1449  libc-2.8.so              __printf_fp
16901     1.1310  libc-2.8.so              vfprintf
16159     1.0813  postgres                 hash_search_with_hash_value
16039     1.0733  postgres                 generate_join_implied_equalities
15554     1.0408  postgres                 MemoryContextAlloc
14981     1.0025  postgres                 expression_tree_mutator

Note that the compiler is inlining compare_fuzzy_path_costs in both
cases.

I then added the add_similar_paths change, and got this:

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) 
with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name               symbol name
142645    9.6668  postgres                 AllocSetAlloc
83161     5.6357  postgres                 SearchCatCache
44660     3.0265  postgres                 copyObject
35762     2.4235  postgres                 expression_tree_walker
28427     1.9264  postgres                 add_path
27970     1.8955  libc-2.8.so              memcpy
23317     1.5801  postgres                 MemoryContextAllocZeroAligned
20685     1.4018  libc-2.8.so              strlen
19646     1.3314  postgres                 add_paths_to_joinrel
19129     1.2963  postgres                 AllocSetFree
18065     1.2242  postgres                 cost_mergejoin
17768     1.2041  postgres                 nocachegetattr
17249     1.1689  postgres                 generate_join_implied_equalities
16784     1.1374  libc-2.8.so              vfprintf
15740     1.0667  libc-2.8.so              __printf_fp
15642     1.0600  postgres                 MemoryContextAlloc
15194     1.0297  postgres                 expression_tree_mutator
15009     1.0171  postgres                 hash_search_with_hash_value
...
7746      0.5249  postgres                 add_similar_paths

where at least you can see a change in add_path's percentage, although
overall runtime did not change measurably; and in any case whatever was
saved seems to have gone into add_similar_paths.

It gets worse though: add_similar_paths is flat *wrong*.  It compares
each successive path to only the current cheapest-total candidate,
and thus is quite likely to seize on the wrong cheapest-startup path.
I tried fixing its loop logic like so:

    for (i = 1; i < npath; ++i)
    {
        if (compare_fuzzy_path_costs(cheapest_total, paths[i], TOTAL_COST) > 0)
            cheapest_total = paths[i];
        if (compare_fuzzy_path_costs(cheapest_startup, paths[i], STARTUP_COST) > 0)
            cheapest_startup = paths[i];
    }

(obviously, this is with compare_fuzzy_path_costs reverted to its form
in HEAD) and got this:

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) 
with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name               symbol name
143675    9.6105  postgres                 AllocSetAlloc
83809     5.6060  postgres                 SearchCatCache
43837     2.9323  postgres                 copyObject
35790     2.3940  postgres                 expression_tree_walker
29870     1.9980  postgres                 compare_fuzzy_path_costs
28521     1.9078  libc-2.8.so              memcpy
23448     1.5685  postgres                 MemoryContextAllocZeroAligned
21049     1.4080  libc-2.8.so              strlen
18343     1.2270  postgres                 add_path
18154     1.2143  postgres                 AllocSetFree
18119     1.2120  postgres                 cost_mergejoin
18036     1.2064  postgres                 add_paths_to_joinrel
17791     1.1901  postgres                 nocachegetattr
17277     1.1557  postgres                 generate_join_implied_equalities
16956     1.1342  libc-2.8.so              vfprintf
16125     1.0786  postgres                 hash_search_with_hash_value
15733     1.0524  libc-2.8.so              __printf_fp
15387     1.0292  postgres                 MemoryContextAlloc
...
5759      0.3852  postgres                 add_similar_paths

Here we have a total of 3.6% spent in code that had used ~2.5% before
we started.  Not very promising.

Now, maybe if we could force the compiler to inline
compare_fuzzy_path_costs we could buy some of that back.  Another
possibility is to contort add_similar_path some more so that it avoids
double comparisons so long as cheapest_total and cheapest_startup are
the same path; I have a feeling that's not going to win much though.

So it's kind of looking like a dead end from here.  But in any case the
patch isn't acceptable as-is with that fundamental logic error in
add_similar_paths, so I'm bouncing it back for rework.

			regards, tom lane

In response to

Responses

pgsql-hackers by date

Next:From: Robert HaasDate: 2009-02-28 03:38:48
Subject: Re: add_path optimization
Previous:From: James PyeDate: 2009-02-28 02:19:51
Subject: Re: xpath processing brain dead

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group