Skip site navigation (1) Skip section navigation (2)

TwoPO: experimental join order algorithm

From: Adriano Lange <alange0001(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: TwoPO: experimental join order algorithm
Date: 2010-07-24 13:20:01
Message-ID: 4C4AE881.2030904@gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi,

I'd like to release the last version of my experimental join order
algorithm (TwoPO - Two Phase Optimization [1]):

http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of
ideas, which need to be refined and evaluated. As the join order
optimization is a hard problem, the evaluation of a search strategy is
also a hard task. Therefore, I think the most important TODO item
related to replacement of GEQO algorithm is to define a set of
evaluation criteria considered as relevant.

TwoPO is encapsulated in a plug-in called LJQO (Large Join Query
Optimization [2]). This plug-in has two main GUC variables:

ljqo_threshold = N (like geqo_threshold)
ljqo_algorithm = {twopo|geqo}

As its name means, TwoPO has internally two search strategies that
constitute its two phases of optimization:

 * Iterative Improvement (II) – Biased Sampling + Local Search
 * Simulated Annealing (SA)

This algorithm also works in two search spaces:

 * deep-trees (subset of bushy-trees)
    - list of baserels
    - initial states: biased sampling using
          Prim algorithm over join graph (new, very efficient)
    - moves: swap, 3cycle [2]
 * bushy-trees
    - binary join tree representation
    - initial states: biased sampling using
          Kruskal's algorithm over join graph [3,4].
    - moves: associative

You can modify the functionality of TwoPO through the following parameters:

twopo_bushy_space = {true|false}
  - set it to false if you want only deep-trees
    default=true
twopo_heuristic_states = {true|false}
  - enables heuristic to initial states
    default=true
twopo_ii_stop = Int
  - number of initial states
    default=10
twopo_ii_improve_states = {true|false}
  - find local-minimum of each initial state
    default=true
twopo_sa_phase = {true|false}
  - enables Simulated Annealing (SA) phase
    default=true
twopo_sa_initial_temperature = Float
  - initial temperature for SA phase
    default=0.1
twopo_sa_temperature_reduction = Float
  - temperature reduction
    default=0.95
twopo_sa_equilibrium = Int
  - number of states generated for each temperature
    (Int * State Size)
    default=16


References:

[1] Yannis E. Ioannidis e Younkyung Kang. Randomized algorithms for
optimizing large join queries. SIGMOD Rec., 19(2):312-321, 1990.

[2] Arun Swami e Anoop Gupta. Optimization of large join queries. SIGMOD
'88: Proceedings of the 1988 ACM SIGMOD international conference on
Management of data, pages 8-17, New York, NY, USA, 1988. ACM.

[3] P.B. Guttoski, M. S. Sunye, e F. Silva. Kruskal's algorithm for
query tree optimization. IDEAS '07: Proceedings of the 11th
International Database Engineering and Applications Symposium, pages
296-302, Washington, DC, USA, 2007. IEEE Computer Society.

[4] Florian Waas e Arjan Pellenkoft. Join order selection - good enough
is easy. BNCOD, pages 51-67, 2000.


Att,
Adriano Lange


Responses

pgsql-hackers by date

Next:From: Ron MayerDate: 2010-07-24 14:02:54
Subject: Re: antisocial things you can do in git (but not CVS)
Previous:From: zbDate: 2010-07-24 13:06:20
Subject: Re: Review of Synchronous Replication patches

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group