Building large GiST indexes by simply inserting all the tuples tends to be slow, because if the index tuples are scattered across the index and the index is large enough to not fit in cache, the insertions need to perform a lot of random I/O. Beginning in version 9.2, PostgreSQL supports a more efficient method to build GiST indexes based on buffering, which can dramatically reduce the number of random I/Os needed for non-ordered data sets. For well-ordered data sets the benefit is smaller or non-existent, because only a small number of pages receive new tuples at a time, and those pages fit in cache even if the index as whole does not.
However, buffering index build needs to call the
penalty function more often,
which consumes some extra CPU resources. Also, the buffers used
in the buffering build need temporary disk space, up to the
size of the resulting index. Buffering can also influence the
quality of the resulting index, in both positive and negative
directions. That influence depends on various factors, like the
distribution of the input data and the operator class
By default, a GiST index build switches to the buffering method when the index size reaches effective_cache_size. It can be manually turned on or off by the buffering parameter to the CREATE INDEX command. The default behavior is good for most cases, but turning buffering off might speed up the build somewhat if the input data is ordered.
Please use this form to add your own comments regarding your experience with particular features of PostgreSQL, clarifications of the documentation, or hints for other users. Please note, this is not a support forum, and your IP address will be logged. If you have a question or need help, please see the faq, try a mailing list, or join us on IRC. Note that submissions containing URLs or other keywords commonly found in 'spam' comments may be silently discarded. Please contact the webmaster if you think this is happening to you in error.
Proceed to the comment form.