Skip site navigation (1) Skip section navigation (2)

Re: [GENERAL] Creation of tsearch2 index is very slow

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Stephan Vollmer <svollmer(at)gmx(dot)de>, pgsql-performance(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Creation of tsearch2 index is very slow
Date: 2006-01-20 19:14:29
Message-ID: 26153.1137784469@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-performance
[ thread moved to pgsql-performance ]

I've obtained a gprof profile on Stephan's sample case (many thanks for
providing the data, Stephan).  The command is
	CREATE INDEX foo ON publications_test USING gist (fti_title);
where fti_title is a tsvector column.  There are 236984 rows in the
table, most with between 4 and 10 words in fti_title.
sum(length(fti_title)) yields 1636202 ... not sure if this is a
relevant measure, however.

Using CVS tip with a fairly vanilla configuration (including
--enable-cassert), here are all the hotspots down to the 1% level:

  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 20.19      1.90     1.90   588976     0.00     0.00  gistchoose
 19.02      3.69     1.79   683471     0.00     0.00  XLogInsert
  5.95      4.25     0.56  3575135     0.00     0.00  LWLockAcquire
  4.46      4.67     0.42  3579005     0.00     0.00  LWLockRelease
  4.14      5.06     0.39  3146848     0.00     0.00  AllocSetAlloc
  3.72      5.41     0.35   236984     0.00     0.00  gistdoinsert
  3.40      5.73     0.32   876047     0.00     0.00  hash_search
  2.76      5.99     0.26  3998576     0.00     0.00  LockBuffer
  2.28      6.21     0.22 11514275     0.00     0.00  gistdentryinit
  1.86      6.38     0.18   841757     0.00     0.00  UnpinBuffer
  1.81      6.55     0.17 12201023     0.00     0.00  FunctionCall1
  1.81      6.72     0.17   237044     0.00     0.00  AllocSetCheck
  1.49      6.86     0.14   236984     0.00     0.00  gistmakedeal
  1.49      7.00     0.14 10206985     0.00     0.00  FunctionCall3
  1.49      7.14     0.14  1287874     0.00     0.00  MemoryContextAllocZero
  1.28      7.26     0.12   826179     0.00     0.00  PinBuffer
  1.17      7.37     0.11   875785     0.00     0.00  hash_any
  1.17      7.48     0.11  1857292     0.00     0.00  MemoryContextAlloc
  1.17      7.59     0.11   221466     0.00     0.00  PageIndexTupleDelete
  1.06      7.69     0.10  9762101     0.00     0.00  gistpenalty

Clearly, one thing that would be worth doing is suppressing the WAL
traffic when possible, as we already do for btree builds.  It seems
that gistchoose may have some internal ineffiency too --- I haven't
looked at the code yet.  The other thing that jumps out is the very
large numbers of calls to gistdentryinit, FunctionCall1, FunctionCall3.
Some interesting parts of the calls/calledby graph are:

-----------------------------------------------
                0.35    8.07  236984/236984      gistbuildCallback [14]
[15]    89.5    0.35    8.07  236984         gistdoinsert [15]
                0.14    3.55  236984/236984      gistmakedeal [16]
                1.90    0.89  588976/588976      gistchoose [17]
                0.07    0.83  825960/841757      ReadBuffer [19]
                0.09    0.10  825960/1287874     MemoryContextAllocZero [30]
                0.12    0.05 1888904/3998576     LockBuffer [29]
                0.13    0.00  825960/3575135     LWLockAcquire [21]
                0.10    0.00  825960/3579005     LWLockRelease [26]
                0.06    0.00  473968/3146848     AllocSetAlloc [27]
                0.03    0.00  473968/1857292     MemoryContextAlloc [43]
                0.02    0.00  825960/1272423     gistcheckpage [68]
-----------------------------------------------
                0.14    3.55  236984/236984      gistdoinsert [15]
[16]    39.2    0.14    3.55  236984         gistmakedeal [16]
                1.20    0.15  458450/683471      XLogInsert [18]
                0.01    0.66  224997/224997      gistxlogInsertCompletion [20]
                0.09    0.35  444817/444817      gistgetadjusted [23]
                0.08    0.17  456801/456804      formUpdateRdata [32]
                0.17    0.01  827612/841757      UnpinBuffer [35]
                0.11    0.00  221466/221466      PageIndexTupleDelete [42]
                0.02    0.08  456801/460102      gistfillbuffer [45]
                0.06    0.04    1649/1649        gistSplit [46]
                0.08    0.00  685099/3579005     LWLockRelease [26]
                0.03    0.05  446463/446463      gistFindCorrectParent [50]
                0.04    0.02  685099/3998576     LockBuffer [29]
                0.04    0.00    1649/1649        gistextractbuffer [58]
                0.03    0.00  460102/460121      write_buffer [66]
                0.02    0.00  825960/826092      ReleaseBuffer [69]
                0.02    0.00  221402/221402      gistadjscans [82]
                0.00    0.00    1582/1582        gistunion [131]
                0.00    0.00    1649/1649        formSplitRdata [147]
                0.00    0.00    1649/1649        gistjoinvector [178]
                0.00    0.00       3/3           gistnewroot [199]
                0.00    0.00  458450/461748      gistnospace [418]
                0.00    0.00  458450/458450      WriteNoReleaseBuffer [419]
                0.00    0.00    1652/1671        WriteBuffer [433]
-----------------------------------------------
                1.90    0.89  588976/588976      gistdoinsert [15]
[17]    29.7    1.90    0.89  588976         gistchoose [17]
                0.25    0.17 9762101/10892174     FunctionCall3 <cycle 1> [38]
                0.18    0.14 9762101/11514275     gistdentryinit [28]
                0.10    0.00 9762101/9762101     gistpenalty [47]
                0.04    0.02  588976/1478610     gistDeCompressAtt [39]
-----------------------------------------------
                0.00    0.00       1/683471      gistbuild [12]
                0.00    0.00       1/683471      log_heap_update [273]
                0.00    0.00       1/683471      RecordTransactionCommit [108]
                0.00    0.00       1/683471      smgrcreate [262]
                0.00    0.00       3/683471      gistnewroot [199]
                0.00    0.00       5/683471      heap_insert [116]
                0.00    0.00      12/683471      _bt_insertonpg [195]
                0.59    0.07  224997/683471      gistxlogInsertCompletion [20]
                1.20    0.15  458450/683471      gistmakedeal [16]
[18]    21.4    1.79    0.22  683471         XLogInsert [18]
                0.11    0.00  683471/3575135     LWLockAcquire [21]
                0.08    0.00  687340/3579005     LWLockRelease [26]
                0.03    0.00  683471/683471      GetCurrentTransactionIdIfAny [65]
                0.01    0.00   15604/15604       AdvanceXLInsertBuffer [111]
                0.00    0.00       3/10094       BufferGetBlockNumber [95]
                0.00    0.00    3869/3870        XLogWrite [281]
                0.00    0.00    3870/3871        LWLockConditionalAcquire [428]
                0.00    0.00       3/3           BufferGetFileNode [611]
-----------------------------------------------
                0.00    0.00    3164/11514275     gistunion [131]
                0.01    0.00  270400/11514275     gistSplit [46]
                0.03    0.02 1478610/11514275     gistDeCompressAtt [39]
                0.18    0.14 9762101/11514275     gistchoose [17]
[28]     4.0    0.22    0.16 11514275         gistdentryinit [28]
                0.16    0.00 11514275/12201023     FunctionCall1 [36]
-----------------------------------------------
                0.00    0.00      67/12201023     index_endscan <cycle 1> [167]
                0.01    0.00  686681/12201023     gistcentryinit [62]
                0.16    0.00 11514275/12201023     gistdentryinit [28]
[36]     1.8    0.17    0.00 12201023         FunctionCall1 [36]
                0.00    0.00      67/67          btendscan [231]
                0.00    0.00 12200956/22855929     data_start [414]
-----------------------------------------------
                                  67             index_beginscan_internal <cycle 1> [169]
                0.01    0.01  444817/10892174     gistgetadjusted [23]
                0.25    0.17 9762101/10892174     gistchoose [17]
[38]     1.5    0.14    0.00 10206985         FunctionCall3 <cycle 1> [38]
                0.00    0.00 10206918/22855929     data_start [414]
                0.00    0.00      67/67          btbeginscan [486]
                                  67             RelationGetIndexScan <cycle 1> [212]

Now that we have some data, we can start to think about how to improve
matters ...

			regards, tom lane

In response to

Responses

pgsql-performance by date

Next:From: Rikard PavelicDate: 2006-01-20 19:38:23
Subject: Re: [PERFORMANCE] Stored Procedures
Previous:From: Jim C. NasbyDate: 2006-01-20 19:10:31
Subject: Re: [PERFORMANCE] Stored Procedures

pgsql-general by date

Next:From: Martijn van OosterhoutDate: 2006-01-20 20:10:36
Subject: Re: [GENERAL] Creation of tsearch2 index is very slow
Previous:From: Brian DimelerDate: 2006-01-20 18:57:58
Subject: Re: sequences not restoring properly

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group