Re: gsoc, oprrest function for text search take 2

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl>
Cc: "Postgres - Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gsoc, oprrest function for text search take 2
Date: 2008-08-11 08:40:40
Message-ID: 489FFB08.6050709@enterprisedb.com
Lists: pgsql-hackers

Jan Urbański wrote:
> Heikki Linnakangas wrote:
>> Jan Urbański wrote:
>>> Another thing are cstring_to_text_with_len calls. I'm doing them so I
>>> can use bttextcmp in bsearch(). I think I could come up with a
>>> dedicated function to return text Datums and WordEntries (read:
>>> non-NULL terminated strings with a given length).
>>
>> Just keep them as cstrings and use strcmp. We're only keeping the
>> array sorted so that we can binary search it, so we don't need proper
>> locale-dependent collation. Note that we already assume that two
>> strings ('text's) are equal if and only if their binary
>> representations are equal (texteq() uses strcmp).
>
> OK, I got rid of cstring->text calls and memory contexts as I went
> through it. The only tiny ugliness is that there's one function used for
> qsort() and another for bsearch(), because I'm sorting an array of texts
> (from pg_statistic) and I'm binary searching for a lexeme (non-NULL
> terminated string with length).

It would be nice to clean that up a bit. I think you could convert the
lexeme to a TextFreq, or make TextFreq.element a "text *" instead of a
Datum (i.e., detoast it with PG_DETOAST_DATUM while you build the array
for qsort).
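
To illustrate the first option, here is a rough standalone sketch, not
the patch itself. It assumes a hypothetical LexemeFreq struct standing in
for TextFreq, with the element stored as a pointer plus length rather
than a Datum; lexeme_cmp and lookup_freq are made-up names too. Since the
order only has to be consistent for the binary search, it compares
lengths first and then raw bytes, so one comparator can serve both
qsort() and bsearch():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the TextFreq entry: the element is kept as a
 * pointer plus an explicit length, so it need not be NUL-terminated.
 */
typedef struct
{
    const char *data;           /* lexeme bytes */
    size_t      len;            /* number of bytes in data */
    float       freq;           /* frequency taken from the statistics */
} LexemeFreq;

/*
 * Single comparator for both qsort() and bsearch().  Any total order
 * works because the array is sorted only to allow binary search:
 * shorter strings sort first, equal lengths are compared bytewise.
 */
static int
lexeme_cmp(const void *a, const void *b)
{
    const LexemeFreq *x = a;
    const LexemeFreq *y = b;

    if (x->len != y->len)
        return (x->len < y->len) ? -1 : 1;
    return memcmp(x->data, y->data, x->len);
}

/*
 * Wrap a length-delimited lexeme in the same struct type and search the
 * sorted array.  Returns the stored frequency, or -1 if it is absent.
 */
static float
lookup_freq(const LexemeFreq *arr, size_t n, const char *lexeme, size_t len)
{
    LexemeFreq  key = {lexeme, len, 0.0f};
    const LexemeFreq *hit;

    assert(arr != NULL || n == 0);
    hit = bsearch(&key, arr, n, sizeof(LexemeFreq), lexeme_cmp);
    return hit ? hit->freq : -1.0f;
}
```

The array would be sorted once with qsort(arr, n, sizeof(LexemeFreq),
lexeme_cmp) before any lookups. In the real code the data pointer and
length would presumably come from the detoasted text (VARDATA_ANY and
VARSIZE_ANY_EXHDR), which is left out here to keep the sketch free of
PostgreSQL headers.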

> My mediocre gprof skills got me:
>                0.00    0.22       5/5       OidFunctionCall4 [37]
> [38]    28.4   0.00    0.22       5         tssel [38]
>                0.00    0.17       5/5       get_restriction_variable [40]
>                0.03    0.01       5/10      pg_qsort [60]
>                0.00    0.00       5/5       get_attstatsslot [139]
>
> Hopefully that says that the qsort() overhead is small compared to
> munging through the planner Node.

I'd like to see a little bit more testing of that. I can't read gprof
myself, so the above doesn't give me much confidence. I use oprofile,
which I find is much simpler to use.

I think the worst-case scenario is with statistics_target set to its
maximum, with the simplest possible query and the simplest possible tsquery.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
