Re: [sqlsmith] Missing CHECK_FOR_INTERRUPTS in tsquery_rewrite

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [sqlsmith] Missing CHECK_FOR_INTERRUPTS in tsquery_rewrite
Date: 2016-10-30 17:19:26
Message-ID: 18952.1477847966@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andreas Seltenreich <seltenreich(at)gmx(dot)de> writes:
> testing with sqlsmith yielded an uncancellable backend hogging CPU time.
> Gdb showed it was busy in findeq() of tsquery_rewrite.c. This function
> appears to have exponential complexity wrt. the size of the involved
> tsqueries. The following query runs for 12s on my machine with no way
> to cancel it and incrementing the length of the first argument by 1
> doubles this time.

> select ts_rewrite(
> (select string_agg(i::text, '&')::tsquery from generate_series(1,32) g(i)),
> (select string_agg(i::text, '&')::tsquery from generate_series(1,19) g(i)),
> 'foo');

> The attached patch adds a CHECK_FOR_INTERRUPTS to make it cancellable.

A CHECK_FOR_INTERRUPTS is probably a good idea, but man is this code
stupid. It seems to be checking for subset inclusion by forming every
possible subset of the test node and then checking for exact equality
to the target set. Seems like we should be able to do better.

Also, I think this is outright *wrong* for phrase search --- dropping some
of the child nodes without any other adjustment isn't valid is it?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-10-30 17:40:29 Re: [sqlsmith] Missing CHECK_FOR_INTERRUPTS in tsquery_rewrite
Previous Message Kevin Grittner 2016-10-30 15:35:21 Re: delta relations in AFTER triggers