Passing query string to workers

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Passing query string to workers
Date: 2017-01-11 06:12:08
Message-ID: CAOGQiiMH_nOOGkxhbidnwfZ1n5pQayEzbE5iv9rO2oA8GfVj0Q@mail.gmail.com
Lists: pgsql-hackers

Hello everybody,

Currently, the query string is not passed to the workers; only the master
has it. When multiple queries are running on a server and a worker for one
of them crashes, it is quite burdensome to find the query the crashed worker
belonged to, since no query is displayed on a worker crash.

To fix this, I propose a patch in which the query string is passed to the
workers as well, and is therefore displayed when a worker crashes.

Approach:
A token for the query string is created in shared memory and populated with
the query string from the global string debug_query_string. Then, for each
worker, when ExecGetParallelQueryDesc is called, we retrieve the query text
from shared memory and pass it to CreateQueryDesc.
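
As a rough illustration of this flow (not the patch itself), the store and lookup steps can be sketched as a standalone toy in C. Everything prefixed with demo_ or DEMO_ is hypothetical: the segment layout, the key value, and the function names are assumptions made for the example and are not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key; the key name and value used by the patch are not shown in the mail. */
#define DEMO_KEY_QUERY_TEXT UINT64_C(0xE000000000000010)

#define DEMO_TOC_MAX_ENTRIES 8
#define DEMO_SEGMENT_SIZE 1024

/* Toy stand-in for a shared memory segment with a small table of contents. */
typedef struct DemoSegment
{
    struct
    {
        uint64_t key;    /* identifies the stored chunk */
        size_t   offset; /* where the chunk starts in data[] */
    } entries[DEMO_TOC_MAX_ENTRIES];
    int    nentries;
    size_t used;
    char   data[DEMO_SEGMENT_SIZE];
} DemoSegment;

/* Master side: copy the query text into the segment and register it under key.
 * Returns 0 on success, -1 if the segment or its table of contents is full. */
int
demo_store_query(DemoSegment *seg, uint64_t key, const char *query)
{
    size_t len;

    assert(seg != NULL && query != NULL);
    len = strlen(query) + 1;    /* include the terminating NUL */
    if (seg->nentries >= DEMO_TOC_MAX_ENTRIES ||
        len > DEMO_SEGMENT_SIZE - seg->used)
        return -1;

    memcpy(seg->data + seg->used, query, len);
    seg->entries[seg->nentries].key = key;
    seg->entries[seg->nentries].offset = seg->used;
    seg->nentries++;
    seg->used += len;
    return 0;
}

/* Worker side: find the query text by key; returns NULL if it was never stored. */
const char *
demo_lookup_query(const DemoSegment *seg, uint64_t key)
{
    int i;

    assert(seg != NULL);
    for (i = 0; i < seg->nentries; i++)
    {
        if (seg->entries[i].key == key)
            return seg->data + seg->entries[i].offset;
    }
    return NULL;
}
```

In this toy, the master would call demo_store_query once on a zero-initialized segment before launching workers, and each worker would call demo_lookup_query with the same key and hand the result to whatever builds its query descriptor.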

Next, to ensure that the query is displayed at the time of a crash,
BackendStatusArray needs to be populated correctly; specifically, for our
purpose, activity needs to be filled with the current query. For this, I call
pgstat_report_activity in ParallelWorkerMain with the query string. This
also populates the workers' rows in the system view pg_stat_activity.
Previously, pgstat_report_activity was called only for the master, in
exec_simple_query, so for workers the query in pg_stat_activity remained
null.
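
To make the connection between the reported activity and the crash message concrete, here is a minimal standalone sketch in C. It is a simulation under stated assumptions, not PostgreSQL code: DemoBackendStatus, demo_report_activity and demo_crash_detail are hypothetical names, and the fixed-size slot array only loosely mimics a per-process status array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_BACKENDS 4
#define DEMO_ACTIVITY_SIZE 64

/* Toy per-process status slot; pid == 0 marks an unused slot. */
typedef struct DemoBackendStatus
{
    int  pid;
    char activity[DEMO_ACTIVITY_SIZE];
} DemoBackendStatus;

/* Worker side: record the query text in the slot for pid, claiming a free
 * slot if the pid has none yet. Returns 0 on success, -1 if no slot is free.
 * Text longer than the slot is truncated by snprintf. */
int
demo_report_activity(DemoBackendStatus *slots, int pid, const char *query)
{
    int i;
    int target = -1;

    assert(slots != NULL && query != NULL && pid > 0);
    for (i = 0; i < DEMO_MAX_BACKENDS; i++)
    {
        if (slots[i].pid == pid)
        {
            target = i;         /* existing slot for this process wins */
            break;
        }
        if (slots[i].pid == 0 && target < 0)
            target = i;         /* remember the first free slot */
    }
    if (target < 0)
        return -1;

    slots[target].pid = pid;
    snprintf(slots[target].activity, DEMO_ACTIVITY_SIZE, "%s", query);
    return 0;
}

/* Crash-report side: build the detail line for a failed process.
 * Returns 1 if an activity string was found for pid, 0 otherwise
 * (in which case buf is set to the empty string). */
int
demo_crash_detail(const DemoBackendStatus *slots, int pid,
                  char *buf, size_t buflen)
{
    int i;

    assert(slots != NULL && buf != NULL);
    for (i = 0; i < DEMO_MAX_BACKENDS; i++)
    {
        if (slots[i].pid == pid && slots[i].activity[0] != '\0')
        {
            snprintf(buf, buflen,
                     "DETAIL: Failed process was running: %s",
                     slots[i].activity);
            return 1;
        }
    }
    if (buflen > 0)
        buf[0] = '\0';
    return 0;
}
```

In this toy, a worker that never calls demo_report_activity leaves nothing for demo_crash_detail to print, which mirrors the behaviour without the patch; once the worker reports its query, the detail line becomes available.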

Results:
Here is the output for an artificially created worker crash, with and
without the patch.

Error report on worker crash without the patch:
LOG: worker process: parallel worker for PID 49739 (PID 49741) was
terminated by signal 11: Segmentation fault

Error report with the patch:
LOG: worker process: parallel worker for PID 51757 (PID 51758) was
terminated by signal 11: Segmentation fault
2017-01-11 11:10:27.630 IST [51742] DETAIL: Failed process was running:
explain analyse select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '119' day
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus
LIMIT 1;

Inputs of all sorts are encouraged.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment Content-Type Size
pass_queryText_to_workers_v1.patch application/octet-stream 3.2 KB
