| From: | Bryan Green <dbryan(dot)green(at)gmail(dot)com> |
|---|---|
| To: | Euler Taveira <euler(at)eulerto(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
| Cc: | Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Add Windows support for backtrace_functions (MSVC only) |
| Date: | 2025-11-01 04:40:58 |
| Message-ID: | 31f43d88-9949-48c5-9497-b9a9b7cdee2b@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 10/31/2025 2:07 PM, Bryan Green wrote:
> On 10/31/2025 1:46 PM, Euler Taveira wrote:
>> On Thu, Oct 30, 2025, at 11:51 AM, Bryan Green wrote:
>>> I had reservations about the value the tests were adding, and
>>> considering I am getting more concern around having the tests than not
>>> having them for this initial release I have decided to remove them. v4
>>> patch is attached. It is the same as the initial 0001-* patch.
>>>
>>
>> I spent some time on this patch. Here are some comments and suggestions.
>>
>
> Thanks for the review.
>
>> +#ifdef _MSC_VER
>> +#include <windows.h>
>> +#include <dbghelp.h>
>> +static bool win32_backtrace_symbols_initialized = false;
>> +static HANDLE win32_backtrace_process = NULL;
>> +#endif
>>
>> We usually a different style. Headers go to the top on the same
>> section as
>> system headers and below postgres.h. It is generally kept sorted. The
>> same
>
> Will fix. I will rework the style (error and otherwise) to follow the
> project tradition.
>
>> applies to variables. Add them near the other static variables. BTW
>> does it need
>> the win32_ prefix for a Window-only variable?
>>
>> + wchar_t buffer[sizeof(SYMBOL_INFOW) + MAX_SYM_NAME *
>> sizeof(wchar_t)];
>> + PSYMBOL_INFOW symbol;
>>
>> According to [1], SYMBOL_INFO is an alias that automatically selects
>> ANSI vs
>> UNICODE. Shouldn't we use it instead of SYMBOL_INFOW?
>>
>
> Good point. I was being overly explicit about wanting wide chars, but
> you're right that the generic versions are the way to go.
>
>> + elog(WARNING, "SymInitialize failed with error %lu",
>> error);
>>
>> Is there a reason to continue if SymInitialize failed? It should
>> return after
>
> None at all. Will return immediately.
>
>> printing the message. Per the error message style guide [2], my
>> suggestion is
>> "could not initialize the symbol handler: error code %lu". You can
>> also use
>> GetLastError() directly in the elog call.
>>
>> + symbol = (PSYMBOL_INFOW) buffer;
>> + symbol ->MaxNameLen = MAX_SYM_NAME;
>> + symbol ->SizeOfStruct = sizeof(SYMBOL_INFOW);
>>
>> We generally don't add spaces between variable and a member.
>>
>> + DWORD i;
>>
>> I'm curious why did you declare this variable as DWORD? Shouldn't int be
>> sufficient? The CaptureStackBackTrace function returns an unsigned short
>> (UShort). You can also declare it in the for loop.
>>
>
> Out of habit. I will change it to int.
>
>> + DWORD64 address = (DWORD64) (stack[i]);
>>
>> The parenthesis around stack is superfluous. The code usually doesn't
>> contain
>> additional parenthesis (unless it improves readability).
>>
>
> I will remove the parenthesis.
>
>> + if (frames == 0)
>> + {
>> + appendStringInfoString(&errtrace, "\nNo stack frames
>> captured");
>> + edata->backtrace = errtrace.data;
>> + return;
>> + }
>>
>> It seems CaptureStackBackTrace function may return zero frames under
>> certain
>> conditions. It is a good point having this additional message. I
>> noticed that
>> the current code path (HAVE_BACKTRACE_SYMBOLS) doesn't have this
>> block. IIUC,
>> in certain circumstances (ARM vs unwind-tables flag), the backtrace()
>> also
>> returns zero frames. Should we add this block for the backtrace() code
>> path?
>>
>
> Probably, though that seems like separate cleanup. Want me to include it
> here or handle separately (in another patch)?
>
>> + sym_result = SymFromAddrW(win32_backtrace_process,
>> + address,
>> + &displacement,
>> + symbol);
>>
>> You should use SymFromAddr, no? [3] I saw that you used the Unicode
>> functions
>> instead of the generic functions [4].
>>
>> + /* Convert symbol name to UTF-8 */
>> + utf8_len = WideCharToMultiByte(CP_UTF8, 0,
>> symbol->Name, -1,
>> + NULL, 0, NULL, NULL);
>> + if (utf8_len > 0)
>> + {
>> + char *filename_utf8;
>> + int filename_len;
>> +
>> + utf8_buffer = palloc(utf8_len);
>> + WideCharToMultiByte(CP_UTF8, 0, symbol->Name,
>> -1,
>> + utf8_buffer, utf8_len,
>> NULL, NULL);
>>
>> + /* Convert symbol name to UTF-8 */
>> + utf8_len = WideCharToMultiByte(CP_UTF8, 0,
>> symbol->Name, -1,
>> + NULL, 0, NULL, NULL);
>>
>> You are calling WideCharToMultiByte twice. The reason is to allocate
>> the exact
>> memory size. However, you can adopt another logic to avoid the first
>> call.
>>
>> maxlen = symbol->NameLen * pg_database_encoding_max_length();
>> symbol_name = palloc(maxlen + 1);
>>
>> (I suggest symbol_name instead of ut8_buffer.)
>>
>> You are considering only the UTF-8 case. Shouldn't it use wcstombs or
>> wcstombs_l? Maybe (re)use wchar2char -- see pg_locale_libc.c.
>>
>
> Hmm, you're probably right. I was thinking these Windows API strings
> needed special handling, but wchar2char should handle the conversion to
> database encoding correctly. Let me test that approach.
>
>> + if (filename_len > 0)
>> + {
>> + filename_utf8 = palloc(filename_len);
>> + WideCharToMultiByte(CP_UTF8, 0,
>> line.FileName, -1,
>> + filename_utf8,
>> filename_len,
>> + NULL, NULL);
>> +
>> + appendStringInfo(&errtrace,
>> + "\n%s+0x%llx [%s:%lu]",
>> + utf8_buffer,
>> + (unsigned long long)
>> displacement,
>> + filename_utf8,
>> + (unsigned long)
>> line.LineNumber);
>> +
>> + pfree(filename_utf8);
>> + }
>> + else
>> + {
>> + appendStringInfo(&errtrace,
>> + "\n%s+0x%llx [0x%llx]",
>> + utf8_buffer,
>> + (unsigned long long)
>> displacement,
>> + (unsigned long long)
>> address);
>> + }
>>
>> Maybe I missed something but is there a reason for not adding the
>> address in the
>> first condition?
>>
>
> No particular reason. I was trying to keep it concise when we have file/
> line info, but for consistency it probably should be there.
>
> Will send v5 with these fixes.
>
>>
>> [1] https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/ns-
>> dbghelp-symbol_infow
>> [2] https://www.postgresql.org/docs/current/error-style-guide.html
>> [3] https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-
>> dbghelp-symfromaddrw
>> [4] https://learn.microsoft.com/en-us/windows/win32/intl/conventions-
>> for-function-prototypes
>>
>>
>
> Thanks again for the review,
> Bryan
Hi all,
v5 patch attached, incorporating all of Euler's feedback with a caveat
around unicode.
The most interesting aspect turned out to be the encoding conversion for
symbol names and file paths. Initially I tried using the generic
SYMBOL_INFO and SymFromAddr functions as Euler suggested, but ran into a
subtle issue: on PostgreSQL's Windows builds, these become SYMBOL_INFOA
and SymFromAddrA (the ANSI versions), which return strings in whatever
Windows ANSI codepage happens to be active (CP1252, etc). This doesn't
necessarily match the database encoding.
When I tried converting these with wchar2char(), it failed because the
input wasn't actually wide characters - leading to backtraces showing
only raw addresses even though symbols were present.
The solution was to use the explicit Unicode versions (SYMBOL_INFOW and
SymFromAddrW), which reliably return UTF-16 strings that wchar2char()
can properly convert to the database encoding. This handles both UTF-8
and non-UTF-8 databases correctly, and wchar2char() gracefully returns
-1 on conversion failure rather than throwing errors during error
handling. Of course this also necessitated using IMAGEHLP_LINEW64 and
SymGetLineFromAddrW64.
Tested with both UTF-8 and WIN1252 databases - backtraces now show
proper symbol names in both cases.
This patch also adds a check for zero frames returned by backtrace() on
Unix/Linux platforms, which can occur in certain circumstances such as
ARM builds without unwind tables.
Bryan
| Attachment | Content-Type | Size |
|---|---|---|
| v5-0001-Add-Windows-support-for-backtrace_functions.patch | text/plain | 7.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | vignesh C | 2025-11-01 05:15:41 | Re: Logical Replication of sequences |
| Previous Message | David Rowley | 2025-11-01 03:29:30 | Re: another autovacuum scheduling thread |