UTF-8 on Postgres wire protocol

From: Rui Pacheco <rui(dot)pacheco(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: UTF-8 on Postgres wire protocol
Date: 2016-12-21 23:25:40
Message-ID: E0BE9CB9-A0B9-4373-931E-20E5E1F98BED@gmail.com
Lists: pgsql-general

I’m toying around with the wire protocol and came across something I don’t understand.

I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the list of columns, and while it’s fairly easy to identify the column with the name “id”, I’m not sure how to identify the other column:

So this would be the ID column:

[…]
[7] = 0x69
[8] = 0x64
[9] = 0x00
[10] = 0x00
[11] = 0x00
[12] = 0x4f
[13] = 0x08
[14] = 0x00
[15] = 0x01
[16] = 0x00
[17] = 0x00
[18] = 0x00
[19] = 0x17
[20] = 0x00
[21] = 0x04
[22] = 0xff
[23] = 0xff
[24] = 0xff
[25] = 0xff
[26] = 0x00
[27] = 0x00
[…]

And this is señor:
[47] = 0x01
[48] = 0x03
[49] = 0x00
[50] = 0x00
[51] = 0x73
[52] = 0x65
[53] = 0xc3
[54] = 0xb1
[55] = 0x6f
[56] = 0x72
[57] = 0x00
[58] = 0x00
[59] = 0x00
[60] = 0x4f
[61] = 0x08
[62] = 0x00
[63] = 0x08
[64] = 0x00
[65] = 0x00
[66] = 0x04
[67] = 0x13
[68] = 0xff
[69] = 0xff
[70] = 0x00
[71] = 0x00
[…]

What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the column name begins and ends?
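
For reference, here is a minimal sketch of how one field of a RowDescription message could be parsed. It assumes the layout given in the PostgreSQL protocol documentation: each field is a null-terminated name followed by 18 fixed bytes (table OID, column number, type OID, type size, type modifier, format code). Under that layout there is no length prefix on the name, so the bytes just before a name would be the tail of the preceding field, and the name ends at the first 0x00. The helper names (read_cstring, parse_field) are hypothetical, and the sketch assumes the client encoding is UTF-8.

```python
import struct

# Fixed-size tail of every field description, in network byte order:
# table OID (Int32), column number (Int16), type OID (Int32),
# type size (Int16), type modifier (Int32), format code (Int16).
FIELD_TAIL = struct.Struct("!IhIhih")


def read_cstring(buf, offset, encoding="utf-8"):
    """Read a null-terminated string; return (text, offset after the 0x00)."""
    end = buf.index(b"\x00", offset)
    # Assumption: names arrive in the client encoding, taken here to be UTF-8.
    return buf[offset:end].decode(encoding), end + 1


def parse_field(buf, offset=0):
    """Parse one field description; return (field dict, offset of next field)."""
    name, offset = read_cstring(buf, offset)
    table_oid, attnum, type_oid, typlen, typmod, fmt = FIELD_TAIL.unpack_from(buf, offset)
    field = {
        "name": name,
        "table_oid": table_oid,
        "attnum": attnum,
        "type_oid": type_oid,
        "typlen": typlen,
        "typmod": typmod,
        "format": fmt,
    }
    return field, offset + FIELD_TAIL.size


# Bytes [7]..[27] from the dump above (the "id" column).
ID_FIELD = bytes([
    0x69, 0x64, 0x00,          # "id" plus terminator
    0x00, 0x00, 0x4F, 0x08,    # table OID
    0x00, 0x01,                # column number
    0x00, 0x00, 0x00, 0x17,    # type OID
    0x00, 0x04,                # type size
    0xFF, 0xFF, 0xFF, 0xFF,    # type modifier
    0x00, 0x00,                # format code
])

# Bytes [51]..[57] from the dump above (the name of the second column).
SENOR_NAME = bytes([0x73, 0x65, 0xC3, 0xB1, 0x6F, 0x72, 0x00])

if __name__ == "__main__":
    print(parse_field(ID_FIELD))
    print(read_cstring(SENOR_NAME, 0))
```

Read this way, 0xc3 0xb1 is simply the two-byte UTF-8 encoding of “ñ”, which is why the name occupies six bytes before its terminator.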
