| From: | 第108次明天 <87326549(at)qq(dot)com> |
|---|---|
| To: | pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
| Subject: | [BUG] PostgreSQL 14.5: Data corruption in table tb_workstation_message (possible fstrim-related on eMMC) |
| Date: | 2025-12-18 06:42:56 |
| Message-ID: | tencent_BA7C5BFF421FD1965E088C9E12CDD7EB6A06@qq.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
To: pgsql-bugs(at)postgresql(dot)org
Subject: [BUG] PostgreSQL 14.5: Data corruption in table tb_workstation_message (possible fstrim-related on eMMC)
Environment Information
PostgreSQL Version: 14.5
OS: Debian GNU/Linux 11 (bullseye)
Architecture: aarch64 (8-core CPU)
Memory: 8GB
Storage Medium: eMMC
Problem Description
Data corruption occurred in the tb_workstation_message table (record guid: ce04bd3f-232d-4c7f-9a91-540d0e581649). Queries targeting this record fail with:
plaintext
ERROR: invalid memory alloc request size 18446744073709551613
Corruption details of the record:
Only the guid field remains valid;
All int-type fields (e.g., upload_status) are set to 0;
All timestamp-type fields (e.g., create_time) are reset to 2000-01-01 00:00:00;
String-type fields (e.g., operator, content) are inaccessible (queryable only after forcing UPDATE to null).
Timeline
Record insertion: 2025-11-25 21:23:51
Last modification: 2025-11-27 09:41:10
Anomaly window: 2025-12-01 03:00:00 ~ 09:02:31 (record was intact before 03:00:00)
First error detected: 2025-12-01 09:02:31 (triggered by a 30s-interval business cron job querying upload_status = 0)
Error Output (Query Example)
sql
postgres=# select * from tb_workstation_message where guid = 'ce04bd3f-232d-4c7f-9a91-540d0e581649'; ERROR: invalid memory alloc request size 18446744073709551613
Investigation Details
Database Logs: No output between 2025-12-01 05:37:05.932 ~ 09:02:31.533 (no abnormalities found).
Dynamic Library Check: libpq.so had environment variable misconfiguration (not pointing to PG install dir), but the business app does NOT depend on this library — low corruption probability.
eMMC Storage Analysis:
Used dd to extract the corrupted block; hex analysis shows only guid bytes are valid (others are 0, confirming physical data corruption).
fstrim Check:
fstrim executed at 2025-12-01 00:27 (asynchronous, runs on disk idle); anomaly occurred after this.
Reproduction attempt: Simulated data insertion + 10+ fstrim runs — corruption not reproduced.
Impact
Business failure: Query errors trigger business alerts.
Backup failure: Database backups are affected by corrupted data.
Additional Notes
No system/app/PG restarts during the anomaly window.
No writes to tb_workstation_message after 2025-12-01 05:34:00.
No online operations/terminals between 05:34:00 ~ 09:02:31.
Request:
We suspect this may relate to fstrim on eMMC (with PG 14.5) but cannot reproduce it. Could the community help analyze the root cause, or guide further troubleshooting?
第108次明天
87326549(at)qq(dot)com
从QQ邮箱发来的超大附件
postgresql_2025-12-01_000000(1).csv (739.5KB, 2026年1月17日 14:42) 进入下载页面 :https://wx.mail.qq.com/ftn/download?func=3&k=c6cd4a393619073df4be1b3961663338e7bf3e3963663338111d585f4c530b5e0e05060d534b57090f54180d025650155805500f4e5f030f0153575d5a5302590b1135490c15475f4b5746480f3901080b071808514b03096602050953560310081b1b5a1010272c6b8d6cf59bcb17826e22a5c9c7abe665fb3a453863663338393235&key=c6cd4a393619073df4be1b3961663338e7bf3e3963663338111d585f4c530b5e0e05060d534b57090f54180d025650155805500f4e5f030f0153575d5a5302590b1135490c15475f4b5746480f3901080b071808514b03096602050953560310081b1b5a1010272c6b8d6cf59bcb17826e22a5c9c7abe665fb3a453863663338393235&code=9259cf38&from=
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Laurenz Albe | 2025-12-18 08:40:56 | Re: [BUG] PostgreSQL 14.5: Data corruption in table tb_workstation_message (possible fstrim-related on eMMC) |
| Previous Message | Tom Lane | 2025-12-17 21:37:24 | Re: BUG #19358: Short circuit optimization exists in generic plan but missed in custom plan |