DB corruption upon restart #598

@Resousse

Description

Inside a Percona pg-db pod, the database container restarted without any apparent problem, but after the restart the database was corrupted. All my tables are encrypted using pg_tde.

Expected Results

A reindex resolves the issue, but since it has occurred 3-5 times in the past week, I suspect it can happen on any restart of the DB.

Actual Results

After the restart, queries fail with the following error:

tenants=# select * from sessions;
ERROR:  invalid page in block 0 of relation "base/16593/17596"
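
To help narrow down which object is damaged, the path in the error can be split into a database OID and a relfilenode and then looked up in the catalogs. The following is a minimal sketch, not a confirmed diagnosis: the helper names `parse_invalid_page` and `lookup_queries` are hypothetical, and passing `0` as the tablespace to `pg_filenode_relation` assumes the relation lives in the database's default tablespace (the `base/` prefix suggests so). The generated queries would need to be run while connected to the affected database.

```python
import re

# Matches the PostgreSQL "invalid page" error for relations stored under base/.
_INVALID_PAGE = re.compile(
    r'invalid page in block (\d+) of relation "base/(\d+)/(\d+)"'
)


def parse_invalid_page(message):
    """Extract block number, database OID and relfilenode, or None if no match."""
    match = _INVALID_PAGE.search(message)
    if match is None:
        return None
    block, db_oid, filenode = (int(group) for group in match.groups())
    return {"block": block, "db_oid": db_oid, "filenode": filenode}


def lookup_queries(info):
    """Build catalog queries that map the OIDs back to names."""
    return [
        # Which database the file belongs to.
        f"SELECT datname FROM pg_database WHERE oid = {info['db_oid']};",
        # Which table or index owns the file (0 = default tablespace, assumed).
        f"SELECT pg_filenode_relation(0, {info['filenode']});",
    ]


info = parse_invalid_page(
    'ERROR:  invalid page in block 0 of relation "base/16593/17596"'
)
for query in lookup_queries(info):
    print(query)
```

If the second query returns an index rather than the `sessions` table itself, that would be consistent with a reindex clearing the error.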

Version

Percona PG 18 installed via Helm charts (both pg-db and pg-operator at 2.8.2). I don't know which version of pg_tde is installed.
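
The installed pg_tde version can usually be read from the standard `pg_extension` catalog. Below is a rough sketch using psycopg2 (the driver that appears in the logs); the function name `fetch_extension_version` and the `dsn` argument are hypothetical, and the query only reports the extension version registered in the database you connect to.

```python
# Standard catalog query; extname is passed as a bound parameter.
EXTENSION_VERSION_SQL = "SELECT extversion FROM pg_extension WHERE extname = %s"


def fetch_extension_version(dsn, extname="pg_tde"):
    """Return the installed version of `extname`, or None if it is not installed."""
    import psycopg2  # imported lazily so this module loads without the driver

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(EXTENSION_VERSION_SQL, (extname,))
            row = cur.fetchone()
            return row[0] if row else None
    finally:
        conn.close()
```

The same query can be run directly in psql by replacing `%s` with `'pg_tde'`.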

Steps to reproduce

Not always reproducible, but it has occurred after a restart of the DB with some writes in it.

Relevant logs

2026-01-26 10:29:59,475 ERROR: Can not fetch local timeline and lsn from replication connection
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/__init__.py", line 1100, in get_replica_timeline
    with self.get_replication_connection_cursor(**self.config.local_replication_address) as cur:
  File "/usr/lib64/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/__init__.py", line 1095, in get_replication_connection_cursor
    with get_connection_cursor(**conn_kwargs) as cur:
  File "/usr/lib64/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/connection.py", line 158, in get_connection_cursor
    conn = psycopg.connect(**kwargs)
  File "/usr/lib/python3.9/site-packages/patroni/psycopg.py", line 136, in connect
    ret = _connect(*args, **kwargs)
  File "/usr/lib64/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: timeout expired
connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL:  the database system is shutting down

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Metadata

Labels

bug (Something isn't working)
