Corruption detection

From PostgreSQL wiki

Revision as of 22:43, 19 May 2012 by Boshomi (Talk | contribs)

Introduction

Detecting storage-layer corruption inside PostgreSQL as early as possible is important. Left undetected, it can not only produce wrong query results but also be copied into a base backup, which affects both backups and replication. That is a major problem because users who rely on their backup schedule are left with a false sense of confidence.

Types of Corruption

In theory, corruption can originate in many parts of the hardware, including memory or even the CPU. This article, however, focuses on corruption that occurs between the time a page leaves the address space of the backend process and the time it is read back in.

Minor Challenges

Utilities and Background Checking

Utilities such as pg_basebackup should be updated to verify pages while creating a backup. Users might also want a continuously running daemon that slowly validates cold data, that is, data that is rarely read and would otherwise go unchecked for long periods.
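As a rough illustration of what such a background check could look like, the sketch below walks a file one page at a time and reports the blocks whose stored checksum no longer matches their contents. Everything about the layout is an assumption made for the example: the CRC32 kept in the last four bytes of each page, and the names stamp_page, page_is_valid and scan_file, are hypothetical and do not describe PostgreSQL's on-disk page format or any existing utility. Only the 8192-byte page size follows PostgreSQL's default block size.

```python
import struct
import time
import zlib

PAGE_SIZE = 8192      # PostgreSQL's default block size
CHECKSUM_BYTES = 4    # assumption: CRC32 kept in the last 4 bytes of a page


def stamp_page(payload: bytes) -> bytes:
    """Build a full page: payload followed by its CRC32 (hypothetical layout)."""
    if len(payload) != PAGE_SIZE - CHECKSUM_BYTES:
        raise ValueError("payload must fill the page body exactly")
    return payload + struct.pack("<I", zlib.crc32(payload))


def page_is_valid(page: bytes) -> bool:
    """Recompute the CRC32 of the page body and compare it with the stored value."""
    if len(page) != PAGE_SIZE:
        return False  # a short read is treated as damage
    body, stored = page[:-CHECKSUM_BYTES], page[-CHECKSUM_BYTES:]
    return struct.unpack("<I", stored)[0] == zlib.crc32(body)


def scan_file(path: str, pause_seconds: float = 0.0) -> list:
    """Return the block numbers of all pages in the file that fail verification."""
    bad_blocks = []
    with open(path, "rb") as f:
        block_no = 0
        while True:
            page = f.read(PAGE_SIZE)
            if not page:
                break  # end of file
            if not page_is_valid(page):
                bad_blocks.append(block_no)
            block_no += 1
            if pause_seconds:
                # Throttle so the scan stays a low-priority background task.
                time.sleep(pause_seconds)
    return bad_blocks
```

A backup utility could call page_is_valid on each page as it copies it, while a daemon could call scan_file with a non-zero pause_seconds to spread the I/O over time. The pause is the simplest possible throttle; a real checker would more likely adapt its rate to system load.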

Detecting Transposed Data

Zero Pages

Major Challenges

Upgrade and On/Off

Torn Pages
