Adding unique indexes causes data loss even when using PanicOnWarnings #1636

@ggilder

gh-ost can silently lose data when adding a unique index to a column during a migration, even with the PanicOnWarnings flag enabled. This can occur when:

  1. A migration adds a unique index to a column (e.g., email)
  2. Rows with duplicate values are inserted into the original table after the
    bulk copy phase completes (during postponed cutover)
  3. These duplicate rows are applied via binlog replay to the ghost table

Expected behavior: The migration fails with a clear error.
Actual behavior: The original rows with duplicate values are silently deleted and the data is lost.

Example:

Original table: id PRIMARY KEY, email (no unique constraint)
Ghost table: id PRIMARY KEY, email UNIQUE (being added)

Initial state (after bulk copy):

During postponed cutover:

Binlog replay attempts:

  • INSERT (id=3, email='bob@example.com') into ghost table
  • Duplicate email='bob@example.com' (conflicts with id=1)
  • Row with id=1 silently deleted → DATA LOSS

This is not caught by PanicOnWarnings because gh-ost uses REPLACE for binlog replay. When the incoming row collides with an existing row on a unique key, REPLACE deletes the existing row and inserts the new one, and it does so without producing a warning or an error.
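The difference between REPLACE and a plain INSERT can be reproduced in a minimal sketch. It assumes SQLite (via Python's built-in `sqlite3`) as a stand-in for MySQL, since both delete the conflicting row when REPLACE hits a unique key. The table name `ghost` and the helper `replay` are hypothetical and are not gh-ost code. Only the two rows named in the example above (id=1 and id=3) are used.

```python
import sqlite3


def replay(verb):
    """Apply the example's binlog INSERT to a ghost-like table using `verb`.

    `verb` is either "REPLACE" or "INSERT". SQLite stands in for MySQL here;
    this is an illustration of the statement semantics, not gh-ost's code.
    """
    conn = sqlite3.connect(":memory:")
    # Ghost table: id PRIMARY KEY, email UNIQUE (the index being added).
    conn.execute("CREATE TABLE ghost (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    # State after bulk copy: the row with id=1 already holds the email.
    conn.execute("INSERT INTO ghost (id, email) VALUES (1, 'bob@example.com')")
    try:
        # Row inserted into the original table during postponed cutover.
        conn.execute(f"{verb} INTO ghost (id, email) VALUES (3, 'bob@example.com')")
        outcome = "applied"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    rows = conn.execute("SELECT id, email FROM ghost ORDER BY id").fetchall()
    conn.close()
    return outcome, rows


replace_result = replay("REPLACE")
insert_result = replay("INSERT")
print(replace_result)  # ('applied', [(3, 'bob@example.com')])
print(insert_result)   # ('rejected', [(1, 'bob@example.com')])
```

With REPLACE the statement succeeds and the row with id=1 is gone. With a plain INSERT the duplicate is rejected with an error and id=1 survives, which is closer to the expected behavior described above.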
