Disaster Recovery for Multi-Datacenter Apache Kafka® Deployments

Introduction

Datacenter downtime and data loss can cost businesses substantial revenue or halt their operations entirely. To minimize the downtime and data loss resulting from a disaster, enterprises can create business continuity plans and disaster recovery strategies.

A disaster recovery plan often requires multi-datacenter Apache Kafka® deployments in which datacenters are geographically dispersed. If disaster strikes—catastrophic hardware failure, software failure, power outage, denial-of-service attack, or any other event that causes one datacenter to fail completely—Kafka continues running in another datacenter until service is restored. A multi-datacenter solution with a disaster recovery plan ensures that your event streaming applications continue to run even if one datacenter fails.
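
To make the failover idea concrete, the listing below is a minimal sketch of a Java producer that can be pointed at either datacenter. The cluster addresses, topic name, and the command-line failover flag are hypothetical placeholders introduced here for illustration; in practice, clients are usually redirected to the surviving datacenter through service discovery or updated configuration rather than an argument, and later sections of this paper discuss the additional concerns (schema management, offset translation, and loop prevention) that a complete failover involves.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class FailoverAwareProducer {

        // Hypothetical bootstrap addresses for the primary and secondary datacenters.
        private static final String PRIMARY_DC   = "kafka-dc1.example.com:9092";
        private static final String SECONDARY_DC = "kafka-dc2.example.com:9092";

        public static void main(String[] args) {
            // During a disaster, clients are redirected (here, via a simple flag)
            // to the bootstrap servers of the surviving datacenter.
            boolean primaryAvailable = args.length == 0 || !"failover".equals(args[0]);
            String bootstrapServers = primaryAvailable ? PRIMARY_DC : SECONDARY_DC;

            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Durability settings commonly used when topics are replicated across datacenters.
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // "orders" is a placeholder topic assumed to exist in both datacenters.
                producer.send(new ProducerRecord<>("orders", "order-1", "{\"amount\": 42}"));
                producer.flush();
            }
        }
    }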

This white paper provides a general overview of a multi-datacenter solution based on the capabilities of Confluent Platform, the leading distribution of Apache Kafka®. Confluent Platform provides the building blocks for:

  • Multi-datacenter designs
  • Centralized schema management
  • Prevention of cyclic repetition of messages
  • Automatic consumer offset translation