Business continuity and disaster recovery
Backbuild maintains a Business Continuity Plan (BCP) and Disaster Recovery (DR) program designed to keep the service available to customers and to recover operations quickly in the event of a significant disruption. The program covers technology, personnel, and communications.
Business Continuity Plan
- Maintained and reviewed annually: the BCP is owned by the security team, reviewed at least annually, and updated whenever a material change to the platform or organization warrants it.
- Scope: the plan covers disruptions to the platform, the office environment, key personnel, and critical vendors.
- Scenarios: documented scenarios include regional cloud outages, database incidents, loss of a key sub-processor, and extended connectivity issues.
Disaster Recovery procedures
- Runbooks: DR procedures are documented in internal runbooks that include recovery steps, decision points, and the roles responsible for each step.
- Recovery objectives: the target Recovery Time Objective (RTO) is 4 hours and the target Recovery Point Objective (RPO) is 1 hour. See availability.
- Failover: the managed database tier supports failover to replicas. Failover procedures are documented and rehearsed.
- Rebuild from backup: where failover is not sufficient, the platform can be rebuilt from encrypted off-site backups using documented procedures.
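As an illustration of how the recovery objectives can be read, the following minimal sketch compares one recovery (real or rehearsed) against the 4-hour RTO and 1-hour RPO targets. The function name, field names, and timestamps are hypothetical and are not taken from Backbuild's runbooks.

```python
from datetime import datetime, timedelta

# Targets taken from the recovery objectives stated above.
TARGET_RTO = timedelta(hours=4)
TARGET_RPO = timedelta(hours=1)


def evaluate_recovery(disruption_start: datetime,
                      service_restored: datetime,
                      last_recoverable_point: datetime) -> dict:
    """Compare a single recovery against the RTO and RPO targets.

    All three timestamps are assumed inputs recorded by whoever runs
    the recovery; this sketch does not discover them itself.
    """
    # RTO concerns how long the service was unavailable.
    downtime = service_restored - disruption_start
    # RPO concerns how much recent data could not be recovered.
    data_loss_window = disruption_start - last_recoverable_point
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= TARGET_RTO,
        "rpo_met": data_loss_window <= TARGET_RPO,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0)
    result = evaluate_recovery(
        disruption_start=start,
        service_restored=start + timedelta(hours=3),
        last_recoverable_point=start - timedelta(minutes=40),
    )
    print(result["rto_met"], result["rpo_met"])
```

Keeping the two measurements separate reflects that a recovery can meet the time objective while missing the data objective, or the reverse.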
Backup strategy
- Daily full backups: the primary database is backed up daily.
- Hourly incremental backups: write-ahead log shipping produces hourly incremental backups that support point-in-time recovery.
- Off-site: backups are stored in object storage that is geographically separated from the primary database.
- Encryption: backups are encrypted at rest using the same standard as primary storage.
- Access: access to backup storage is restricted to authorized personnel and is audit logged.
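The backup schedule above lends itself to a simple freshness check. The sketch below flags a full backup older than one day or an incremental backup older than one hour. The function name and the 15-minute grace period are assumptions made for illustration; the timestamps would come from whatever backup tooling is actually in use.

```python
from datetime import datetime, timedelta

# Intervals taken from the backup strategy described above.
FULL_BACKUP_INTERVAL = timedelta(days=1)
INCREMENTAL_INTERVAL = timedelta(hours=1)


def backup_freshness(now: datetime,
                     last_full: datetime,
                     last_incremental: datetime,
                     grace: timedelta = timedelta(minutes=15)) -> dict:
    """Report whether the most recent backups are within schedule.

    `grace` is a hypothetical allowance for backup jobs that finish
    slightly late; it is not a figure from the source document.
    """
    full_age = now - last_full
    incremental_age = now - last_incremental
    return {
        "full_age": full_age,
        "incremental_age": incremental_age,
        "full_stale": full_age > FULL_BACKUP_INTERVAL + grace,
        "incremental_stale": incremental_age > INCREMENTAL_INTERVAL + grace,
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 9, 0)
    report = backup_freshness(
        now=now,
        last_full=datetime(2024, 1, 2, 2, 0),
        last_incremental=datetime(2024, 1, 2, 8, 30),
    )
    print(report["full_stale"], report["incremental_stale"])
```

A stale incremental backup matters most for the RPO target, since the age of the newest recoverable point bounds how much data could be lost.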
Restore testing
Restore testing is conducted quarterly. Each test verifies that a recent backup can be restored to a clean environment, that the restored database passes integrity checks, and that application services can read and write against the restored data. Results, including any deviations from target recovery times, are documented in an internal runbook and feed back into program improvements.
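One way to picture the record kept for each restore test is sketched below: three pass/fail checks matching the verifications listed above, plus the measured restore duration compared against the 4-hour RTO target. The class and field names are hypothetical and do not describe Backbuild's internal format.

```python
from dataclasses import dataclass
from datetime import timedelta

# RTO target from the recovery objectives; used to measure deviations.
TARGET_RTO = timedelta(hours=4)


@dataclass
class RestoreTestResult:
    """Outcome of a single quarterly restore test (illustrative only)."""
    restored_to_clean_env: bool
    integrity_checks_passed: bool
    app_read_write_ok: bool
    restore_duration: timedelta

    @property
    def passed(self) -> bool:
        # The test passes only if all three verifications succeed.
        return (self.restored_to_clean_env
                and self.integrity_checks_passed
                and self.app_read_write_ok)

    @property
    def rto_deviation(self) -> timedelta:
        # Time by which the restore exceeded the target; zero if within it.
        return max(self.restore_duration - TARGET_RTO, timedelta(0))


if __name__ == "__main__":
    result = RestoreTestResult(
        restored_to_clean_env=True,
        integrity_checks_passed=True,
        app_read_write_ok=True,
        restore_duration=timedelta(hours=4, minutes=30),
    )
    print(result.passed, result.rto_deviation)
```

Recording the deviation separately from the pass/fail outcome keeps a functionally successful but slow restore visible as an item for program improvement.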
Personnel redundancy
- On-call rotation: incident response is staffed by an on-call rotation with a minimum of two trained responders available at any time.
- Cross-training: critical operational knowledge is documented and multiple engineers are trained on each key procedure.
- Succession: key operational roles have documented deputies so that no single point of failure exists in the staffing model.
Customer communication during incidents
- Status page: when available, incident status will be published at status.backbuild.ai.
- Email notifications: customer-designated contacts are notified by email within one hour of incident declaration for incidents that materially affect service.
- Incident updates: updates are provided at regular intervals during an incident until resolution.
- Post-incident review: a written post-incident summary is provided to affected customers for significant incidents, including root cause, impact, timeline, and remediation.
Contact
Business continuity or DR questions: security@backbuild.ai