Redis Backup
Redis is used for caching, session storage, and rate limiting in the MATIH platform. While Redis data is generally ephemeral and can be rebuilt, session data and rate limit state benefit from periodic backups to minimize disruption during recovery.
Backup Strategy
| Method | Frequency | RPO | Use Case |
|---|---|---|---|
| RDB Snapshots | Every 6 hours | 6 hours | Full point-in-time snapshot |
| AOF Persistence | Continuous | Seconds | Append-only file for durability |
RDB Snapshots
Redis saves RDB snapshots to disk based on configured save points:
# Save if at least 1 key changed in 3600 seconds
save 3600 1
# Save if at least 100 keys changed in 300 seconds
save 300 100
# Save if at least 10000 keys changed in 60 seconds
save 60 10000
Backup to Object Storage
RDB files are periodically copied to object storage by the backup automation.
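The backup automation itself is not described here, so the following is only a minimal sketch of what such a copy step could look like. It assumes a local directory (`dest_root`) standing in for the object storage bucket, and the `redis/<YYYY>/<MM>/<DD>/dump-<timestamp>.rdb` layout and the `backup_rdb` name are hypothetical, not the platform's actual conventions.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def backup_rdb(rdb_path: Path, dest_root: Path, now: Optional[datetime] = None) -> Path:
    """Copy an RDB file to a timestamped path and write a SHA-256 sidecar file."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Hypothetical layout: redis/<YYYY>/<MM>/<DD>/dump-<timestamp>.rdb
    dest = dest_root / "redis" / now.strftime("%Y/%m/%d") / f"dump-{stamp}.rdb"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(rdb_path, dest)
    # Record a checksum next to the copy so a later restore can verify it.
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    (dest.parent / (dest.name + ".sha256")).write_text(f"{digest}  {dest.name}\n")
    return dest
```

A real implementation would replace the local copy with an upload through the object store's client library. The timestamped name and checksum sidecar are the parts worth keeping, since they make it possible to pick a specific snapshot and confirm it was not truncated in transit.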
AOF Persistence
For deployments requiring lower RPO, AOF (Append Only File) persistence can be enabled:
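appendonly yes
appendfsync everysec
With appendfsync everysec, Redis syncs the AOF to disk once per second, so a crash loses at most about one second of writes. The cost is higher disk I/O than RDB snapshots alone.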
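To check which persistence settings a given configuration actually enables, a small parser can help. The sketch below is illustrative only: `parse_persistence` is a hypothetical helper, and it handles just the `save`, `appendonly`, and `appendfsync` directives discussed in this section.

```python
import shlex


def parse_persistence(conf_text: str) -> dict:
    """Extract save points and AOF settings from redis.conf-style text."""
    settings = {"save_points": [], "appendonly": False, "appendfsync": None}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (those starting with '#').
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)
        directive, args = parts[0].lower(), parts[1:]
        if directive == "save":
            if args == [""]:
                # 'save ""' disables snapshotting, so drop any save points seen so far.
                settings["save_points"] = []
            else:
                # Arguments come in <seconds> <changes> pairs.
                pairs = zip(args[0::2], args[1::2])
                settings["save_points"] += [(int(s), int(c)) for s, c in pairs]
        elif directive == "appendonly":
            settings["appendonly"] = args[0].lower() == "yes"
        elif directive == "appendfsync":
            settings["appendfsync"] = args[0].lower()
    return settings
```

Running it over a deployment's configuration gives a quick answer to whether the RPO in the Backup Strategy table is achievable: an RPO of seconds requires `appendonly` to be true and `appendfsync` to be `everysec` or `always`.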
Restore Procedures
From RDB Snapshot
- Stop the Redis instance
- Replace the dump.rdb file with the backup
- Start Redis; it loads the RDB automatically on startup. If AOF is enabled, Redis loads the AOF instead of the RDB.
- Verify key counts and application connectivity
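The file replacement step above can be sketched as follows. This is an assumption-laden illustration rather than the platform's restore tooling: `stage_rdb_restore` and `is_rdb_file` are hypothetical helpers, `data_dir` is whatever directory the instance's `dir` setting points to, and the only validation performed is the RDB magic header (`REDIS` followed by a four-digit version).

```python
import shutil
from pathlib import Path


def is_rdb_file(path: Path) -> bool:
    """Return True if the file starts with the RDB magic: b'REDIS' plus 4 digits."""
    with path.open("rb") as fh:
        header = fh.read(9)
    return len(header) == 9 and header[:5] == b"REDIS" and header[5:].isdigit()


def stage_rdb_restore(backup: Path, data_dir: Path, dbfilename: str = "dump.rdb") -> Path:
    """Validate a backup and copy it into the Redis data directory.

    Call this only while Redis is stopped; a running instance may overwrite
    the restored file on its next save or on shutdown.
    """
    if not is_rdb_file(backup):
        raise ValueError(f"{backup} does not look like an RDB file")
    target = data_dir / dbfilename
    if target.exists():
        # Keep the file being replaced in case the restore must be rolled back.
        shutil.move(str(target), str(target.with_name(target.name + ".pre-restore")))
    shutil.copy2(backup, target)
    return target
```

The header check catches the most common mistake, which is restoring a truncated or unrelated file. It does not validate the file's contents, so the final verification step is still needed.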
From AOF
- Stop the Redis instance
- Replace the appendonly.aof file with the backup
- Start Redis with AOF enabled so that it replays the file on startup
- Verify data integrity
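One simple check after either restore path is to compare per-database key counts against values recorded at backup time. The sketch below parses the text returned by `INFO keyspace` (for example, captured from `redis-cli INFO keyspace`). The helper names are hypothetical, and because most keys carry a TTL, some drift between backup and restore is expected.

```python
def parse_keyspace(info_text: str) -> dict:
    """Parse 'INFO keyspace' output into {db_name: {field: int}}."""
    result = {}
    for raw in info_text.splitlines():
        line = raw.strip()
        # Data lines look like: db0:keys=120,expires=100,avg_ttl=5000
        if not line.startswith("db") or ":" not in line:
            continue
        db, _, fields = line.partition(":")
        pairs = (field.split("=", 1) for field in fields.split(","))
        result[db] = {name: int(value) for name, value in pairs}
    return result


def key_count_drift(expected: dict, actual: dict) -> dict:
    """Return {db: (expected_keys, actual_keys)} for databases whose counts differ."""
    drift = {}
    for db in sorted(set(expected) | set(actual)):
        want = expected.get(db, {}).get("keys", 0)
        got = actual.get(db, {}).get("keys", 0)
        if want != got:
            drift[db] = (want, got)
    return drift
```

An empty result from `key_count_drift` means the counts match exactly. A small difference is usually explained by expired keys, while a database missing entirely from the restored instance points to a failed or partial load.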
What Is Stored in Redis
| Data | TTL | Impact of Loss |
|---|---|---|
| Session tokens | 24 hours | Users must re-authenticate |
| API response cache | 5-60 minutes | Temporary performance degradation |
| Rate limit counters | 1-60 minutes | Rate limits temporarily reset |
| Feature flag cache | 60 seconds | Brief re-computation of flag values |
| Permission cache | 300 seconds | Brief re-evaluation of permissions |
Recovery Priority
Redis data loss is generally low-impact since all data has a TTL and can be regenerated. The priority is to restore Redis availability rather than data:
- Restart the Redis pod
- Verify connectivity from application services
- Monitor cache hit rates to confirm cache warming
- If session data was lost, expect a brief spike in authentication requests
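Cache warming can be tracked with the `keyspace_hits` and `keyspace_misses` counters that Redis reports in `INFO stats`. The following is a minimal sketch, assuming the INFO text has already been fetched by some other means; `parse_info` and `hit_rate_between` are hypothetical helper names.

```python
from typing import Optional


def parse_info(info_text: str) -> dict:
    """Parse 'key:value' lines from Redis INFO output, skipping section headers."""
    fields = {}
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def hit_rate_between(before: dict, after: dict) -> Optional[float]:
    """Hit rate over the interval between two INFO samples, or None if no lookups."""
    hits = int(after["keyspace_hits"]) - int(before["keyspace_hits"])
    misses = int(after["keyspace_misses"]) - int(before["keyspace_misses"])
    total = hits + misses
    return hits / total if total > 0 else None
```

Sampling twice and taking the difference shows the hit rate for the recent interval rather than the average since startup, so the figure should climb from near zero toward its usual level as the cache refills. Both samples need to come from the same Redis process, since these counters start from zero again after a restart.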