Related links
- SDNC : Beijing : Geo-redundant SDNC : Manual Failover
- SDNC: Geo-redundancy Enhancements
- SDN-C Geo-Redundancy Notifications
- Geo-Redundancy Flows in SDN-C
- Potential challenges of deploying SDN-C Geo-redundancy solution under Kubernetes Federation
- OOM Geo-Red Active-Active via Affinity/AntiAffinity
- Deployment of Geo-Redundant SDN-C
Table of Contents
Levels of Redundancies
- Level 0: no redundancy
- Level 1: support manual failure detection & rerouting or recovery within a single site; expected to complete within 30 minutes
- Level 2: support automated failure detection & rerouting
- within a single geographic site
- stateless components: establish baseline measure of failed requests for a component failure within a site
- stateful components: establish baseline of data loss for a component failure within a site
- Level 3: support automated failure detection & rerouting
- across multiple sites
- stateless components
- improve on # of failed requests for component failure within a site
- establish baseline for failed requests for site failure
- stateful components
- improve on data loss metrics for component failure within a site
- establish baseline for data loss for site failure
Level 3 redundancy → Geo-Redundancy
Geo-redundancy types
active / standby
cold standby
After a health-check failure is detected, the administrator manually powers on the standby components and configures all affected components. Stateful components are initialized with the latest backup.
warm standby
Resources of the standby components are allocated, and the standby is periodically powered on to synchronize the stateful components. After a health-check failure is detected, the administrator manually configures all affected components.
hot standby
Resources of the standby components are allocated and powered on for periodic synchronization of the stateful components. After automatic health-check failure detection, algorithms automatically configure all affected components.
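As a minimal sketch of the hot-standby behavior described above, the loop below detects repeated health-check failures and then promotes the standby automatically. The `health_check` and `promote_standby` callables, the failure threshold, and the class itself are illustrative assumptions, not part of any SDN-C or OOM API.

```python
class FailoverMonitor:
    """Sketch: automatic failover after repeated health-check failures."""

    def __init__(self, health_check, promote_standby, threshold=3):
        self.health_check = health_check        # assumed callable: True if active site is healthy
        self.promote_standby = promote_standby  # assumed callable: reconfigures standby as active
        self.threshold = threshold              # consecutive failures before failing over
        self.failures = 0
        self.failed_over = False

    def tick(self):
        """Run one health-check cycle; trigger failover once the threshold is reached."""
        if self.failed_over:
            return
        if self.health_check():
            self.failures = 0  # any successful check resets the counter
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.promote_standby()
                self.failed_over = True
```

A threshold of consecutive misses (rather than a single failed probe) is one common way to avoid failing over on a transient network glitch; the real detection interval and threshold would be deployment-specific.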
active / active
Cluster
High-availability clusters synchronize data across the cluster nodes almost in real time. If one node fails, the cluster remains fully functional.
Such a setup requires low latency between the different geo-locations, which is unlikely in production deployments.
Stay independent and sync changes (open question)
Let's assume there are two independent systems starting from scratch, with all databases filled with the same initial data. The databases stay in sync as long as all updates are applied identically on both systems. For data coming from the network there are usually mechanisms to ensure this: event duplication to both systems and auto-refresh using polling. The open question is: how to sync SET requests issued through portals by users or by other northbound applications (e.g. planning tools)?
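One conceivable answer is to fan out every northbound SET request to both sites in the same order, so the independently running databases receive identical update streams. The sketch below illustrates that idea with plain dictionaries standing in for the two sites' stores; the replicator class, the site names, and the single-writer assumption are all hypothetical, not a description of how SDN-C actually handles this.

```python
class SetRequestReplicator:
    """Sketch: duplicate northbound SET requests to two independent sites.

    Assumes a single writer, so both sites see updates in the same order;
    the per-site stores here are plain dicts standing in for real databases.
    """

    def __init__(self, sites):
        self.sites = sites  # assumed mapping: site name -> key/value store
        self.log = []       # ordered log of applied SETs, for replay/recovery

    def set(self, key, value):
        """Apply one SET to every site, recording it in the shared log."""
        self.log.append((key, value))
        for store in self.sites.values():
            store[key] = value

    def in_sync(self):
        """Check that all sites currently hold identical data."""
        stores = list(self.sites.values())
        return all(store == stores[0] for store in stores[1:])
```

In practice the hard parts are exactly what this sketch elides: concurrent writers at both sites, ordering across a WAN, and replaying the log after one site was unreachable.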