HA CSIT Testing

This page lists candidate tests for verifying the HA (high availability) functionality of ACM and its participants before delivery. Following a meeting about this document, several comments were added that give direction to some tests and eliminate others. A list of the tests that should be an immediate priority will be added at the end of the page.

Functional Tests

Basic Functionality Test:

  • Verify that all microservices work correctly with a single replica. 

  • Increase the number of replicas and ensure all functionality still works as expected. 

Message Processing Test:

  • Ensure that messages sent via Kafka are consumed and processed correctly by multiple replicas. 

  • Validate that no message is processed more than once and that no message is lost. 

  • Ensure that each message on the runtime policy operation topic is delivered to only one replica. 

  • Ensure that each message on the sync topic is delivered to all replicas. 
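
The message checks above could be automated roughly as follows. This is a minimal sketch, assuming the test harness can collect a list of (replica, message ID) records from the replicas' logs or metrics; the record format and the function names `check_delivery` and `check_broadcast` are hypothetical and not part of ACM.

```python
from collections import Counter


def check_delivery(sent_ids, processed):
    """For a topic where each message must be handled exactly once.

    processed is a list of (replica, message_id) records (assumed format).
    Returns (lost, duplicated) as sorted lists of message IDs.
    """
    counts = Counter(message_id for _replica, message_id in processed)
    lost = sorted(set(sent_ids) - set(counts))
    duplicated = sorted(mid for mid, n in counts.items() if n > 1)
    return lost, duplicated


def check_broadcast(sent_ids, processed, replicas):
    """For a topic where every replica must receive every message.

    Returns {replica: [missing message IDs]} for replicas that missed any.
    """
    seen = {replica: set() for replica in replicas}
    for replica, message_id in processed:
        seen.setdefault(replica, set()).add(message_id)
    expected = set(sent_ids)
    return {
        replica: sorted(expected - ids)
        for replica, ids in seen.items()
        if expected - ids
    }
```

An empty result from both functions would indicate that the operation topic reached exactly one replica per message and the sync topic reached all of them.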

 

Database Access Test (maybe): 

  • Verify that the microservice accessing the database handles concurrent access correctly with multiple replicas. 

  • Ensure that read and write operations are consistent and correct. 

Data Consistency Test: 

  • Test that the data remains consistent across multiple replicas when performing CRUD operations for ACM through REST API.
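
One way to check consistency is to read the same resource through each replica and compare the results. The sketch below assumes the test has already fetched a JSON body from every replica and holds them in a dictionary keyed by replica name; how the bodies are fetched, and the name `group_by_response`, are assumptions for illustration only.

```python
import json


def group_by_response(responses):
    """Group replicas by the content of the response they returned.

    responses maps replica name -> decoded JSON body (assumed input).
    Key order inside the bodies is ignored. One group means all replicas agree.
    """
    groups = {}
    for replica, body in responses.items():
        key = json.dumps(body, sort_keys=True)
        groups.setdefault(key, []).append(replica)
    return list(groups.values())


responses = {
    "replica-0": {"name": "demo", "state": "DEPLOYED"},
    "replica-1": {"state": "DEPLOYED", "name": "demo"},
    "replica-2": {"name": "demo", "state": "UNDEPLOYED"},
}
groups = group_by_response(responses)
```

Here `groups` has two entries, so the test would flag `replica-2` as out of step with the other two.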

Performance Tests – add testing with replicas... 

Load Testing: 

  • Simulate high load on the microservices and monitor their performance. 

  • Verify that the system can handle the load with multiple replicas without degradation. 

  • Comparing these results at the end of testing will be useful. 

Throughput Testing: 

  • Measure the throughput of message processing with different numbers of replicas. 

  • Ensure that increasing replicas improves or maintains the throughput. 
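
The throughput comparison can be reduced to a small calculation once each run has been timed. The following is a sketch under stated assumptions: the measurements are supplied by the test harness, the 5% tolerance is an arbitrary illustrative value, and the function names are hypothetical.

```python
def throughput(message_count, elapsed_seconds):
    """Messages processed per second for one test run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return message_count / elapsed_seconds


def find_regressions(results, tolerance=0.05):
    """Find steps where adding replicas lowered throughput.

    results maps replica count -> messages per second.
    Returns (fewer_replicas, more_replicas) pairs where throughput dropped
    by more than the given fraction (tolerance is an assumed threshold).
    """
    ordered = sorted(results.items())
    regressions = []
    for (r1, t1), (r2, t2) in zip(ordered, ordered[1:]):
        if t2 < t1 * (1 - tolerance):
            regressions.append((r1, r2))
    return regressions


results = {1: 100.0, 2: 180.0, 3: 150.0}
regressions = find_regressions(results)
```

With these sample figures, the step from two to three replicas is reported because throughput fell from 180 to 150 messages per second.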

Reliability and Scalability Tests 

Replica Scaling Test: 

  • Dynamically scale the number of replicas up and down and ensure the system continues to function correctly. 

  • Verify that new replicas start correctly and join the existing system seamlessly. 

Failover Test: 

  • Simulate failure of one or more replicas and verify that the system continues to function correctly. 

  • Ensure that there is no data loss and in-flight messages are processed by other replicas. 

  • Test that, after a participant crashes and a new replica comes up, operations still work as expected. 

  • Test that, after a participant replica crashes in the middle of an operation such as DEPLOY and a new replica comes up, the user can successfully re-run the operation. 
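
The re-run scenario above could be scripted with a small retry helper. This is only a sketch: `operation` stands for a hypothetical callable that wraps the real request (for example, a DEPLOY call), and treating `ConnectionError` as the signal that a replica is down is an assumption.

```python
import time


def run_with_retry(operation, attempts=3, delay_seconds=1.0):
    """Call operation until it succeeds or attempts are exhausted.

    Returns (result, attempt_number). Re-raises the last ConnectionError
    if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation(), attempt
        except ConnectionError as exc:
            last_error = exc
            # Give the replacement replica time to come up before retrying.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

A test could then assert that the operation succeeds on a later attempt and that the resulting state matches what a run without a crash produces.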

 

Integration Tests 

End-to-End Testing: 

  • Test the entire workflow involving all microservices with multiple replicas. 

  • Verify that the communication between microservices over Kafka is correct and the system meets the expected behavior. 

Consistency and Synchronization Tests 

Session Management Test: 

  • Verify that user sessions are managed correctly across multiple replicas. 

  • Ensure that session data is consistent and available to all replicas. 

Distributed Locking Test: 

  • Test any distributed locking mechanisms to ensure they work correctly with multiple replicas. 

  • Verify that locks are acquired and released properly to prevent data corruption or race conditions. 

Monitoring and Logging Tests 

Logging and Monitoring Test: 

  • Ensure that logging works correctly across all replicas. 

  • Verify that monitoring tools correctly report the status and metrics of each replica. 

Health Check and Auto-recovery Test: 

  • Implement and test health checks to ensure that unhealthy replicas are detected and handled correctly. 

  • Verify that auto-recovery mechanisms (like restarting failed replicas) work as expected. 
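
An auto-recovery test needs a way to wait for a replica to report healthy again after it has been killed. The helper below is a minimal sketch: `probe` is a hypothetical callable that returns True when the replica's health endpoint reports healthy, and the timeout and interval defaults are illustrative values.

```python
import time


def wait_until_healthy(probe, timeout_seconds=60.0, interval_seconds=1.0):
    """Poll probe until it returns True or the timeout expires.

    Returns True if the replica became healthy in time, otherwise False.
    The probe is always called at least once.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

The test would kill a replica, call this helper, and fail if it returns False within the recovery time the team considers acceptable.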

Regression Tests 

Backward Compatibility Test: 

  • Ensure that the new multi-replica setup does not break any existing functionality. 

  • Run all existing test cases to verify backward compatibility. 

Stress Tests 

Stress Testing: 

  • Push the system to its limits with maximum load and number of replicas. 

  • Observe the system's behavior and ensure it degrades gracefully without crashing.

Priority Tests

List tests here