CPS-438 Make DB Schema Updates & Data Population More Robust for Kubernetes Environments
References
https://lf-onap.atlassian.net/browse/CPS-438
Background
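Kubernetes restarts the CPS container if start-up takes too long. If the container is restarted while Liquibase is running, Liquibase does not release the changelog lock it holds in the database (the DATABASECHANGELOGLOCK table), so the lock is still set when the container starts again.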
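The stale-lock behaviour described above can be illustrated with a deliberately simplified model. This is a sketch only: the table and column names follow Liquibase's DATABASECHANGELOGLOCK table, but the functions `try_acquire`, `release` and `run_migration`, the pod names, and the in-memory SQLite database are hypothetical stand-ins and not Liquibase's actual implementation.

```python
import sqlite3


def create_lock_table(conn):
    # Simplified version of the Liquibase lock table: one row, one flag.
    conn.execute(
        "CREATE TABLE DATABASECHANGELOGLOCK ("
        "ID INTEGER PRIMARY KEY, LOCKED INTEGER NOT NULL, LOCKEDBY TEXT)"
    )
    conn.execute(
        "INSERT INTO DATABASECHANGELOGLOCK (ID, LOCKED, LOCKEDBY) VALUES (1, 0, NULL)"
    )
    conn.commit()


def try_acquire(conn, owner):
    # The lock is taken only if nobody currently holds it.
    cur = conn.execute(
        "UPDATE DATABASECHANGELOGLOCK SET LOCKED = 1, LOCKEDBY = ? "
        "WHERE ID = 1 AND LOCKED = 0",
        (owner,),
    )
    conn.commit()
    return cur.rowcount == 1


def release(conn):
    conn.execute(
        "UPDATE DATABASECHANGELOGLOCK SET LOCKED = 0, LOCKEDBY = NULL WHERE ID = 1"
    )
    conn.commit()


def run_migration(conn, owner, killed_midway=False):
    if not try_acquire(conn, owner):
        return "waiting for changelog lock"
    if killed_midway:
        # Models a container restart: the release step never runs,
        # so the committed lock row stays set.
        return "killed"
    release(conn)
    return "migrated"


conn = sqlite3.connect(":memory:")
create_lock_table(conn)
first = run_migration(conn, "pod-a", killed_midway=True)
second = run_migration(conn, "pod-a-restarted")
print(first, "/", second)
```

Because the lock is an ordinary committed row rather than something tied to the database session, nothing clears it when the holder disappears. That is the property options 6 and 8 in the table below try to change, while options 2, 3 and 9 try to avoid the restart in the first place.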
Possible Fixes:
| # | Name | Description | Cost/Maintainability | Agnostic of database technology | Separation of NCMP data | Upgradability/Rollback | Additional pros/cons | One instance does initialisation |
|---|---|---|---|---|---|---|---|---|
| 1 | Liquibase init container | | | Yes, including Neo4j | Possible, but needs some refactoring (labelling) | Good | Good control of database versioning | Yes |
| 2 | Change/add liveness probes? | Kubernetes restarts the Liquibase container because it does not pass a readiness probe within a certain amount of time. We could extend the time limit, change the restart condition, etc. | | No change | No change | Good | | Yes |
| 3 | Start-up probe? | Using a start-up probe, we can define a worst-case start-up time that Kubernetes will wait for before restarting the container. | | No change | No change | Good | | Yes |
| 4 | Remove Liquibase and replace with similar technology | Replace Liquibase with Flyway. | | Flyway does not support NoSQL (Neo4j) | Possible | Good | Would solve https://lf-onap.atlassian.net/browse/CPS-963. Might come with the same issue as Liquibase, as this is more of a Kubernetes issue? | Yes |
| 5 | Use cps-core API | Code/script triggered by Spring Boot that persists the required data. | | Yes, including Neo4j | Easy | Requires some code | Would solve https://lf-onap.atlassian.net/browse/CPS-963. Do we need database migration technologies (rollback etc.)? | Yes |
| 6 | Use session lock instead of transaction lock for Liquibase | https://mvnrepository.com/artifact/com.github.blagerweij/liquibase-sessionlock/1.2.5 The changelog lock is released once the session is dropped by the Liquibase container. | | Yes, including Neo4j | Possible, but needs some refactoring (labelling) | Good | | No |
| 7 | Execute Liquibase logic in Spring Boot service start-up | Solution 3: https://localcoder.org/how-to-solve-liquibase-waiting-for-changelog-lock-problem-in-several-pods-in-ope Liquibase start-up is contained within CPS start-up, so the Kubernetes Liquibase setup can be avoided. | | Yes, including Neo4j | Possible, but needs some refactoring (labelling) | Good | Spring Boot supported solution | Yes |
| 8 | Pre-stop hook? | Remove the changelog lock before the CPS container restart occurs. | | Yes, including Neo4j | Possible, but needs some refactoring (labelling) | Good | | No |
| 9 | Move liveness probes before Liquibase | Start the liveness probes before Liquibase starts. | | Yes, including Neo4j | Possible, but needs some refactoring (labelling) | Good | | Yes |
Resolution
Agreed to implement solution 3 (start-up probe), resulting in https://lf-onap.atlassian.net/browse/CPS-1011
This involves updates to the OOM project, pending its upgrade to Kubernetes 1.20+.
Our documentation will also be updated to show how to implement this change for Kubernetes. A recommended start-up time should be proposed based on measured Liquibase start times. https://lf-onap.atlassian.net/browse/CPS-1013
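As a rough illustration of how a recommended start-up time could be derived, the sketch below turns a set of measured start times into start-up probe settings. It relies on the Kubernetes rule that a start-up probe gives the container up to `failureThreshold × periodSeconds` to start. The function name `propose_startup_probe`, the sample timings, the safety factor of 2, and the `maxStartupSeconds` key (a derived value, not a Kubernetes field) are all assumptions made for this example, not agreed CPS values.

```python
import math


def propose_startup_probe(measured_start_seconds, period_seconds=10, safety_factor=2.0):
    """Suggest start-up probe settings from observed start times (in seconds)."""
    if not measured_start_seconds:
        raise ValueError("at least one measured start time is required")
    # Budget for the slowest observed start, padded by the safety factor.
    worst_case = max(measured_start_seconds) * safety_factor
    # Round up so the probe window is never shorter than the budget.
    failure_threshold = math.ceil(worst_case / period_seconds)
    return {
        "periodSeconds": period_seconds,
        "failureThreshold": failure_threshold,
        # Derived for readability only; not a Kubernetes probe field.
        "maxStartupSeconds": failure_threshold * period_seconds,
    }


# Hypothetical Liquibase start times, in seconds.
settings = propose_startup_probe([42.0, 65.5, 58.0])
print(settings)
```

With these sample numbers, the slowest start (65.5 s) doubled gives a 131 s budget, which rounds up to 14 probe periods of 10 s, so Kubernetes would wait up to 140 s before restarting the container.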
Liquibase performance will be reviewed, and the table above may be referred back to for alternative solutions. https://lf-onap.atlassian.net/browse/CPS-1012