Known Gaps & Issues - Casablanca
The most noticeable gap is that Holmes currently relies on pre-defined control loop names for closed-loop operations. Ideally, all control loop names would be generated dynamically during the closed loop design phase in CLAMP, but Holmes has not been fully integrated with CLAMP yet. As a workaround, the rules for the different scenarios are written in advance in the DCAE component specification file, and at design time DCAE passes these default rules to CLAMP. The rules are then edited in CLAMP with a simple text editor, a pre-defined control loop name is prepended to them, and they are passed to Holmes via DCAE-CBS. Because the rules are hardcoded in the DCAE component specification files, this approach leaves little flexibility for designing rules. Moreover, continuing to use pre-defined control loop names will severely limit flexible lifecycle control of control loops.
Separate the DSA from the engine management module.
Optimize the auto-scaling mechanism of the engine management module. Currently, when scaling, every Holmes engine instance keeps an identical copy of the full alarm set, while the rules are evenly distributed among the instances. This does little to improve performance or to reduce the load on each instance. Worse, in a scale-out scenario it may cause an endless loop that continues until the number of instances reaches the limit configured in Kubernetes. We should consider filtering alarms in the DSA so that each engine instance receives only the alarms that match the criteria of the rules residing in that instance. This would significantly reduce the number of useless alarms.
Improve the user experience of the Holmes rule editor. A rule editor has been added to the rule management module, but it is only a simple text editor. Adding keyword highlighting, auto-completion, and auto-alignment to the existing editor would make it much more user-friendly.
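The alarm filtering proposed in the auto-scaling item could look roughly like the sketch below. It is only an illustration under stated assumptions, not Holmes code: the `Alarm` and `EngineInstance` types, the `dispatch` function, and the idea of matching on alarm name alone are all hypothetical, and the real matching criteria would have to be derived from the rules deployed on each engine instance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Alarm:
    # Hypothetical minimal alarm model; real alarms carry many more fields.
    name: str
    source: str


@dataclass
class EngineInstance:
    instance_id: str
    # Assumption: the criteria of the rules on this instance can be reduced
    # to the set of alarm names those rules refer to.
    accepted_alarm_names: set = field(default_factory=set)
    received: list = field(default_factory=list)


def dispatch(alarm, engines):
    """Forward the alarm only to engines whose rules can use it.

    Returns the ids of the engine instances that received the alarm.
    """
    targets = [e for e in engines if alarm.name in e.accepted_alarm_names]
    for engine in targets:
        engine.received.append(alarm)
    return [engine.instance_id for engine in targets]


if __name__ == "__main__":
    engines = [
        EngineInstance("engine-a", {"LinkDown"}),
        EngineInstance("engine-b", {"CpuHigh"}),
    ]
    # Only engine-a holds a rule that references LinkDown.
    print(dispatch(Alarm("LinkDown", "vnf-1"), engines))
    # No rule references this alarm, so no instance receives it.
    print(dispatch(Alarm("FanFailure", "vnf-2"), engines))
```

With a filter of this kind, an alarm that no deployed rule refers to is dropped before it reaches any engine, and adding an instance no longer means copying the whole alarm set to it.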