CPS-2169: NCMP to support 200k Cells (50k CM-handles)

References

https://lf-onap.atlassian.net/browse/CPS-2169 

Issues & Decisions

Issue

Notes 

Decision

1

Review the use case returning all 50k CM Handles

Is the use case valid in which ALL 50k CM handles are expected to be returned every time?

AP @Csaba Kocsis & @kieran mccarthy 

 

Jan 30, 2025

  1. Returning all 50k CM handles will impact speed and overall performance. Focus on Read & Write.

  2. Review use cases that don't need to return all 50k CM handles. @kieran mccarthy

May 13, 2025 @Toine Siebelink All KPIs are now signed off.

2

Proposal to remove Hazelcast for NCMP Module sync

The use of Hazelcast during NCMP's CM-handle Module Sync is leading to:

  1. High memory usage during CM-handle registration

  2. Consistency problem

  3. Poor load balancing between NCMP instances for module sync

The proposal to remove Hazelcast from NCMP Module Sync requires an LCM state machine change. Stakeholder input is needed.

Proposal:  CPS-2161: Remove Hazelcast from NCMP Module Sync

Mar 26, 2025 @Daniel Hanrahan @Toine Siebelink

3

Impacts CM Event Storm

Will require a special test setup (possibly one-off testing within the CPS team) to assess the impact of a CM Event Storm on KPIs

May 12, 2025 @Csaba Kocsis @kieran mccarthy All KPIs should still be met during a CM Event Storm.
See https://lf-onap.atlassian.net/browse/CPS-2808

Requirements

Characteristics (only)

Operations

Parallel / Sequential

Performance

Notes

Status Apr 24, 2025

Signoff

1

Registration & De-registration (Discovery) in batches of 100 per request.

  • With moduleSetTag &

  • With alternateID

Parallel

  1. How long should it take a CM handle to reach the READY state? 11 CM handles/second per NCMP instance.

  2. Overall budget; @Peter Turcsanyi
    a. ADVISED → LOCKED → READY: ?
    b. ADVISED → READY: ?

  3. Initial delay: currently 2 min, to move to 3 min
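As a rough illustration of what these figures imply, the sketch below estimates how many batch requests are needed and how long it would take for all CM handles to reach READY. The handle count (50k), batch size (100) and per-instance rate (11 CM handles/second) come from this page. The function name, the instance counts, and the assumption that throughput scales linearly across NCMP instances (with the initial delay ignored) are illustrative only.

```python
import math


def registration_estimate(cm_handles, batch_size, rate_per_instance, instances):
    """Return (batch request count, seconds until all handles are READY).

    Assumes (hypothetically) that throughput scales linearly with the
    number of NCMP instances and ignores the initial delay.
    """
    requests = math.ceil(cm_handles / batch_size)
    seconds = cm_handles / (rate_per_instance * instances)
    return requests, seconds


for instances in (1, 2, 4):  # instance counts are illustrative, not from the page
    requests, seconds = registration_estimate(50_000, 100, 11, instances)
    print(f"{instances} instance(s): {requests} requests, ~{seconds / 60:.1f} min")
```

With a single instance this works out to 500 requests and a little over 75 minutes, which is the kind of figure the overall budget in item 2 would need to accommodate.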

 

  1. Registration is about 5 times faster

  2. De-registration is about 9.5 times faster

Adjust both blue lines

Apr 28, 2025

Agreed with @kieran mccarthy @Csaba Kocsis

2

CM-handle ID Search

5 Parallel

  1. No frequency increase expected

  2. Shall a search return all 50k CM handles?

  3. Searches shall return all 50k CM handles within 1 minute max (TBD)

Aug 28, 2024

  1. 5 parallel requests each of ID search and search, for a combined total of 10 parallel search requests.

Jan 30, 2025

@kieran mccarthy

  1. These tests now include returning alternate IDs

  2. KPIs have not been adjusted for network size

  3. The module filter is NOT currently meeting the KPI when returning all 50k CM handles; the numbers for this need to be renegotiated

Apr 29, 2025

Agreed to allow 10% above the KPI for every additional 10k CM handles

50k: 2 sec + 3 * 0.2 sec = 2.6 sec

@kieran mccarthy @Csaba Kocsis
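The agreed allowance can be read as a linear surcharge on the base KPI. The sketch below reproduces the 2.6-second figure. The 2-second base and the 10% step come from this page; the 20k CM-handle baseline is an assumption inferred from the "3 * 0.2" term (three 10k steps from 20k to 50k), and the function and parameter names are hypothetical.

```python
def scaled_kpi(base_seconds, allowance_per_10k, cm_handles, baseline_handles=20_000):
    """Add `allowance_per_10k` of the base KPI for each additional 10k CM
    handles above the baseline.

    The 20k baseline is an assumption inferred from the page's arithmetic.
    """
    extra_steps = (cm_handles - baseline_handles) / 10_000
    return base_seconds * (1 + allowance_per_10k * extra_steps)


id_search_kpi = scaled_kpi(base_seconds=2.0, allowance_per_10k=0.10, cm_handles=50_000)
print(f"ID search KPI at 50k CM handles: {id_search_kpi:.1f} s")
```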

3

CM-handle search

5 Parallel

  1. No frequency increase expected

  2. Shall a search return all 50k CM handles?

  3. Searches shall return all 50k CM handles within 1 minute max (TBD)

Aug 28, 2024

  1. 5 parallel requests each of ID search and search, for a total of 10 parallel search requests.

Jan 30, 2025

No change TBD AP; @Peter Turcsanyi @kieran mccarthy

  1. KPIs have not been adjusted for network size

  2. Returning all 50k CM handles

  3. Returning ALL public properties.

  4. Roughly 30% below the KPI

Apr 29, 2025

Agreed to allow 20% above the KPI for every additional 10k CM handles

50k: 15 sec + 3 * 3 sec = 24 sec
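Written out, the agreed allowance appears to be a linear surcharge on the base KPI. The 15-second base and the 20% step come from this page; the 20k CM-handle baseline (and hence three steps of 10k) is an assumption inferred from the arithmetic above.

```latex
T_{50k} = T_{\text{base}} \left( 1 + 0.20 \cdot \frac{50\,000 - 20\,000}{10\,000} \right)
        = 15\,\text{s} \times (1 + 0.20 \times 3)
        = 15\,\text{s} + 3 \times 3\,\text{s}
        = 24\,\text{s}
```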

4

Synchronous single CM-handle pass-through Read (CUD)

4 parallel Operations

Review the current FS numbers

  1. What will the frequencies be for write requests? AP: TBD

  2. Frequencies will increase (x2.5): 10 req/sec → 25 req/sec

  3. Response time should stay the same as current - NCMP Characteristics

Nov 19, 2024

  1. Stick with 10 req/sec → 25 req/sec (should CPS PoC it?) to measure the current number of requests.

  2. CPS to measure current capacity

  3. CPS will engage in setting the frequency (based on previous results)

  4. No change to delay is expected

 

Jan 30, 2025

The policy executor has a different test suite and won't be impacted by this

@kieran mccarthy

  1. Overhead-based KPIs are being comfortably exceeded

  2. Frequencies have been increased

  3. This includes alternate ID input

Apr 28, 2025

Agreed with @Csaba Kocsis @kieran mccarthy

5

Synchronous single CM-handle pass-through Write

4 (Parallel operations)

  1. Frequencies will increase (x2.5): currently 5 req/sec for 80k cells → 12.5 req/sec for 200k cells

  2. Response time should stay the same as current NCMP Characteristics (delay, size)
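The 2.5 factor matches the ratio of the target network size to the current one (200k / 80k cells). A minimal sketch of that scaling follows, applied to the write frequency above and to the 10 req/sec read frequency mentioned for pass-through read. The constant and function names are hypothetical, and treating the read figure as an 80k-cell baseline is an assumption.

```python
CURRENT_CELLS = 80_000
TARGET_CELLS = 200_000


def scaled_frequency(current_req_per_sec):
    """Scale a request frequency in proportion to the cell count."""
    return current_req_per_sec * TARGET_CELLS / CURRENT_CELLS


print(f"scale factor: {TARGET_CELLS / CURRENT_CELLS}")
print(f"read:  10 req/sec -> {scaled_frequency(10)} req/sec")
print(f"write:  5 req/sec -> {scaled_frequency(5)} req/sec")
```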

Jan 30, 2025 @kieran mccarthy

  1. CPS to measure current capacity

  2. CPS will engage in setting the frequency (based on previous results)

  3. Share with stakeholders

  4. No change to delay is expected

  5. No volume increase expected

 

  1. Overhead-based KPIs are being comfortably exceeded

  2. Frequencies have been increased

  3. This includes alternate ID input

Apr 28, 2025 Agreed with @kieran mccarthy @Csaba Kocsis

6

Batch Read Operation/Legacy 

 

  1. Frequencies will increase (2.5)

  2. Same load 200 per request (Same response size)

  3. Same duration of test, No changes to volume expected

  4. Responses are expected back in 80 sec.

Investigation needed because target is not met at this point

Jan 30, 2025

https://lf-onap.atlassian.net/browse/CPS-2607

  1. About 15 times faster here.

  2. Also includes alternate ID input

Apr 28, 2025 Agreed with @kieran mccarthy @Csaba Kocsis

7

CM change Notification Event

  1. NCMP shall support a CM notification load of 200 million CM change notifications per day with an average of 870 notification

  2. NCMP shall support a peak CM change notification load of 7k/s for a duration of 5 minutes

 

  1. 2,300/sec (base/average load)

  2. 8.75k/sec (peak load) TBC @kieran mccarthy

  3. Load distribution discussion still pending
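For orientation, a daily volume can be converted to an average per-second rate, and a sustained burst to a total event count. The sketch below does both for the 200 million/day and 7k/s-for-5-minutes figures in the requirement. The function names are hypothetical and the calculation assumes a perfectly even spread over 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(events_per_day):
    """Average events per second for a given daily volume."""
    return events_per_day / SECONDS_PER_DAY


def events_in_burst(rate_per_sec, minutes):
    """Total events delivered by a sustained burst."""
    return rate_per_sec * minutes * 60


print(f"200 million/day is about {average_rate(200_000_000):,.0f} notifications/sec")
print(f"7k/s for 5 minutes is {events_in_burst(7_000, 5):,} notifications")
```

An even spread of 200 million notifications per day comes to roughly 2,315 per second, in line with the 2,300/sec base load noted above.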

Jan 30, 2025

Use the FS numbers multiplied by 2.5 as the source of truth

  1. https://lf-onap.atlassian.net/wiki/spaces/DW/pages/16549678 , currently in development

  2. Apr 28, 2025

    1. Confirm the actual KPI with @Csaba Kocsis @kieran mccarthy: 2,660 AVC events/sec. The current CPS background load is 2,750 AVCs/sec. Can handle 5,000 AVCs/sec; see report [CPS-2787] Increase CM Notification Event KPI Background Load

May 12, 2025 @kieran mccarthy @Csaba Kocsis

 

CM Event Background Load Specification

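| Network size               | Event type    | Base Rate (AVCs/sec) | Peak Rate* (AVCs/sec) | Storm Rate** (AVCs/sec) |
|----------------------------|---------------|----------------------|-----------------------|-------------------------|
| 80k cells (20k CM handles) | AVC           | 368                  | 752                   | 4,744                   |
| 80k cells (20k CM handles) | Create/Delete | 288                  | 312                   | 312                     |
| 80k cells (20k CM handles) | Total         | 656                  | 1,064                 | 5,056                   |
| 200k cells (50k CM handles)| AVC           | 920                  | 1,880                 | 11,860                  |
| 200k cells (50k CM handles)| Create/Delete | 720                  | 780                   | 780                     |
| 200k cells (50k CM handles)| Total         | 1,640                | 2,660                 | 12,640                  |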

* Peak Rate happens 96 times a day for 5 minutes each time.
** Storm Rate happens 6 times a day for 6 minutes each time.
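The 200k-cell rates are the 80k-cell rates multiplied by 2.5, and the rate profile can be turned into an approximate daily event volume. The sketch below checks both. The rates and the peak/storm schedule come from this specification. The names are hypothetical, and the daily figure assumes that peak and storm windows never overlap and that the base rate applies for the rest of the day.

```python
SCALE = 200_000 / 80_000  # 2.5

# Rates for 80k cells in AVCs/sec: (base, peak, storm)
RATES_80K = {
    "AVC": (368, 752, 4_744),
    "Create/Delete": (288, 312, 312),
}


def scale_rates(rates, factor):
    """Multiply every rate by the network-size factor."""
    return {kind: tuple(r * factor for r in row) for kind, row in rates.items()}


def totals(rates):
    """Column-wise totals: (base, peak, storm)."""
    return tuple(sum(col) for col in zip(*rates.values()))


def daily_volume(base, peak, storm):
    """Events per day, assuming non-overlapping peak and storm windows
    with the base rate applying for the remainder of the day."""
    peak_s = 96 * 5 * 60   # 96 peaks of 5 minutes
    storm_s = 6 * 6 * 60   # 6 storms of 6 minutes
    base_s = 24 * 60 * 60 - peak_s - storm_s
    return base * base_s + peak * peak_s + storm * storm_s


rates_200k = scale_rates(RATES_80K, SCALE)
base, peak, storm = totals(rates_200k)
print(f"200k cells totals: base={base:,.0f} peak={peak:,.0f} storm={storm:,.0f}")
print(f"approx. events/day: {daily_volume(base, peak, storm):,.0f}")
```

Under those assumptions the 200k-cell profile comes to roughly 195 million events per day, which is in the same range as the 200 million/day notification requirement.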

Out of Scope

  • Datajob, which is still in development, is out of scope

  • For now, paging is out of scope; open to re-scoping depending on the NCMP spike result