CPS-2408 Analyze Hazelcast Instance Configuration

References

CPS-2408: Analyze the Hazelcast config with regard to the members per instance (Closed)

Current Instance Configuration

 

  • In the current configuration:

    • There is a single cluster, exposed via the environment variable CPS_NCMP_CACHES_CLUSTER_NAME, which defaults to cps-and-ncmp-common-cache-cluster.

    • Each data structure has its own Hazelcast instance per JVM, so for the six data structures below we have 6 instances per JVM.

| # | Data Structure Name | Configuration Name | Instance Name |
|---|---|---|---|
| 1 | moduleSyncWorkQueue | defaultQueueConfig | moduleSyncWorkQueue |
| 2 | moduleSyncStartedOnCmHandles | moduleSyncStartedConfig | moduleSyncStartedOnCmHandles |
| 3 | dataSyncSemaphores | dataSyncSemaphoresConfig | dataSyncSemaphores |
| 4 | trustLevelPerCmHandle | trustLevelPerCmHandleCacheConfig | hazelcastInstanceTrustLevelPerCmHandleMap |
| 5 | trustLevelPerDmiPlugin | trustLevelPerDmiPluginCacheConfig | hazelcastInstanceTrustLevelPerDmiPluginMap |
| 6 | cmNotificationSubscriptionCache | cmNotificationSubscriptionCacheMapConfig | hazelCastInstanceCmNotificationSubscription |
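The current per-structure pattern can be sketched roughly as follows (a minimal illustration, assuming Hazelcast 5.x; the class and method names here are hypothetical — only the cluster, configuration, and data structure names come from the table above):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.QueueConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class PerStructureInstanceExample {

    // One dedicated member just for the module sync work queue; the other
    // five data structures each repeat this pattern with their own Config,
    // which is how we end up with 6 members per JVM.
    static HazelcastInstance createModuleSyncWorkQueueInstance() {
        Config config = new Config("moduleSyncWorkQueue");
        config.setClusterName(System.getenv().getOrDefault(
                "CPS_NCMP_CACHES_CLUSTER_NAME", "cps-and-ncmp-common-cache-cluster"));
        config.addQueueConfig(new QueueConfig("defaultQueueConfig"));
        // Starts a full cluster member (own port, threads, TCP connections)
        // even though it only serves a single queue.
        return Hazelcast.newHazelcastInstance(config);
    }

    public static void main(String[] args) {
        HazelcastInstance instance = createModuleSyncWorkQueueInstance();
        instance.getQueue("moduleSyncWorkQueue").offer("cm-handle-1");
        System.out.println(instance.getQueue("moduleSyncWorkQueue").size());
        instance.shutdown();
    }
}
```

Each `Hazelcast.newHazelcastInstance(...)` call is a full cluster member, so repeating this per data structure multiplies ports, threads, and inter-member traffic.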

 

(Figure: current_config-20241008-200233.png — current instance configuration)

 

 

(Figure: image-20241009-084724.png)

 

Target Instance Configuration

 

  • What we want to achieve:

    • The cluster stays the same, but we limit the number of instances.

    • A single Hazelcast instance holds the configurations of all the different data structures, and we retrieve a data structure from that instance when needed.

    • There is exactly one instance per JVM.

  • How we achieve it:

    • By defining a common (master) configuration covering every data structure type we need. Today we effectively have the same configuration, but we give it a different name every time.

    • CPS_NCMP_INSTANCE_CONFIG_NAME is the environment variable that sets the common config name; it defaults to cps-and-ncmp-hz-instance-config.

    • During initialization, check whether an instance with this configuration already exists; if it does, return the already-initialized instance and use it to get hold of the required data structures.
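The check-then-reuse initialization described above could look like the following (a sketch, assuming Hazelcast 5.x; `SharedInstanceFactory` and `getOrCreate` are hypothetical names — the environment variables, defaults, and data structure names come from this page):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.QueueConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class SharedInstanceFactory {

    // Common instance/config name, overridable via environment variable.
    static final String INSTANCE_NAME = System.getenv().getOrDefault(
            "CPS_NCMP_INSTANCE_CONFIG_NAME", "cps-and-ncmp-hz-instance-config");

    static HazelcastInstance getOrCreate() {
        // If the named member already exists in this JVM, reuse it.
        HazelcastInstance existing = Hazelcast.getHazelcastInstanceByName(INSTANCE_NAME);
        if (existing != null) {
            return existing;
        }
        // One master config carrying every data structure's configuration.
        Config config = new Config(INSTANCE_NAME);
        config.setClusterName(System.getenv().getOrDefault(
                "CPS_NCMP_CACHES_CLUSTER_NAME", "cps-and-ncmp-common-cache-cluster"));
        config.addQueueConfig(new QueueConfig("moduleSyncWorkQueue"));
        config.addMapConfig(new MapConfig("moduleSyncStartedOnCmHandles"));
        config.addMapConfig(new MapConfig("dataSyncSemaphores"));
        config.addMapConfig(new MapConfig("trustLevelPerCmHandle"));
        config.addMapConfig(new MapConfig("trustLevelPerDmiPlugin"));
        config.addMapConfig(new MapConfig("cmNotificationSubscriptionCache"));
        return Hazelcast.newHazelcastInstance(config);
    }

    public static void main(String[] args) {
        HazelcastInstance first = getOrCreate();
        HazelcastInstance second = getOrCreate();
        // Both calls return the same single member per JVM.
        System.out.println(first == second);
        first.getMap("trustLevelPerCmHandle").put("cm-1", "COMPLETE");
        first.shutdown();
    }
}
```

`Hazelcast.getHazelcastInstanceByName` makes the factory idempotent: every caller shares the one member, and individual data structures are still obtained by name via `getQueue(...)` / `getMap(...)`.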


Impacts

  • There would be just one member per JVM as opposed to 6.

  • For a multi-instance setup, say 2 application instances, the new setup gives 2 members as opposed to 12.

    • Fewer members means fewer TCP connections between them.

    • The more members there are, the more chatter is needed between them to replicate data, choose partition owners, etc.

  • Fewer members also means less data to replicate, and since everything is held in memory, we expect some heap space to be freed.

  • Only one port should be exposed per JVM as opposed to 6. (The charts config may need changes to free up the now-unused exposed ports.)

 

Performance Results

Before change

| Parameter | Instance#1 | Instance#2 |
|---|---|---|
| Max Heap Usage | 706 MB | 869 MB |
| Max Number of Threads | 330 | 324 |
| Max CPU Usage | 99.2% | 88.6% |
| Startup Time | 45 secs | 55 secs |

After change

| Parameter | Instance#1 | Instance#2 |
|---|---|---|
| Max Heap Usage | 989 MB | 898 MB |
| Max Number of Threads | 118 | 118 |
| Max CPU Usage | 100% | 100% |
| Startup Time | 18 secs | 28 secs |

 

K6 Performance Comparison

| # | Test Name | Unit | Fs Requirement | Current Expectation | Actual (before fix) | Actual (after fix) | % Change |
|---|---|---|---|---|---|---|---|
| 0 | HTTP request failures for all tests | rate of failed requests | 0 | 0 | 0 | 0 | 0% |
| 1 | Registration of CM-handles | CM-handles/second | 22 | 110 | 236.678 | 222.497 | -6% |
| 2 | De-registration of CM-handles | CM-handles/second | 22 | 90 | 394.921 | 411.514 | 4% |
| 3a | CM-handle ID search with No filter | milliseconds | 2000 | 400 | 540.934 | 177.376 | -205% |
| 3b | CM-handle ID search with Module filter | milliseconds | 2000 | 200 | 276.204 | 88.233 | -213% |
| 3c | CM-handle ID search with Property filter | milliseconds | 2000 | 1300 | 1384.904 | 451.196 | -207% |
| 3d | CM-handle ID search with Cps Path filter | milliseconds | 2000 | 1300 | 1406.466 | 449.769 | -213% |
| 3e | CM-handle ID search with Trust Level filter | milliseconds | 2000 | 10000 | 10553.261 | 2107.99 | -401% |
| 4a | CM-handle search with No filter | milliseconds | 15000 | 14000 | 14407.047 | 3313.815 | -335% |
| 4b | CM-handle search with Module filter | milliseconds | 15000 | 16000 | 16569.609 | 3963.851 | -318% |
| 4c | CM-handle search with Property filter | milliseconds | 15000 | 16000 | 17245.757 | 4277.553 | -303% |
| 4d | CM-handle search with Cps Path filter | milliseconds | 15000 | 16000 | 5188.456 | 4306.489 | -20% |
| 4e | CM-handle search with Trust Level filter | milliseconds | 15000 | 26000 | 26429.774 | 5939.964 | -345% |
| 5a | NCMP overhead for Synchronous single CM-handle pass-through read | milliseconds | 40 | 30 | 745.818 | 16.33 | -4467% |
| 5b | NCMP overhead for Synchronous single CM-handle pass-through read with alternate id | milliseconds | 40 | 60 | 780.423 | 26.794 | -2813% |
| 6a | NCMP overhead for Synchronous single CM-handle pass-through write | milliseconds | 40 | 30 | 1606.115 | 17.382 | -9140% |
| 6b | NCMP overhead for Synchronous single CM-handle pass-through write with alternate id | milliseconds | 40 | 60 | 1639.396 | 29.645 | -5430% |
| 7 | Legacy batch read operation | events/second | 150 | 1500 | 5771.839 | 3882.515 | -49% |