Tips on tuning akka.conf and the ...datastore.cfg files for clustering

2019-05-13: Original post

jhartley (at) luminanetworks (dot) com – for any questions



Goals for akka.conf:

  1. Specify THIS cluster member's resolvable FQDN or IP address. (Tip: Use FQDNs, and ensure they're resolvable in your env.)

  2. List all cluster members in the seed-nodes list.

  3. Tune optional variables, noting that the defaults for many of these are far too low.

  4. Keep this file ~identical on all instances; only the "roles" and "hostname" are unique to this member.



Example of a 3-node configuration, tuned:

odl-cluster-data {
  akka {
    loglevel = ""
    remote {
      netty.tcp {
        hostname = "odl1.region.customer.com"
        port = 2550
      }
      use-passive-connections = off
    }
    actor {
      debug {
        autoreceive = on
        lifecycle = on
        unhandled = on
        fsm = on
        event-stream = on
      }
    }
    cluster {
      seed-nodes = [
        "akka.tcp://opendaylight-cluster-data@odl1.region.customer.com:2550",
        "akka.tcp://opendaylight-cluster-data@odl2.region.customer.com:2550",
        "akka.tcp://opendaylight-cluster-data@odl3.region.customer.com:2550"
      ]
      seed-node-timeout = 15s
      roles = ["member-1"]
    }
    persistence {
      journal-plugin-fallback {
        circuit-breaker {
          max-failures = 10
          call-timeout = 90s
          reset-timeout = 30s
        }
        recovery-event-timeout = 90s
      }
      snapshot-store-plugin-fallback {
        circuit-breaker {
          max-failures = 10
          call-timeout = 90s
          reset-timeout = 30s
        }
        recovery-event-timeout = 90s
      }
    }
  }
}
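
Per goal #4 above, the only values that change on the other members are "hostname" and "roles". A sketch of the differing lines for member 2, reusing the example FQDNs from the seed-nodes list (everything else stays byte-for-byte identical):

    remote {
      netty.tcp {
        hostname = "odl2.region.customer.com"   # this member's own FQDN
        port = 2550
      }
      ...
    }
    cluster {
      ...
      roles = ["member-2"]                      # this member's unique role
    }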



Goals for org.opendaylight.controller.cluster.datastore.cfg:

  1. This is a simple key=value (properties-style) config file, so a subsequent entry for the same key replaces the earlier one.

  2. The goal here is to significantly reduce the race condition that occurs when starting all members of a cluster at once, and the race condition that any freshly restarted or "cleaned" member faces when rejoining.



### Note: Some sites use batch-size of 1; not reflecting that here ###
persistent-actor-restart-min-backoff-in-seconds=10
persistent-actor-restart-max-backoff-in-seconds=40
persistent-actor-restart-reset-backoff-in-seconds=20
shard-transaction-commit-timeout-in-seconds=120
shard-isolated-leader-check-interval-in-millis=30000
operation-timeout-in-seconds=120
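
As a quick illustration of the override behavior from item 1 above, appending a tuned value at the end of the file is enough; the later entry wins. (The 5-second value below is just a stand-in for whatever an earlier or stock entry might say.)

    # Earlier/stock entry:
    operation-timeout-in-seconds=5
    # Appended later in the same file -- this one takes effect:
    operation-timeout-in-seconds=120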



Goals for module-shards.conf:

  1. Name which members retain copies of which data shards.

  2. These shard name fields are the "friendly" names assigned to the explicit namespaces in modules.conf.

  3. In a K8S/Swarm environment, it's easiest to keep this identical on all members.  Unique shard replication (or isolation) strategies are for another document/discussion, and require non-trivial planning.



module-shards = [
  {
    name = "default"
    shards = [
      {
        name = "default"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  },
  {
    name = "topology"
    shards = [
      {
        name = "topology"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  },
  {
    name = "inventory"
    shards = [
      {
        name = "inventory"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  }
]

Not every module needs an entry here; for example, it would be legitimate to have a single simple entry that ONLY includes "default", if desired. There would then be only default-config and default-operational, plus some of the auto-created shards. A sketch of that minimal case follows.
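
A minimal module-shards.conf of that kind, reusing the same three member roles from the examples above, might look like:

module-shards = [
  {
    name = "default"
    shards = [
      {
        name = "default"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  }
]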