2019-05-13: Original post
jhartley@luminanetworks.com – for any questions
Goals for akka.conf:
- Specify THIS cluster member's resolvable FQDN or IP address. (Tip: Use FQDNs, and ensure they're resolvable in your env.)
- List every cluster member in the seed-nodes list.
- Tune optional variables, noting that the defaults for many of these are far too low.
- Keep this file nearly identical on all instances; only the "roles" and "hostname" values are unique to this member (a per-member delta is sketched just after the example below).
Example of a 3-node configuration, tuned:
odl-cluster-data {
  akka {
    loglevel = ""
    remote {
      netty.tcp {
        hostname = "odl1.region.customer.com"
        port = 2550
      }
      use-passive-connections = off
    }
    actor {
      debug {
        autoreceive = on
        lifecycle = on
        unhandled = on
        fsm = on
        event-stream = on
      }
    }
    cluster {
      seed-nodes = [
        "akka.tcp://opendaylight-cluster-data@odl1.region.customer.com:2550",
        "akka.tcp://opendaylight-cluster-data@odl2.region.customer.com:2550",
        "akka.tcp://opendaylight-cluster-data@odl3.region.customer.com:2550"
      ]
      seed-node-timeout = 15s
      roles = ["member-1"]
    }
    persistence {
      journal-plugin-fallback {
        circuit-breaker {
          max-failures = 10
          call-timeout = 90s
          reset-timeout = 30s
        }
        recovery-event-timeout = 90s
      }
      snapshot-store-plugin-fallback {
        circuit-breaker {
          max-failures = 10
          call-timeout = 90s
          reset-timeout = 30s
        }
        recovery-event-timeout = 90s
      }
    }
  }
}
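To show how little changes per member, here is a minimal sketch of the delta for the second node (using the odl2.region.customer.com / member-2 names already listed above; everything not shown stays byte-for-byte identical to member-1's file):

# member-2's akka.conf differs from member-1's only in these two settings
odl-cluster-data {
  akka {
    remote {
      netty.tcp {
        # this member's own resolvable FQDN
        hostname = "odl2.region.customer.com"
      }
    }
    cluster {
      # this member's unique role name
      roles = ["member-2"]
    }
  }
}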
Goals for org.opendaylight.controller.cluster.datastore.cfg:
- Unlike akka.conf, this is a plain key=value properties file; if the same key appears more than once, the later entry takes effect.
- The goal here is to significantly reduce the race condition that occurs when starting all members of a cluster, and the race condition any freshly restarted or "cleaned" member hits when rejoining.
### Note: Some sites use batch-size of 1, not reflecting that here ###
persistent-actor-restart-min-backoff-in-seconds=10
persistent-actor-restart-max-backoff-in-seconds=40
persistent-actor-restart-reset-backoff-in-seconds=20
shard-transaction-commit-timeout-in-seconds=120
shard-isolated-leader-check-interval-in-millis=30000
operation-timeout-in-seconds=120
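As a hedged illustration of the "later entry wins" behavior described above (reusing one key from the block above; in practice you would set each key only once):

# earlier entry, superseded by the later one
operation-timeout-in-seconds=5
# later entry for the same key is the one that takes effect
operation-timeout-in-seconds=120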
Goals for module-shards.conf:
- Name which members retain copies of which data shards.
- These shard name fields are the "friendly" names assigned to the explicit namespaces in modules.conf.
- In a K8S/Swarm environment, it's easiest to keep this identical on all members. Unique shard replication (or isolation) strategies are for another document/discussion, and require non-trivial planning.
module-shards = [
  {
    name = "default"
    shards = [
      {
        name = "default"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  },
  {
    name = "topology"
    shards = [
      {
        name = "topology"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  },
  {
    name = "inventory"
    shards = [
      {
        name = "inventory"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  }
]
Thus, for example, it would be legitimate to have a single entry that includes ONLY "default", if desired. In that case the only module shards would be default-config and default-operational, plus some of the auto-created shards.
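A minimal sketch of that "default only" variant (same member names as above; whether it is appropriate depends on which models your applications actually use):

module-shards = [
  {
    name = "default"
    shards = [
      {
        name = "default"
        replicas = [
          "member-1",
          "member-2",
          "member-3"
        ]
      }
    ]
  }
]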