
Hazelcast is a distributed in-memory data grid (IMDG) platform for processing and storing data efficiently and reliably.

Defaults

  • Distributes data partitions (both primary and backup replicas) across all cluster members

    • Backup counts are ignored in a single-instance setup

    • The default number of partitions is 271

  • Automatically creates backups at the partition level, distributed across all cluster members (the backup count is configurable)

  • Each member has (core count * 20) threads that handle requests

  • Replicas of each partition are created

    • One of these replicas is the primary and the others are backups

    • Partition table information is sent and updated across the members every 15 seconds by default; this interval can be configured with the following property:

      • hazelcast.partition.table.send.interval

  • For maps and queues

    • one synchronous backup, zero asynchronous backups

    • BINARY in-memory format
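
These defaults can be made explicit with Hazelcast's programmatic configuration. The following is a minimal sketch: the values shown are the documented defaults (so setting them is redundant in practice), and "default" is Hazelcast's wildcard map configuration name.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.InMemoryFormat;
    import com.hazelcast.config.MapConfig;

    public class DefaultsSketch {
        public static void main(String[] args) {
            Config config = new Config();

            // Partition table publish interval: 15 seconds by default
            config.setProperty("hazelcast.partition.table.send.interval", "15");

            // 271 partitions by default
            config.setProperty("hazelcast.partition.count", "271");

            // Map defaults: one synchronous backup, zero asynchronous backups,
            // BINARY in-memory format
            config.addMapConfig(new MapConfig("default")
                    .setBackupCount(1)
                    .setAsyncBackupCount(0)
                    .setInMemoryFormat(InMemoryFormat.BINARY));
        }
    }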

Hazelcast Topologies

There are two available Hazelcast topologies: Embedded and Client/Server.

CPS/NCMP uses the Embedded topology, which will be the main focus for the rest of this documentation.

  • Scaling the application instances in embedded mode also scales the Hazelcast cluster

  • The application instance and Hazelcast share the same JVM and resources.
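
In embedded mode, each application instance starts a Hazelcast member inside its own JVM. A minimal sketch (the cluster name is illustrative):

    import com.hazelcast.config.Config;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    public class EmbeddedMemberSketch {
        public static void main(String[] args) {
            Config config = new Config();
            config.setClusterName("cps-ncmp-cluster"); // illustrative name

            // The member runs in this JVM and shares its heap and CPU
            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            System.out.println("Cluster members: " + hz.getCluster().getMembers());
        }
    }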

Hazelcast as an AP or CP system

Hazelcast as an AP system provides Availability and Partition Tolerance

  • Data structures under the HazelcastInstance API are all AP data structures

    • CPS/NCMP uses AP data structures

  • No master node in an AP cluster

  • When a member fails, the data in its partitions is redistributed to the other members

Hazelcast as a CP system provides Consistency and Partition Tolerance

  • Data structures under the HazelcastInstance.getCPSubsystem() API are CP data structures
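
The AP/CP split is visible directly in the API: AP structures are obtained from HazelcastInstance itself, while CP structures go through getCPSubsystem(). A minimal sketch (structure names are illustrative):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.cp.IAtomicLong;
    import com.hazelcast.map.IMap;

    public class ApVersusCpSketch {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // AP data structure, directly on HazelcastInstance
            IMap<String, String> apMap = hz.getMap("an-ap-map");

            // CP data structure, only reachable through the CP Subsystem
            IAtomicLong cpCounter = hz.getCPSubsystem().getAtomicLong("a-cp-counter");
        }
    }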

CP Subsystem

  • Currently, CP Subsystem contains only the implementations of Hazelcast's concurrency APIs.

  • Provides mechanisms for leader election, distributed locking, synchronization, and metadata management.

  • Prioritizes consistency, which means it may halt operations on some members

  • CPMap is only available in the Enterprise edition
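
As an example of the distributed locking mentioned above, a FencedLock can be obtained from the CP Subsystem. A minimal sketch (the lock name is illustrative); note that without at least 3 dedicated CP members the CP Subsystem runs in "unsafe mode" and does not give the full consistency guarantees:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.cp.lock.FencedLock;

    public class DistributedLockSketch {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            FencedLock lock = hz.getCPSubsystem().getLock("a-work-lock");
            lock.lock();
            try {
                // Only one thread across the whole cluster executes this at a time
                System.out.println("Doing exclusive work");
            } finally {
                lock.unlock();
            }
        }
    }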

Hazelcast Distributed Data Structures

Standard Collections (AP)

  • Maps (see the sketch after this list)

    • synchronous backup

      • a map update waits until the change is applied on the primary and all synchronous backups before the operation returns

    • asynchronous backup

      • a map update returns once the primary is updated; backups are applied in the background

  • Queues

  • Lists

  • Sets

  • Ringbuffers
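
A sketch of obtaining the AP collections listed above, including a map configured with one synchronous and one asynchronous backup (names and counts are illustrative):

    import com.hazelcast.collection.IList;
    import com.hazelcast.collection.IQueue;
    import com.hazelcast.collection.ISet;
    import com.hazelcast.config.Config;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;
    import com.hazelcast.ringbuffer.Ringbuffer;

    public class ApCollectionsSketch {
        public static void main(String[] args) {
            Config config = new Config();
            // Writes wait for the sync backup; the async backup is applied in the background
            config.addMapConfig(new MapConfig("a-backed-up-map")
                    .setBackupCount(1)
                    .setAsyncBackupCount(1));
            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

            IMap<String, String> map = hz.getMap("a-backed-up-map");
            IQueue<String> queue = hz.getQueue("a-queue");
            IList<String> list = hz.getList("a-list");
            ISet<String> set = hz.getSet("a-set");
            Ringbuffer<String> ringbuffer = hz.getRingbuffer("a-ringbuffer");
        }
    }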

Concurrency Utilities

  • AtomicLongs

  • AtomicReferences

  • Semaphores

  • CountDownLatches
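
In recent Hazelcast versions (4.x/5.x) these concurrency utilities are obtained through the CP Subsystem. A minimal sketch with a semaphore and a countdown latch (names and counts are illustrative):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.cp.ICountDownLatch;
    import com.hazelcast.cp.ISemaphore;

    public class ConcurrencyUtilitiesSketch {
        public static void main(String[] args) throws InterruptedException {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // Cluster-wide semaphore capping concurrent work at 3 permits
            ISemaphore semaphore = hz.getCPSubsystem().getSemaphore("worker-permits");
            semaphore.init(3);
            semaphore.acquire();
            try {
                // limited-concurrency work here
            } finally {
                semaphore.release();
            }

            // Cluster-wide countdown latch
            ICountDownLatch latch = hz.getCPSubsystem().getCountDownLatch("startup-latch");
            latch.trySetCount(1);
            latch.countDown();
        }
    }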

Publish/Subscribe Messaging

CP Subsystem

Hazelcast in CPS/NCMP

Multiple instances of Hazelcast are created in the application.

Each instance is a member and/or a client in a Hazelcast cluster.

Multiple instances

  • Added complexity

  • Resource overhead

    • Each Hazelcast instance needs its own memory, CPU, and network resources

    • Each instance has its own memory allocation within the JVM heap

  • Increased network traffic

    • synchronization

    • processing

  • Data inconsistency is a risk

Cluster Sizing 

  • Cluster size depends on multiple application-specific factors

  • Adding more members to the cluster

    • Increases capacity for CPU-bound computation

    • Using the Jet stream processing engine helps take advantage of this

    • Also increases the amount of data that needs to be replicated

    • Can improve fault tolerance or performance (if the recommended minimum of one member per application instance is not enough)

  • Fault tolerance recommendation

    • At least 3 members, or n+1 (one more than the minimum required)

See Hazelcast's Cluster Sizing documentation.

Tips

  • One instance of Hazelcast per application instance

    • though for fault tolerance, we might need at least 3 (open question)

  • Use map.set() on maps instead of map.put() if you don’t need the old value (see the map-update sketch after this list).

    • This eliminates unnecessary deserialization of the old value.

    • map.putIfAbsent() - inserts only if the key doesn't exist

    • map.set() - inserts or updates unconditionally

  • Consider using an EntryProcessor to update map entries in place instead of get-then-set round trips (also shown in the map-update sketch after this list)

  • Changing to OBJECT in-memory format for specific maps (see the configuration sketch after this list)

    • applies entry processors directly on the stored object

    • useful for queries

    • no serialization/deserialization is performed when entries are processed

  • Enabling backup reads (see the configuration sketch after this list)

    • Allows a member to read its own backup copy instead of calling the owner of the partition (the primary copy)

    • Can reduce latency and improve performance

    • Requires backups to be synchronous

  • Back pressure mechanism in Hazelcast (see the configuration sketch after this list)

    • acts like a traffic controller

    • helps prevent system overload

    • helps prevent crashes due to excessive operation invocations

    • Limits the number of concurrent operations and periodically syncs asynchronous backups

    • Needs to be configured, as it is disabled by default

      • configure it to be enabled

      • configure the maximum concurrent invocations per partition

  • Controlling/Configuring partitions (see the partitioning sketch after this list)

    • Putting related data structures in one partition

      • reduces network traffic

      • improves data access performance

      • simplifies data management

    • Using custom partition strategies based on requirements (PartitioningStrategy)

  • Reading Map Metrics
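
A map-update sketch for the tips above: set() versus put(), putIfAbsent(), and an EntryProcessor that increments a value in place on the partition owner (the map name and value type are illustrative):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.EntryProcessor;
    import com.hazelcast.map.IMap;

    public class MapUpdateSketch {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<String, Long> map = hz.getMap("a-counts-map");

            map.set("key", 1L);                 // insert/update, no old value returned
            Long previous = map.put("key", 2L); // returns (and deserializes) the old value
            map.putIfAbsent("other-key", 0L);   // inserts only if the key is absent

            // EntryProcessor: the update runs on the member owning the key,
            // replacing a get-then-set round trip
            map.executeOnKey("key", (EntryProcessor<String, Long, Long>) entry -> {
                long newValue = entry.getValue() + 1;
                entry.setValue(newValue);
                return newValue;
            });
        }
    }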
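
A combined configuration sketch for the OBJECT in-memory format, backup-read, and back pressure tips above. The map name is illustrative, and the property values are examples rather than recommendations:

    import com.hazelcast.config.Config;
    import com.hazelcast.config.InMemoryFormat;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    public class TuningSketch {
        public static void main(String[] args) {
            Config config = new Config();

            config.addMapConfig(new MapConfig("a-query-heavy-map")
                    // entries stay deserialized, so entry processors and
                    // queries work on the object directly
                    .setInMemoryFormat(InMemoryFormat.OBJECT)
                    // let a member read its own backup instead of calling
                    // the partition owner; keep the backups synchronous
                    .setReadBackupData(true)
                    .setBackupCount(1)
                    .setAsyncBackupCount(0));

            // Back pressure is disabled by default
            config.setProperty("hazelcast.backpressure.enabled", "true");
            config.setProperty(
                    "hazelcast.backpressure.max.concurrent.invocations.per.partition", "100");

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        }
    }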
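
A partitioning sketch: a PartitionAware key routes all entries sharing a partition key to the same partition, co-locating related data on one member. The key class and field names are illustrative; a custom PartitioningStrategy on the map configuration is the other option mentioned above:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;
    import com.hazelcast.partition.PartitionAware;
    import java.io.Serializable;

    public class PartitioningSketch {

        // A real key class should also implement equals() and hashCode()
        static class GroupedKey implements PartitionAware<String>, Serializable {
            final String id;
            final String groupId;

            GroupedKey(String id, String groupId) {
                this.id = id;
                this.groupId = groupId;
            }

            @Override
            public String getPartitionKey() {
                return groupId; // partition by group, not by individual id
            }
        }

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<GroupedKey, String> map = hz.getMap("a-grouped-map");

            // Both entries land in the same partition
            map.set(new GroupedKey("id-1", "group-A"), "value-1");
            map.set(new GroupedKey("id-2", "group-A"), "value-2");
        }
    }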
