Scalable data services

With Discovery’s scalable data services, you don’t need to limit a data service to a static list of providing peers. Instead, you can define a data service’s providers using a regular expression, which allows the data service to recognise new instances of providing peers added at runtime.

Requirements

To take advantage of scalable data services, you need to have already implemented:

  • Discovery licensing

  • Peer discovery

Overview

In a traditional data service, providing peers are defined as a static list of labels (remote-label). Any peer added at runtime that is not in the static list is not recognised by the data service as a provider.

Example 1. Defining providing peers by static labels

Consider the data service below:

add-data-service
    …
    add-source-group
        add-priority
            remote-label trading-adapter1
            remote-label trading-adapter2
            remote-label trading-adapter3
        end-priority
    end-source-group
end-data-service

This data service recognises providing peers defined at configuration time: trading-adapter1, trading-adapter2, and trading-adapter3. At runtime, if a new instance of the trading adapter (trading-adapter4) is added by Discovery’s peer discovery, then the data service will not recognise it as a provider.

In a scalable data service, providing peers are defined as a label pattern (remote-label-regex). The pattern (a regular expression) is evaluated at runtime for each subscription request, and any peer added at runtime with a label that matches the pattern is recognised by the data service as a provider.

Example 2. Defining providing peers by label patterns (regular expressions)

Consider the data service below:

add-data-service
    …
    add-source-group
        add-priority
            remote-label-regex ^trading-adapter[0-9]+
        end-priority
    end-source-group
end-data-service

This data service recognises providing peers with labels that match the pattern ^trading-adapter[0-9]+ (a label that starts with trading-adapter followed by one or more digits). At runtime, if a new instance of the trading adapter (trading-adapter4) is added by Discovery’s peer discovery, then the data service will recognise it as a provider.

Common configurations

In a data service’s configuration, providing peers are divided first by source group and then by priority (primary, secondary, …):

[Diagram: Data service → Source group → Priority → Providing peer. Requests are routed to each source group. Within a source group, requests are routed to the highest priority that has at least one available peer; priorities are defined in descending order of priority, primary priority first. Within a priority, requests are load-balanced across peers.]

The most common arrangement of providing peers for a scalable data service is the load-balanced arrangement, but other arrangements are possible. This section describes three common arrangements of providing peers.

Load-balanced arrangement

Providing peers are arranged in a single-priority source group.

In the Liberator configuration below, requests for subjects beginning with /MyNamespace/ are distributed to the providing peer serving the fewest subscriptions to the data service.

This configuration is commonly used for deployments hosted on a container-orchestration platform, such as Kubernetes.

[Diagram: Liberator’s data service MyDataService load-balances requests across providing peers Adapter1, Adapter2, and Adapter3.]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label Adapter1
            remote-label Adapter2
            remote-label Adapter3
        end-priority
    end-source-group
end-data-service
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label-regex ^Adapter[0-9]+$
        end-priority
    end-source-group
end-data-service

Failover arrangement

Providing peers are arranged in a multi-priority source group.

In the configuration below, requests for subjects beginning with /MyNamespace/ are load-balanced across the primary priority adapters. If all the primary priority adapters fail, then Liberator moves existing requests and routes new requests to the secondary priority adapters.

This configuration is common in deployments that have two software stacks ('legs'), with each stack hosted on separate hardware for resilience. This arrangement is less appropriate for container-orchestration platforms, such as Kubernetes, which run on clusters of worker nodes and are more resilient by design.

[Diagram: Liberator’s data service MyDataService routes requests to a primary priority of Adapter101, Adapter102, and Adapter103, and a secondary priority of Adapter201, Adapter202, and Adapter203.]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority (1)
            remote-label Adapter101
            remote-label Adapter102
            remote-label Adapter103
        end-priority
        add-priority (2)
            remote-label Adapter201
            remote-label Adapter202
            remote-label Adapter203
        end-priority
    end-source-group
end-data-service
1 Primary priority
2 Secondary priority
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority (1)
            remote-label-regex ^Adapter1[0-9]+$
        end-priority
        add-priority (2)
            remote-label-regex ^Adapter2[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Primary priority
2 Secondary priority

Parallel arrangement

Providing peers are arranged in multiple source groups.

In the configuration below, requests for subjects beginning with /MyNamespace/ are routed both to a load-balanced set of AdapterA instances and to a load-balanced set of AdapterB instances. Liberator combines the data received from both requests.

[Diagram: Liberator’s data service MyDataService routes requests to Source Group 1 (AdapterA1, AdapterA2, AdapterA3) and Source Group 2 (AdapterB1, AdapterB2, AdapterB3).]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label AdapterA1
            remote-label AdapterA2
            remote-label AdapterA3
        end-priority
    end-source-group
    add-source-group
        add-priority
            remote-label AdapterB1
            remote-label AdapterB2
            remote-label AdapterB3
        end-priority
    end-source-group
end-data-service
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label-regex ^AdapterA[0-9]+$
        end-priority
    end-source-group
    add-source-group
        add-priority
            remote-label-regex ^AdapterB[0-9]+$
        end-priority
    end-source-group
end-data-service

Source affinity

Source affinity is supported by scalable data services. When all of a data service’s peers are defined by remote-label-regex, affinity key values are cached centrally with Discovery. The central cache guarantees that all components use the same affinity key values when routing requests to providing peers.

Once a providing peer has been cached under an affinity key, Discovery retains the key until the peer disconnects from Discovery.

Example 3. Scalable data service with source affinity

A data service for a trade channel, /PRIVATE/TRADE, routes requests to a single-priority source group of trading adapter instances. The source group defines an affinity key with the format trading-adapter-session_id.

object-map /PRIVATE/TRADE /PRIVATE/%U/TRADE (1)

add-data-service
    service-name trade-channel
    include-pattern ^/PRIVATE/[^/]+/TRADE
    add-source-group
        affinity trading-adapter ^/PRIVATE/([^/]+)/TRADE (2)
        add-priority
            remote-label-regex ^trading-adapter[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Trade channel subject mapped to /PRIVATE/session_id/TRADE
2 Affinity key: trading-adapter-session_id

Service rebalancing

Scalable data services support rebalancing of workloads within source groups that do not have source affinity enabled (see add-source-group configuration option affinity).

Service rebalancing is disabled for all data services by default. To enable service rebalancing for all data services, set the configuration item service-rebalance-enable. To enable service rebalancing for a specific data service, set the data service’s configuration option rebalance-enable.
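
For illustration, the sketch below shows both approaches together. The item and option names (service-rebalance-enable, rebalance-enable) are those named above; the boolean TRUE value syntax and the placement of rebalance-enable inside the data service definition are assumptions to verify against your Liberator/DataSource configuration reference.

service-rebalance-enable TRUE (1)

add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    rebalance-enable TRUE (2)
    add-source-group
        add-priority
            remote-label-regex ^Adapter[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Enables service rebalancing for all data services
2 Enables service rebalancing for this data service only; in practice you would set one of the two, not both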

In source groups without source affinity, rebalancing occurs when a connected peer that matches a priority’s remote-label-regex changes its status to UP. The source group discards all existing subscriptions and re-requests them, balancing the requests evenly over peers in the source group’s highest available priority band.
