Scalable data services

With Discovery’s scalable data services, you don’t need to limit a data service to a static list of providing peers. Instead, you can define a data service’s providers using a regular expression, which allows the data service to recognise new instances of providing peers added at runtime.

Requirements

To take advantage of scalable data services, you need to have already implemented:

  • Discovery licensing

  • Peer discovery

Overview

In a traditional data service, providing peers are defined as a static list of labels (remote-label). Any peer added at runtime that is not in the static list is not recognised by the data service as a provider.

Example 1. Defining providing peers by static labels

Consider the data service below:

add-data-service
    …
    add-source-group
        add-priority
            remote-label trading-adapter1
            remote-label trading-adapter2
            remote-label trading-adapter3
        end-priority
    end-source-group
end-data-service

This data service recognises providing peers defined at configuration time: trading-adapter1, trading-adapter2, and trading-adapter3. At runtime, if a new instance of the trading adapter (trading-adapter4) is added by Discovery’s peer discovery, then the data service will not recognise it as a provider.

In a scalable data service, providing peers are defined as a label pattern (remote-label-regex). The pattern (a regular expression) is evaluated at runtime for each subscription request, and any peer added at runtime with a label that matches the pattern is recognised by the data service as a provider.

Example 2. Defining providing peers by label patterns (regular expressions)

Consider the data service below:

add-data-service
    …
    add-source-group
        add-priority
            remote-label-regex ^trading-adapter[0-9]+
        end-priority
    end-source-group
end-data-service

This data service recognises providing peers with labels that match the pattern ^trading-adapter[0-9]+ (a label that starts with trading-adapter followed by one or more digits). At runtime, if a new instance of the trading adapter (trading-adapter4) is added by Discovery’s peer discovery, then the data service will recognise it as a provider.

Common configurations

In a data service’s configuration, providing peers are divided first by source group and then by priority (primary, secondary, …):

[Diagram: Data service → Source group → Priority → Providing peer. Requests are routed to each source group. Within a source group, requests are routed to the highest priority that has at least one available peer; priorities are defined in descending order of priority, primary priority first. Within a priority, requests are load-balanced across peers.]

The most common arrangement of providing peers for a scalable data service is the load-balanced arrangement, but other arrangements are possible. This section describes three common arrangements of providing peers.

Load-balanced arrangement

Providing peers are arranged in a single-priority source group.

In the Liberator configuration below, requests for subjects beginning with /MyNamespace/ are distributed to the providing peer serving the fewest subscriptions to the data service.

This configuration is commonly used for deployments hosted on a container-orchestration platform, such as Kubernetes.

[Diagram: Liberator’s data service MyDataService load-balances requests across providing peers Adapter1, Adapter2, and Adapter3.]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label Adapter1
            remote-label Adapter2
            remote-label Adapter3
        end-priority
    end-source-group
end-data-service
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label-regex ^Adapter[0-9]+$
        end-priority
    end-source-group
end-data-service

Failover arrangement

Providing peers are arranged in a multi-priority source group.

In the configuration below, requests for subjects beginning with /MyNamespace/ are load-balanced across the primary priority adapters. If all the primary priority adapters fail, then Liberator moves existing requests and routes new requests to the secondary priority adapters.

This configuration is common in deployments that have two software stacks ('legs'), with each stack hosted on separate hardware for resilience. This arrangement is less appropriate for container-orchestration platforms, such as Kubernetes, which run on clusters of worker nodes and are more resilient by design.

[Diagram: Liberator’s data service MyDataService routes requests to a primary priority of Adapter101, Adapter102, and Adapter103, and a secondary priority of Adapter201, Adapter202, and Adapter203.]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority (1)
            remote-label Adapter101
            remote-label Adapter102
            remote-label Adapter103
        end-priority
        add-priority (2)
            remote-label Adapter201
            remote-label Adapter202
            remote-label Adapter203
        end-priority
    end-source-group
end-data-service
1 Primary priority
2 Secondary priority
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority (1)
            remote-label-regex ^Adapter1[0-9]+$
        end-priority
        add-priority (2)
            remote-label-regex ^Adapter2[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Primary priority
2 Secondary priority

Parallel arrangement

Providing peers are arranged in multiple source groups.

In the configuration below, requests for subjects beginning with /MyNamespace/ are routed both to a load-balanced set of AdapterA instances and to a load-balanced set of AdapterB instances. Liberator combines the data received from both requests.

[Diagram: Liberator’s data service MyDataService routes requests to Source Group 1 (AdapterA1, AdapterA2, AdapterA3) and Source Group 2 (AdapterB1, AdapterB2, AdapterB3).]
Traditional configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label AdapterA1
            remote-label AdapterA2
            remote-label AdapterA3
        end-priority
    end-source-group
    add-source-group
        add-priority
            remote-label AdapterB1
            remote-label AdapterB2
            remote-label AdapterB3
        end-priority
    end-source-group
end-data-service
Scalable configuration
add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    add-source-group
        add-priority
            remote-label-regex ^AdapterA[0-9]+$
        end-priority
    end-source-group
    add-source-group
        add-priority
            remote-label-regex ^AdapterB[0-9]+$
        end-priority
    end-source-group
end-data-service

Source affinity

Source affinity is supported by scalable data services. When all of a data service’s peers are defined by remote-label-regex, affinity key values are cached centrally with Discovery. The central cache guarantees that all components use the same affinity key values when routing requests to providing peers.

Once a providing peer has been cached under an affinity key, Discovery retains the key until the peer disconnects from Discovery.

Example 3. Scalable data service with source affinity

A data service for a trade channel, /PRIVATE/TRADE, routes requests to a single-priority source group of trading adapter instances. The source group defines an affinity key with the format trading-adapter-session_id.

object-map /PRIVATE/TRADE /PRIVATE/%U/TRADE (1)

add-data-service
    service-name trade-channel
    include-pattern ^/PRIVATE/[^/]+/TRADE
    add-source-group
        affinity trading-adapter ^/PRIVATE/([^/]+)/TRADE (2)
        add-priority
            remote-label-regex ^trading-adapter[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Trade channel subject mapped to /PRIVATE/session_id/TRADE
2 Affinity key: trading-adapter-session_id

Service rebalancing

Scalable data services support rebalancing of workloads within source groups that do not have source affinity enabled (see add-source-group configuration option affinity).

Service rebalancing is disabled for all data services by default. To enable service rebalancing for all data services, set the configuration item service-rebalance-enable. To enable service rebalancing for a specific data service, set the data service’s configuration option rebalance-enable.
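
For illustration, the sketch below shows both approaches together. The item and option names (service-rebalance-enable, rebalance-enable) are those named above; the boolean TRUE value syntax and the placement of rebalance-enable inside the data service definition are assumptions to verify against your Liberator/DataSource configuration reference.

service-rebalance-enable TRUE (1)

add-data-service
    service-name MyDataService
    include-pattern ^/MyNamespace/.+
    rebalance-enable TRUE (2)
    add-source-group
        add-priority
            remote-label-regex ^Adapter[0-9]+$
        end-priority
    end-source-group
end-data-service
1 Enables service rebalancing for all data services
2 Enables service rebalancing for this data service only; in practice you would set one of the two, not both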

In source groups without source affinity, rebalancing occurs when a connected peer that matches a priority’s remote-label-regex changes its status to UP. The source group discards all existing subscriptions and re-requests them, balancing the requests evenly over peers in the source group’s highest available priority band.
