ADS Client Load Balancing

This article introduces the client load balancing feature available in ADS. It also demonstrates how to configure the feature and test it with the RFA examples and the WebSocket API.

Overview

ADS provides a basic runtime load balancing feature for the RSSL consumer application. The RSSL consumer application connects to an ADS load balancer to get a list of all available servers with their loads. Then, the client can disconnect the current connection and reconnect to the selected server with the least load. After that, all subsequent request messages are sent to the selected server.

For example, the consumer in the figure below connects to ADS_A and gets the load information of all active ADSs. The consumer selects ADS_C, which is the least loaded ADS based on the configured load factor.

[Figure: ADS Client Load Balancing]

The ADS supports three options for calculating the load factor:

  • The number of current connections to the ADS
  • The total size of updates and images handled during the heartbeat message interval
  • The average update rates per second collected over a configurable interval

The option can be selected via the TREP configuration file.

Configurations

The following ADS configuration parameters are used for client load balancing:

Configuration Name | Description | Default Value
enableConnectConfig | Indicates whether the feature should be enabled | False
connectConfig*<userName>*active | List of active servers for this user | None
disableLocalRedirect | Determines whether the server sends heartbeat messages, thus including itself in the list of available active servers. Include the server in the list of discovered servers by setting this parameter to False. | False
discoveryNetwork | The UDP port and multicast address to use when sending/receiving heartbeat messages. The IP addresses of available servers which the Load Balancer ADS sends to client applications are those the ADS discovers from the discovery network. For this reason, client applications must be able to reach the discovery network. The multicast address must be specified, the network interface and port number are optional | None
discoveryMulticastTTL | Specifies the hop limit (across routing devices) for discovery multicast datagrams | 1
discoveryHeartbeat | Sets the heartbeat message timeout interval in seconds | 10
logDiscoveryEvents | Sets whether the server logs various events related to the discovery process | False
loadBalanceAlgorithm | Sets how to calculate the load factor: 1. Calculation based on mounts/current connections 2. Calculation based on bandwidth 3. Calculation based on average update rate | 1
loadFactorCalcInterval | Specify time interval in seconds to calculate average update rate used with loadBalanceAlgorithm: 3 | 10
scalingFactor | This parameter is used for different hardware configuration of running ADS. The parameter has to be greater than zero. The more powerful CPU, the lower number | 1
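
The constraints stated in the table can be checked before an ADS is restarted. The following is a minimal sketch and not part of ADS or its tooling: the function name and the dictionary layout are assumptions made for illustration, and it encodes only the rules given above (algorithm values 1 to 3, scalingFactor greater than zero, loadFactorCalcInterval used with algorithm 3).

```python
def check_load_balancing_params(params):
    """Return a list of problems found in a {parameter name: value} dict.

    Hypothetical helper: it only encodes rules stated in the table above.
    """
    problems = []

    # loadBalanceAlgorithm accepts 1, 2 or 3 and defaults to 1
    algorithm = params.get("loadBalanceAlgorithm", 1)
    if algorithm not in (1, 2, 3):
        problems.append("loadBalanceAlgorithm must be 1, 2 or 3")

    # scalingFactor has to be greater than zero and defaults to 1
    if params.get("scalingFactor", 1) <= 0:
        problems.append("scalingFactor must be greater than zero")

    # loadFactorCalcInterval is only used with loadBalanceAlgorithm 3
    if "loadFactorCalcInterval" in params and algorithm != 3:
        problems.append("loadFactorCalcInterval is only used with loadBalanceAlgorithm 3")

    return problems


# The settings used later in this article pass the checks
print(check_load_balancing_params({"loadBalanceAlgorithm": 3, "loadFactorCalcInterval": 10}))
```

An empty list means that none of the encoded rules were violated; it does not guarantee that the ADS accepts the configuration.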

Sample Configurations

To test this feature, we will set up four ADSs: ADS_A, ADS_B, ADS_C, and ADS_D.

ADS_A is a dedicated load balancer that is only used to balance the connections. It will not service any request. Therefore, its disableLocalRedirect configuration is set to True. The discovery network uses multicast address 224.5.5.5 and UDP port 7100.

The load factor is based on an average update rate.

[Figure: ADS Client Load Balancing]

You can enable client load balancing by adding the following configurations to the TREP configuration file (rmds.cnf):

The configurations for ADS_A

ADS_A*ads*enableConnectConfig : True
ADS_A*ads*discoveryNetwork : discoverynet;224.5.5.5|7100
ADS_A*ads*connectConfig*.active : ADS_B|14002,ADS_C|14002,ADS_D|14002
ADS_A*ads*disableLocalRedirect: True
ADS_A*ads*logDiscoveryEvents: False
ADS_A*ads*loadBalanceAlgorithm: 3

The configurations for ADS_B

ADS_B*ads*enableConnectConfig : True
ADS_B*ads*discoveryNetwork : discoverynet;224.5.5.5|7100
ADS_B*ads*connectConfig*.active : ADS_B|14002,ADS_C|14002,ADS_D|14002
ADS_B*ads*disableLocalRedirect: False
ADS_B*ads*logDiscoveryEvents: False
ADS_B*ads*loadBalanceAlgorithm: 3

The configurations for ADS_C

ADS_C*ads*enableConnectConfig : True
ADS_C*ads*discoveryNetwork : discoverynet;224.5.5.5|7100
ADS_C*ads*connectConfig*.active : ADS_B|14002,ADS_C|14002,ADS_D|14002
ADS_C*ads*disableLocalRedirect: False
ADS_C*ads*logDiscoveryEvents: False
ADS_C*ads*loadBalanceAlgorithm: 3

The configurations for ADS_D

ADS_D*ads*enableConnectConfig : True
ADS_D*ads*discoveryNetwork : discoverynet;224.5.5.5|7100
ADS_D*ads*connectConfig*.active : ADS_B|14002,ADS_C|14002,ADS_D|14002
ADS_D*ads*disableLocalRedirect: False
ADS_D*ads*logDiscoveryEvents: False
ADS_D*ads*loadBalanceAlgorithm: 3

discoverynet is the network used by the ADSs for sending and receiving heartbeats. It is defined in the /etc/networks file. For example, the discovery network for this test environment is 192.168.27.0.

discoverynet 192.168.27.0

The ADSs must be restarted for the configuration changes to take effect.
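
The four listings above differ only in the instance name and in disableLocalRedirect. As a rough illustration, the sketch below generates them from one template. The helper name and its arguments are hypothetical, and it assumes that whitespace around the colon is not significant, as the mixed spacing in the listings suggests.

```python
def ads_config_lines(instance, active_servers, redirect_only=False,
                     network="discoverynet;224.5.5.5|7100", algorithm=3):
    """Build the client load balancing lines for one ADS instance.

    Hypothetical helper; parameter names and values are those listed above.
    """
    prefix = "{}*ads*".format(instance)
    settings = [
        ("enableConnectConfig", "True"),
        ("discoveryNetwork", network),
        ("connectConfig*.active", ",".join(active_servers)),
        # True only for the dedicated load balancer, which serves no requests
        ("disableLocalRedirect", "True" if redirect_only else "False"),
        ("logDiscoveryEvents", "False"),
        ("loadBalanceAlgorithm", str(algorithm)),
    ]
    return ["{}{} : {}".format(prefix, name, value) for name, value in settings]


ACTIVE = ["ADS_B|14002", "ADS_C|14002", "ADS_D|14002"]

for name in ("ADS_A", "ADS_B", "ADS_C", "ADS_D"):
    # ADS_A is the dedicated load balancer in this test setup
    print("\n".join(ads_config_lines(name, ACTIVE, redirect_only=(name == "ADS_A"))))
    print()
```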

Test with RFA

RFA has built-in support for client load balancing. All we have to do is configure RFA to connect to ADS_A. After that, the connection will be redirected to the ADS that has the least load factor.

This can be verified by enabling RSSL tracing with the following RFA configurations.

\Connections\Connection_RSSL\traceMsgToFile = true
\Connections\Connection_RSSL\traceMsgDomains = "all"
\Connections\Connection_RSSL\traceMsgMaxMsgSize = 5000000
\Connections\Connection_RSSL\traceMsgMultipleFiles = true
\Connections\Connection_RSSL\tracePing = true

From the RSSL trace file, the login request message sent by RFA contains the DownloadConnectionConfig element with 1 as a value indicating that RFA wants to download connection configuration information.

<requestMsg domainType="RSSL_DMT_LOGIN" streamId="1" containerType="RSSL_DT_NO_DATA" flags="0x4 (RSSL_RQMF_STREAMING)" dataSize="0">
    <key  flags="0x26 (RSSL_MKF_HAS_NAME|RSSL_MKF_HAS_NAME_TYPE|RSSL_MKF_HAS_ATTRIB)"  name="test" nameType="1" attribContainerType="RSSL_DT_ELEMENT_LIST">
        <attrib>
            <elementList flags="0x8 (RSSL_ELF_HAS_STANDARD_DATA)">
                <elementEntry name="ApplicationId" dataType="RSSL_DT_ASCII_STRING" data="256"/>
                <elementEntry name="Position" dataType="RSSL_DT_ASCII_STRING" data="192.168.1.1/net"/>
                <elementEntry name="DownloadConnectionConfig" dataType="RSSL_DT_UINT" data="1"/>
            </elementList>
        </attrib>
    </key>
    <dataBody>
    </dataBody>
</requestMsg>

The login response sent by ADS_A contains a list of all available servers with load factors in its payload.

<refreshMsg domainType="RSSL_DMT_LOGIN" streamId="1" containerType="RSSL_DT_ELEMENT_LIST" ...>
    <key  flags="0x26 (RSSL_MKF_HAS_NAME|RSSL_MKF_HAS_NAME_TYPE|RSSL_MKF_HAS_ATTRIB)"  name="test" nameType="1" attribContainerType="RSSL_DT_ELEMENT_LIST">
        <attrib>
         ...
        </attrib>
    </key>
    <dataBody>
        <elementList flags="0x8 (RSSL_ELF_HAS_STANDARD_DATA)">
            <elementEntry name="ConnectionConfig" dataType="RSSL_DT_VECTOR">
                <vector flags="0x3 (RSSL_VTF_HAS_SET_DEFS|RSSL_VTF_HAS_SUMMARY_DATA)" countHint="0" containerType="RSSL_DT_ELEMENT_LIST">
                    <elementSetDefs>
                        ...
                    </elementSetDefs>
                    <summaryData>
                        <elementList flags="0x8 (RSSL_ELF_HAS_STANDARD_DATA)">
                            <elementEntry name="NumStandbyServers" dataType="RSSL_DT_UINT" data="0"/>
                        </elementList>
                    </summaryData>
                    <vectorEntry index="0" action="RSSL_VTEA_SET_ENTRY" flags="0x0">
                        <elementList flags="0x2 (RSSL_ELF_HAS_SET_DATA)">
                            <elementEntry name="Hostname" dataType="RSSL_DT_ASCII_STRING" data="192.168.27.16"/>
                            <elementEntry name="Port" dataType="RSSL_DT_UINT" data="14002"/>
                            <elementEntry name="LoadFactor" dataType="RSSL_DT_UINT" data="0"/>
                            <elementEntry name="ServerType" dataType="RSSL_DT_ENUM" data="0"/>
                            <elementEntry name="SystemID" dataType="RSSL_DT_ASCII_STRING" data="Default"/>
                        </elementList>
                    </vectorEntry>
                    <vectorEntry index="1" action="RSSL_VTEA_SET_ENTRY" flags="0x0">
                        <elementList flags="0x2 (RSSL_ELF_HAS_SET_DATA)">
                            <elementEntry name="Hostname" dataType="RSSL_DT_ASCII_STRING" data="192.168.27.17"/>
                            <elementEntry name="Port" dataType="RSSL_DT_UINT" data="14002"/>
                            <elementEntry name="LoadFactor" dataType="RSSL_DT_UINT" data="0"/>
                            <elementEntry name="ServerType" dataType="RSSL_DT_ENUM" data="0"/>
                            <elementEntry name="SystemID" dataType="RSSL_DT_ASCII_STRING" data="Default"/>
                        </elementList>
                    </vectorEntry>
                    ...
                    ...
                </vector>
            </elementEntry>
        </elementList>
    </dataBody>
</refreshMsg>

The payload of the login response is an element list with ConnectionConfig as an element entry. This ConnectionConfig entry contains a Vector. Each entry in the Vector contains an element list that provides data specific to one ADS. The structure of the login response's payload is:

[Figure: Login Response Payload]
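
To make the nesting concrete, the sketch below walks a trimmed copy of the vector from the trace and collects one record per ADS. It is an illustration only: the function name is hypothetical, the elided parts of the trace have been removed so that the fragment is well-formed XML, and it assumes the trace layout shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <vector> element from the trace above
TRACE_VECTOR = """
<vector containerType="RSSL_DT_ELEMENT_LIST">
  <vectorEntry index="0" action="RSSL_VTEA_SET_ENTRY" flags="0x0">
    <elementList flags="0x2 (RSSL_ELF_HAS_SET_DATA)">
      <elementEntry name="Hostname" dataType="RSSL_DT_ASCII_STRING" data="192.168.27.16"/>
      <elementEntry name="Port" dataType="RSSL_DT_UINT" data="14002"/>
      <elementEntry name="LoadFactor" dataType="RSSL_DT_UINT" data="0"/>
    </elementList>
  </vectorEntry>
  <vectorEntry index="1" action="RSSL_VTEA_SET_ENTRY" flags="0x0">
    <elementList flags="0x2 (RSSL_ELF_HAS_SET_DATA)">
      <elementEntry name="Hostname" dataType="RSSL_DT_ASCII_STRING" data="192.168.27.17"/>
      <elementEntry name="Port" dataType="RSSL_DT_UINT" data="14002"/>
      <elementEntry name="LoadFactor" dataType="RSSL_DT_UINT" data="0"/>
    </elementList>
  </vectorEntry>
</vector>
"""


def servers_in_vector(xml_text):
    """Return one dict per vectorEntry with hostname, port and load factor."""
    root = ET.fromstring(xml_text.strip())
    servers = []
    for entry in root.iter("vectorEntry"):
        # Each vector entry holds an element list describing one ADS
        fields = {e.get("name"): e.get("data") for e in entry.iter("elementEntry")}
        servers.append({
            "hostname": fields["Hostname"],
            "port": int(fields["Port"]),
            "load_factor": int(fields["LoadFactor"]),
        })
    return servers


for server in servers_in_vector(TRACE_VECTOR):
    print(server)
```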

After that, RFA will select the server that has the least load factor and then reconnect to that selected server. All subsequent requests will be sent to that server.

<!-- Disconnected from '192.168.27.15:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:084 -->

<!-- Attempt to Connect to '192.168.27.16:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:252 -->

<!-- Connected to '192.168.27.16:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:264 -->

<!-- Outgoing Message to '192.168.27.16:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:264 -->
<!-- rwfMajorVer="14" rwfMinorVer="1" -->
<requestMsg domainType="RSSL_DMT_LOGIN" streamId="1" containerType="RSSL_DT_NO_DATA" flags="0x4 (RSSL_RQMF_STREAMING)" dataSize="0">
    <key  flags="0x26 (RSSL_MKF_HAS_NAME|RSSL_MKF_HAS_NAME_TYPE|RSSL_MKF_HAS_ATTRIB)"  name="test" nameType="1" attribContainerType="RSSL_DT_ELEMENT_LIST">
        <attrib>
            <elementList flags="0x8 (RSSL_ELF_HAS_STANDARD_DATA)">
                <elementEntry name="ApplicationId" dataType="RSSL_DT_ASCII_STRING" data="256"/>
                <elementEntry name="Position" dataType="RSSL_DT_ASCII_STRING" data="192.168.1.1/net"/>
            </elementList>
        </attrib>
    </key>
    <dataBody>
    </dataBody>
</requestMsg>

The above RSSL trace indicates that RFA selected 192.168.27.16 (ADS_B), so it disconnected from 192.168.27.15 (ADS_A) and then reconnected to 192.168.27.16 (ADS_B). The login request message was then sent to 192.168.27.16 (ADS_B).
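
When a trace file is long, the redirect can be easier to spot by extracting only the connection events. The sketch below is one possible way to do that. It assumes the comment format shown in the trace above, which may differ between RFA versions, and the pattern and function names are made up for this example.

```python
import re

# Matches the connection comments shown in the trace above
EVENT_PATTERN = re.compile(
    r"<!-- (Disconnected from|Attempt to Connect to|Connected to) '([^':]+):(\d+)'"
)


def connection_events(trace_text):
    """Return (event, host, port) tuples in the order they appear."""
    return [(event, host, int(port))
            for event, host, port in EVENT_PATTERN.findall(trace_text)]


SAMPLE_TRACE = """\
<!-- Disconnected from '192.168.27.15:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:084 -->
<!-- Attempt to Connect to '192.168.27.16:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:252 -->
<!-- Connected to '192.168.27.16:14002' on 'localhost' interface -->
<!-- Time: 14:16:15:264 -->
<!-- Outgoing Message to '192.168.27.16:14002' on 'localhost' interface -->
"""

for event in connection_events(SAMPLE_TRACE):
    print(event)
```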

Test with WebSocket API

The ADS client load balancing feature can also be used with the WebSocket API. However, developers need to implement the logic to get the list of all available servers, select the server that has the least load factor, and then reconnect to the selected server. This section explains, step by step, how to modify a Python example from the WebSocket API Sample Applications. The example used for the demonstration is market_price.py, a simple Python example that outputs Market Price JSON data using the WebSocket API.

Step 1: Create another web socket connection to a server

The market_price.py example has already created a connection to ADS in order to get the data. However, it will be modified to create another connection to the ADS server (ADS_A) to get the list of all available servers. Once this connection has been established, the send_login_request(ws) function is called with the created connection as an argument in order to send a login request. The modified code is shown below.

    # Start websocket handshake
    ws_address = "ws://{}:{}/WebSocket".format(hostname, port)
    print("Connecting to WebSocket " + ws_address + " ...")

    ###########################################################
    #Step 1: Create another web socket connection to a server #
    ###########################################################
    web_socket_app = websocket.create_connection(ws_address,
                                header=['User-Agent: Python'],
                                subprotocols=['tr_json2'])
    send_login_request(web_socket_app)
    ###########################################################
    # End Step 1                                              #
    ###########################################################

Step 2: Add a DownloadConnectionConfig attribute in the login request

To get a list of all load balancing servers, the DownloadConnectionConfig attribute must be added to the login request with a value of 1. Therefore, the send_login_request function is modified to include the DownloadConnectionConfig attribute in the login request. The modified code is shown below.

def send_login_request(ws):
    """ Generate a login request from command line data (or defaults) and send """
    login_json = {
        'ID': 1,
        'Domain': 'Login',
        'Key': {
            'Name': '',
            'Elements': {
                'ApplicationId': '',
                'Position': ''
            }
        }
    }

    login_json['Key']['Name'] = user
    login_json['Key']['Elements']['ApplicationId'] = app_id
    login_json['Key']['Elements']['Position'] = position

    ########################################################################
    #Step 2: Add a DownloadConnectionConfig attribute in the login request #
    ########################################################################
    login_json['Key']['Elements']['DownloadConnectionConfig'] = 1
    ########################################################################
    #End Step 2                                                            #
    ########################################################################

    ws.send(json.dumps(login_json))
    print("SENT:")
    print(json.dumps(login_json, sort_keys=True, indent=2, separators=(',', ':')))
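
The RFA trace shown earlier omits DownloadConnectionConfig from the second login, the one sent after the redirect. A variant of the login builder with a switch for the attribute could look like the sketch below. This is an assumption about how one might structure the code, not part of market_price.py; the function name and the download_connection_config argument are hypothetical.

```python
import json


def build_login_request(user, app_id, position, download_connection_config=True):
    """Return the login request as a dict, optionally asking for the server list."""
    login = {
        "ID": 1,
        "Domain": "Login",
        "Key": {
            "Name": user,
            "Elements": {
                "ApplicationId": app_id,
                "Position": position,
            },
        },
    }
    if download_connection_config:
        # A value of 1 asks the server to include ConnectionConfig in the response
        login["Key"]["Elements"]["DownloadConnectionConfig"] = 1
    return login


# Values taken from the traces above, used here only as sample input
request = build_login_request("user01", "256", "192.168.1.1/net")
print(json.dumps(request, sort_keys=True, indent=2, separators=(",", ":")))
```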

Step 3: Receive the login response

The next step is calling the recv() method on the created connection to get the login response. The retrieved login response is passed to a new function called find_least_load_server to select a server which has the least load factor. The modified code is shown below.

    ##########################################################
    # End Step 1                                             #
    ##########################################################

    ######################################
    #Step 3:  Receive the login response #
    ######################################
    login_response = web_socket_app.recv()
    sel_server = find_least_load_server(login_response)
    print("The server that has the least load factor is " + str(sel_server))
    ######################################
    #End Step 3                          #
    ######################################
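
The code above treats the first received frame as the login response. A slightly more defensive variant, sketched below, keeps reading until a Login refresh arrives. It is an illustration under assumptions: it presumes that other frames (for example a ping) could arrive first, max_frames is a made-up safety limit, and FakeConnection is a stand-in so that the sketch runs without a server.

```python
import json


def receive_login_refresh(ws, max_frames=10):
    """Read frames from ws until a Login refresh arrives; return it or None.

    ws is any object with a recv() method that returns a JSON string, such
    as the connection created in Step 1.
    """
    for _ in range(max_frames):
        parsed = json.loads(ws.recv())
        # Accept either a JSON array of messages or a single message
        messages = parsed if isinstance(parsed, list) else [parsed]
        for message in messages:
            if message.get("Domain") == "Login" and message.get("Type") == "Refresh":
                return message
    return None


class FakeConnection:
    """Hypothetical stand-in that replays prepared frames."""

    def __init__(self, frames):
        self._frames = list(frames)

    def recv(self):
        return self._frames.pop(0)


frames = [
    json.dumps([{"Type": "Ping"}]),
    json.dumps([{"ID": 1, "Domain": "Login", "Type": "Refresh"}]),
]
print(receive_login_refresh(FakeConnection(frames)))
```

The function returns a parsed message rather than the raw string, so it would need adapting before its result is passed to find_least_load_server.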

Step 4: Select a server with the least load factor

A new function called find_least_load_server is created to find the server with the least load factor in the login response. It accepts the login response as an argument and returns the server with the least load factor. It will return None if there is no available load balancing server in the login response.

#####################################################
#Step 4: Select a server with the least load factor #
#####################################################
def find_least_load_server(login_response):
    """ Return the hostname with the least LoadFactor, or None if no server is listed """
    resp = json.loads(login_response)
    print(json.dumps(resp, sort_keys=True, indent=2, separators=(',', ':')))
    # ConnectionConfig is present only when the response carries a server list
    connectionconfig = resp[0]['Elements'].get('ConnectionConfig', None)
    if connectionconfig is None:
        return None
    entries = connectionconfig['Data']['Entries']
    if len(entries) == 0:
        return None
    least_hostname = None
    least_factor = sys.maxsize  # requires "import sys" at the top of the script
    for s in entries:
        # Strict "<" keeps the first listed server when load factors are equal
        if s['Elements']['LoadFactor'] < least_factor:
            least_hostname = s['Elements']['Hostname']
            least_factor = s['Elements']['LoadFactor']
    print(least_hostname)
    return least_hostname
########################################################
#End Step 4                                            #
########################################################

This function will parse the following login response to find a server with the least load factor.

[
  {
    "Domain":"Login",
    "Elements":{
      "ConnectionConfig":{
        "Data":{
          "Entries":[
            {
              "Action":"Set",
              "Elements":{
                "Hostname":"192.168.27.16",
                "LoadFactor":0,
                "Port":14002,
                "ServerType":{
                  "Data":0,
                  "Type":"Enum"
                },
                "SystemID":"Default"
              },
              "Index":1
            },
            {
              "Action":"Set",
              "Elements":{
                "Hostname":"192.168.27.17",
                "LoadFactor":4,
                "Port":14002,
                "ServerType":{
                  "Data":0,
                  "Type":"Enum"
                },
                "SystemID":"Default"
              },
              "Index":0
            },
           ...
          ],
          "Summary":{
            "Elements":{
              "NumStandbyServers":0
            }
          }
        },
        "Type":"Vector"
      },
      "MaxMsgSize":61430,
      "PingTimeout":30
    },
    "ID":1,
    "Key":{
      "Elements":{
        "AllowSuspectData":1,
        "ApplicationId":"256",
        "ApplicationName":"ADS",
...
      },
      "Name":"user01"
    },
    "State":{
      "Data":"Ok",
      "Stream":"Open",
      "Text":"Login accepted by host ADS_A."
    },
    "Type":"Refresh"
  }
]

All load balancing servers are listed as an array in the ['Elements']['ConnectionConfig']['Data']['Entries'] element. The code iterates over all servers in the array to find and return the least loaded server. If the element doesn't exist or the array is empty, the function returns None.
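
For logging or troubleshooting it can help to see every server ordered by load rather than only the winner. The sketch below is a hypothetical variant that returns the whole list sorted by LoadFactor. It assumes the response layout shown above, and the sample input is a trimmed copy of that response.

```python
import json


def servers_by_load(login_response):
    """Return (hostname, load_factor) pairs sorted from least to most loaded."""
    messages = json.loads(login_response)
    config = messages[0].get("Elements", {}).get("ConnectionConfig")
    if not config:
        return []
    entries = config.get("Data", {}).get("Entries", [])
    servers = [(e["Elements"]["Hostname"], e["Elements"]["LoadFactor"]) for e in entries]
    # sorted() is stable, so servers with equal load keep their response order
    return sorted(servers, key=lambda server: server[1])


# Trimmed copy of the login response shown above
SAMPLE_RESPONSE = json.dumps([{
    "Domain": "Login",
    "Type": "Refresh",
    "Elements": {
        "ConnectionConfig": {
            "Type": "Vector",
            "Data": {
                "Entries": [
                    {"Action": "Set", "Index": 1,
                     "Elements": {"Hostname": "192.168.27.16", "LoadFactor": 0, "Port": 14002}},
                    {"Action": "Set", "Index": 0,
                     "Elements": {"Hostname": "192.168.27.17", "LoadFactor": 4, "Port": 14002}},
                ]
            },
        }
    },
}])

for hostname, load_factor in servers_by_load(SAMPLE_RESPONSE):
    print(hostname, load_factor)
```

The first element of the returned list is the server that find_least_load_server would select.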

Step 5: Reconnect to the selected server

The final part is reconnecting to the selected server. If the find_least_load_server function returns None, the example still closes the current connection and then reconnects to the same server.

    ######################################
    #End Step 3                          #
    ######################################

    ###########################################
    #Step 5: Reconnect to the selected server #
    ###########################################
    # Only the hostname comes from the login response; the port is unchanged
    if sel_server is not None:
        ws_address = "ws://{}:{}/WebSocket".format(sel_server, port)
    web_socket_app.close()
    print("Reconnect to "+ws_address)
    ###########################################
    #End Step 5                               #
    ###########################################

    web_socket_app = websocket.WebSocketApp(ws_address, header=['User-Agent: Python'],
                                        on_message=on_message,
                                        on_error=on_error,
                                        on_close=on_close,
                                        subprotocols=['tr_json2'])

After reconnecting to the selected server, the example starts processing response messages and then subscribes to an item.
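
The address choice in Step 5 can be isolated in a small function, which makes the fallback to the original server explicit. The following is a sketch only: the function name is hypothetical and the port value in the usage lines is an arbitrary illustrative number, not one taken from the test environment.

```python
def reconnect_address(selected_host, original_host, port):
    """Return the WebSocket URL to reconnect to.

    Falls back to the original host when no server was selected. The port is
    the one already used for the first connection, as in the code above.
    """
    host = selected_host if selected_host is not None else original_host
    return "ws://{}:{}/WebSocket".format(host, port)


# 15000 is an illustrative port value
print(reconnect_address("192.168.27.16", "ADS_A", 15000))
print(reconnect_address(None, "ADS_A", 15000))
```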

The modified example can be run with the following parameter to connect to ADS_A:

    python market_price.py --host ADS_A

The full code is available on GitHub.

Note: The subscribed item should provide update messages because the calculation of load factor on ADS is based on average update rate (loadBalanceAlgorithm: 3). If the update rate is zero, the example will always reconnect to the first available server.
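
The behavior described in this note follows from the strict "less than" comparison in find_least_load_server: when every load factor is equal, the first entry is never replaced. The sketch below reproduces that scan on plain (hostname, load factor) pairs and contrasts it with a random choice among the least loaded servers. The random variant is an assumption about one possible mitigation, not something the example or ADS provides, and both function names are hypothetical.

```python
import random


def first_least_loaded(servers):
    """Strict '<' scan over (hostname, load_factor) pairs, as in Step 4."""
    best = None
    for host, load in servers:
        if best is None or load < best[1]:
            best = (host, load)
    return best[0] if best else None


def random_least_loaded(servers, rng=random):
    """Pick at random among the servers that share the lowest load factor."""
    if not servers:
        return None
    lowest = min(load for _, load in servers)
    candidates = [host for host, load in servers if load == lowest]
    return rng.choice(candidates)


# All load factors are zero, as when no updates are flowing
idle_servers = [("192.168.27.16", 0), ("192.168.27.17", 0)]
print(first_least_loaded(idle_servers))
print(random_least_loaded(idle_servers))
```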

Note: With ADS 3.2.1.L1, a calculation based on mounts/current connections (loadBalanceAlgorithm: 1) doesn't work with WebSocket API because the web socket connections aren't counted in the calculation.

Summary

ADS supports a basic load balancing feature which allows client applications to get a list of all available servers with their loads. Then, the client application can use this information to redirect a connection to a selected server which has the least load factor. ADS supports three algorithms to calculate the load factors:

  1. Calculation based on mounts/current connections
  2. Calculation based on bandwidth
  3. Calculation based on average update rate

APIs also support this feature via the login domain model. The DownloadConnectionConfig element must be added into the login request. With this element, the login response sent by ADS will have a list of all available servers with load factors in its payload.

RFA has built-in support for client load balancing, so developers don't need to implement any code to use this feature. For other APIs (Elektron APIs and WebSocket API), developers need to implement logic to get the list of all available servers via the login domain model, select a server that has the least load factor, and then reconnect to the selected server. This article has explained how to do this with the WebSocket API, but developers can apply the same approach to other APIs.

References

  1. Advanced Distribution Server Software Installation Guide
  2. RDM Usage Guide
  3. WebSocket API
  4. The modified market_price.py