Purpose and Operation Description of Failover Clusters

The failover feature enables automated recovery of your system in the event of a sudden server failure. Macula tracks all server operations and, if it detects that a server has failed, activates a spare server to take over the tasks that the failed server had been performing. GSF offers a native failover clustering mechanism for Macula Enterprise recording servers: you do not need to maintain a backup server manually or use Microsoft failover clustering. Easy setup and centralized tracking are available via Macula Console for the administrator's convenience.

Failover clusters are groups of servers (failover nodes) working together to provide a higher service availability rate, thus increasing system fault tolerance and overall stability. When one server (node) fails, another one takes its place, guaranteeing service availability to the end user and minimizing service downtime.

Note that failover clusters have been designed specifically for recording servers; the central server is not subject to failover using this mechanism.

Each failover cluster consists of at least one failover node and at least one recording server; there is no upper limit on the number of either. Depending on the scenario, server importance, and estimated system stability, the cluster composition may vary: 1 failover node for N servers, M failover nodes for N servers, or M failover nodes for 1 server. Each server (whether recording or failover) can belong to only one failover cluster at a time; there is no limit on the number of failover clusters.
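The membership rules above can be illustrated with a minimal Python sketch. The `FailoverCluster` class and the `validate_clusters` function are hypothetical names introduced for illustration only; they are not part of Macula or its API. The sketch assumes that every server is identified by a unique name.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverCluster:
    """Hypothetical in-memory model of a failover cluster (not a Macula API)."""
    name: str
    recording_servers: set[str] = field(default_factory=set)
    failover_nodes: set[str] = field(default_factory=set)


def validate_clusters(clusters: list[FailoverCluster]) -> list[str]:
    """Return a list of rule violations; an empty list means the layout is valid."""
    problems: list[str] = []
    owner: dict[str, str] = {}  # server name -> cluster where it was first seen

    for cluster in clusters:
        # Rule: at least one recording server and at least one failover node.
        if not cluster.recording_servers:
            problems.append(f"{cluster.name}: needs at least one recording server")
        if not cluster.failover_nodes:
            problems.append(f"{cluster.name}: needs at least one failover node")

        # Rule: a server (recording or failover) belongs to a single cluster.
        for server in sorted(cluster.recording_servers | cluster.failover_nodes):
            if server in owner and owner[server] != cluster.name:
                problems.append(f"{server}: already belongs to {owner[server]}")
            else:
                owner[server] = cluster.name

    return problems
```

For example, a cluster with two recording servers and one failover node passes the check, while reusing the same failover node in a second cluster is reported as a violation.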

  • If a recording server is down for more than the specified amount of time, the failover node automatically assumes its responsibilities

  • When the recording server is back online, the system administrator should manually turn off the acting failover node to enable the main recording server

  • There is an option to enable automatic recovery of the recording servers

  • Failover node status is available in the monitoring section of Macula Console and in the View servers mode via the Failover section

Starting from Macula version 1.7.0, automatic recording server recovery is available as an option. In versions earlier than 1.7.0, a recording server must be returned to operation manually. The default failover mode is manual recovery.

The central server controls the failover start conditions. Recording servers communicate with the central server by sending 'heartbeat' signals: if no heartbeat is received for longer than the specified period, the timeout is reached and the central server triggers the failover operation. Each recording server can have its own timeout settings. Once a Macula Recording Server reaches the failover timeout, the central server assigns its configuration to the first available failover node.
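The heartbeat check can be pictured with the following minimal Python sketch. It is an illustration under assumptions, not the actual central server logic: the `RecordingServer` class, its field names, and the timeout values are hypothetical, and timestamps are plain seconds supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class RecordingServer:
    """Hypothetical record kept by the central server for each recording server."""
    name: str
    failover_timeout_s: float  # individual timeout for this server
    last_heartbeat_s: float    # time of the most recent heartbeat, in seconds


def servers_needing_failover(servers: list[RecordingServer], now_s: float) -> list[str]:
    """Return names of servers whose heartbeat silence exceeds their own timeout."""
    return [
        s.name
        for s in servers
        if now_s - s.last_heartbeat_s > s.failover_timeout_s
    ]
```

Because each server carries its own timeout, two servers that fell silent at the same moment may reach failover at different times.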

The following functionality is supported and guaranteed during failover operation:

  • live channels

  • video (main and secondary stream), audio and motion recording, archive access

  • server-side VCA (additional VCA license must be applied to the failover server)

  • Event & Action rules

  • Macula License Plate Recognition, when it is set up to use one of the channels from the failed Macula Recording Server

Recorded archive and counters will stay on the failover server after the main recording server comes back online: they will not be transferred or copied anywhere. While the Macula Recording Server is offline, it will not be possible to retrieve its video archive and counters' database. However, once both servers are reachable, all data will become available for investigation in the Macula Monitor application, and the hardware layer will be transparent for the user.

The failover server must be located in the same environment as the main recording server(s) so that it can access all cameras from the failing server's configuration.

Ideally, the failover server hardware should be at least as powerful as the most powerful Macula Recording Server in the cluster. Otherwise, if that recording server goes down, the failover node might not have enough resources and may have trouble operating with the assigned configuration.

Failover servers themselves cannot accept any device configuration: because a single failover server may take over the operation of any one of N different servers, its device configuration remains empty until the moment a recording server fails. Storage configuration is not transmitted to the failover node. The failover server records data to whichever storage is configured on it, so there is no need to match the failover node storage configuration to that of any recording server.

Streams recorded on a failover server during its operation are kept on the failover server storage and are not transferred to the recording server storage.

If recording servers in a failover cluster have different storages, it is not necessary to set up the same storage labels on the failover nodes: instead, it is sufficient to have a Default storage and all recordings will be saved there.

When a server belonging to a cluster fails, its operation is taken over automatically by the first idle failover node. The central server transmits the failing server's configuration to the failover node, and the failover server then begins operating in place of the failing server. Recording server recovery can only be performed manually by the administrator or automatically, if the corresponding setting is enabled; until then, the failover node continues to operate in place of the recording server. The administrator can also manually assign a specific failover node to operate instead of any recording server.
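The assignment and recovery behavior described above can be sketched as follows. This is a simplified Python illustration under assumptions, not Macula's implementation: `FailoverNode`, `assign_failover`, and `on_server_back_online` are hypothetical names, and returning `None` when no suitable idle node exists is an assumption, since the behavior in that case is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverNode:
    """Hypothetical state of a failover node as seen by the central server."""
    name: str
    acting_for: Optional[str] = None  # recording server currently replaced, if any


def assign_failover(nodes: list[FailoverNode], failed_server: str,
                    preferred: Optional[str] = None) -> Optional[FailoverNode]:
    """Pick the administrator's preferred node if given, else the first idle node."""
    for node in nodes:
        is_idle = node.acting_for is None
        is_wanted = preferred is None or node.name == preferred
        if is_idle and is_wanted:
            node.acting_for = failed_server  # node now runs the failed server's configuration
            return node
    return None  # assumption: no suitable idle node, so nothing is assigned


def on_server_back_online(nodes: list[FailoverNode], server: str,
                          auto_recovery: bool = False) -> Optional[FailoverNode]:
    """Release the acting node only if automatic recovery is enabled."""
    for node in nodes:
        if node.acting_for == server:
            if auto_recovery:
                node.acting_for = None  # recording server resumes its role
            # With manual recovery (the default), the node keeps acting
            # until the administrator turns it off.
            return node
    return None
```

With the default `auto_recovery=False`, the node returned by `on_server_back_online` is still acting for the recording server, which mirrors the manual recovery mode.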
