
Overview

The Advanced Recovery commercial module provides an easy-to-configure replication engine and automatic failover to a secondary server. This mechanism protects voice services when the primary server fails.

...

Each feature and its configuration are described below.

...

Prerequisites

To deploy the Advanced Recovery module successfully, the following requirements must be met:

  1. You have an existing PBX system that will be your Primary server.

  2. You have an identical PBX system that will be your Secondary server.

  3. The two servers can communicate on an IP level.

    1. Both systems are configured with their own IP addresses.

    2. The two systems can be on the same local network or in geographically separate locations.

    3. SSH and HTTP(S) ports should be open between the two servers.

  4. Both servers are running the Advanced Recovery module, and it is "licensed" on each system.

  5. Backup & Restore, API and Filestore are modules that Advanced Recovery depends on; they must be installed on both systems.

  6. ICMP/Ping between Primary and Secondary is required.
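The network prerequisites above can be spot-checked from either server before continuing. The following is a minimal sketch, not an official tool: it assumes `ping` and `nc` (netcat) are installed, that SSH and HTTPS use their default ports 22 and 443, and it uses hypothetical names (`check_peer`, `PORTS`, `DRY_RUN`) that do not come from the module.

```shell
#!/bin/sh
# Sketch: check ICMP, SSH and HTTPS reachability of the peer server.
# Ports 22 and 443 are assumed defaults; override with PORTS="22 80 443".

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_peer <peer-ip>: returns 0 only if every check succeeds.
check_peer() {
    peer="$1"
    run ping -c 3 "$peer" || return 1            # ICMP/Ping
    for port in ${PORTS:-22 443}; do
        run nc -z -w 5 "$peer" "$port" || return 1   # TCP port open?
    done
}

# Example (replace the address with the other server's IP):
#   check_peer 192.0.2.10 && echo "peer reachable"
```

Run it on the Primary against the Secondary's IP, then on the Secondary against the Primary's IP, since both directions are required.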

Setup Configuration

The Advanced Recovery module can be configured with the following steps.

  1. Establish an SSH connection between the two servers.

    1. SSH to the secondary server from the primary without a password.

    2. SSH to the primary server from the secondary without a password.

  2. Configure the Advanced Recovery module using the "Quick configuration" wizard.

Establish SSH connection between both the servers

Both servers must be able to reach each other over SSH.

Follow the instructions below (see also Setting up the SSH key) to copy the primary server's SSH key to the secondary server, so that the primary can SSH to the secondary without a password.

  1. Log in to your Primary server with an SSH client such as PuTTY, SecureCRT, or another SSH client.

  2. At the primary server's Linux CLI prompt, type: sudo -u asterisk ssh-copy-id -i /home/asterisk/.ssh/id_rsa.pub root@SecondaryServerIP and enter the password when prompted.

Info

Make sure you replace SecondaryServerIP with the IP address of your Secondary PBX. Use an IP address, not a hostname that may be common to both the primary and the warm spare. If FQDNs are desired, create three records: the common name, a specific name for the primary, and a specific name for the warm spare (for example mypbx.company.com, mypbx1.company.com, mypbx2.company.com).

If the Firewall is configured, make sure a rule allows the two servers to talk to each other.

  3. If the above command completes without error, you are ready to test:

At the prompt, type: ssh -i /home/asterisk/.ssh/id_rsa root@SecondaryServerIP

If all went well, you should now be logged in to the Secondary server.
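The interactive test above can also be run non-interactively, which is handy in scripts. The sketch below assumes the key was copied as the asterisk user (as in step 2) and uses the standard OpenSSH options `BatchMode` and `ConnectTimeout`; the function name `verify_passwordless_ssh` and the `SSH_RUNNER` variable are hypothetical.

```shell
#!/bin/sh
# Sketch: verify that key-based SSH login works without a password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
# SSH_RUNNER is a hypothetical override, mainly for dry runs.

# verify_passwordless_ssh <peer-ip> [private-key-path]
verify_passwordless_ssh() {
    peer="$1"
    key="${2:-/home/asterisk/.ssh/id_rsa}"
    # Unquoted on purpose so the default expands to several words.
    ${SSH_RUNNER:-sudo -u asterisk ssh} -i "$key" \
        -o BatchMode=yes -o ConnectTimeout=10 \
        "root@$peer" true
}

# Example (replace the address with the Secondary server's IP):
#   verify_passwordless_ssh 192.0.2.20 && echo "passwordless SSH works"
```

A non-zero exit status means the login would have needed a password, the host was unreachable, or the host key was not accepted; the ssh error message indicates which.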

...

PBX 17+ Improvements:

Redirect option to Backup & Restore module:


A new option, “+Go to Backup & Add Key”, has been added under “Advanced Recovery → Global settings”. It redirects to the Backup & Restore module → Global Settings, where you can copy this PBX's key or add another PBX's key.

...

Verify SSH Connectivity

Note

On a fresh PBX 17+ system, this one-time step is mandatory. The ‘precheck’ command should be run on both servers.

Check SSH Connection from Primary Server:

  • SSH into the primary server.

  • Run the following command, replacing SecondaryIP with the secondary server's IP address: fwconsole advr --precheck SecondaryIP

  • If the command is successful, it should display “SSH Connectivity is Good”.

  • Screen-shot attached below

...

Check SSH Connection from Secondary Server:

  • SSH into the secondary server.

  • Run the following command, replacing PrimaryIP with the primary server's IP address: fwconsole advr --precheck PrimaryIP

  • If the command is successful, it should display “SSH Connectivity is Good”.
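When the precheck is run from a script on each server, its output can be tested for the success message quoted above. This is a minimal sketch that assumes the message is printed verbatim on standard output; `precheck_ok` is a hypothetical helper name, not part of fwconsole.

```shell
#!/bin/sh
# Sketch: succeed only if the precheck output contains the success message.
# Assumes the text "SSH Connectivity is Good" appears verbatim on stdout.

# precheck_ok: reads command output on stdin.
precheck_ok() {
    grep -q "SSH Connectivity is Good"
}

# Example on a live system (replace PeerIP with the other server's IP):
#   fwconsole advr --precheck PeerIP | precheck_ok && echo "precheck passed"
```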

Info

SSH Folder permissions

If the web GUI shows errors during the SSH connection, check that the correct permissions are set on the SSH folder and the files it contains. The permissions should look something like:

/home/asterisk/.ssh

...

Info

The above files must exist on both servers (primary and secondary). If the permissions of any file differ, redo SSH association step 2 above.
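To compare the SSH folder on both servers, the current mode and ownership of each file can be listed in one line per file. The sketch below assumes GNU `stat` (as found on typical Linux systems) and only prints what is there; it does not assert what the values should be. `list_ssh_perms` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print octal mode, owner:group and path for the SSH folder
# and every file in it, so the output of both servers can be compared.

# list_ssh_perms [directory]
list_ssh_perms() {
    dir="${1:-/home/asterisk/.ssh}"
    # %a = octal mode, %U:%G = owner:group, %n = path (GNU stat format)
    stat -c '%a %U:%G %n' "$dir" "$dir"/*
}

# Example:
#   list_ssh_perms /home/asterisk/.ssh
```

Run it on the Primary and on the Secondary and compare the two listings line by line.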

Install Advanced Recovery module

Download and install the "Advanced Recovery" module by using "Check Online" and then following the download and install steps described in the "Checking for Available Upgrades" section of the Module Admin User Guide wiki.

Advanced Recovery Module Configuration


How to Open the Advanced Recovery Module Settings 

Within the PBX GUI, navigate to Admin > Advanced Recovery

...

How to Configure the Advanced Recovery Module  

The main landing page of the Advanced Recovery module has options to view system status, perform configuration changes, and adjust global settings (like SSH keys).

Info

The 'Quick Configuration' option is displayed only while the system is not yet configured.

Once the system is configured, this option is no longer visible, and any further configuration changes must be made through the "Configuration → Primary (or secondary) Server" option.

...

Quick Configuration Wizard 

The Quick Configuration wizard provides an easy GUI for configuring the Advanced Recovery module.
The wizard configures both the primary and the secondary server itself, so no further configuration is needed afterwards.
Clicking the "Quick configuration" button opens the wizard as shown below:

...

Step-1 Server Configuration

Here, specify the "Secondary" server IP.
After selecting the "Secondary Server" instance, click "Next".

...

If the secondary has a properly licensed, active Advanced Recovery module, the wizard proceeds to Step-2 > Sync.

Step-2 Sync

This step defines the syncing frequency.
Syncing can take from minutes to hours, depending on system size (capacity) and on any additional files/directories that have been added to the module configuration to be synced.

Note

The syncing process can be CPU-intensive, depending on your system capacity, so it is recommended to sync during "off hours". Choose the syncing frequency with this in mind.


Step-3 Settings

This section covers the Advanced Recovery module settings required to replicate the configuration to the secondary server.

...

Once the above configuration is done, move on to the next step to configure "Notifications".

Step-4 Notification

This section allows you to configure notifications.

...

  • Notification Extension: the extension to call during a failover event. On a system failure, the active system calls the configured extension and plays the configured announcement. The purpose of this call is to inform the admin of the system failure.

  • Recording when primary fails: the recording to play when the Primary server fails. The list offers the recordings configured in the System Recording module.

  • Recording when standby fails: the recording to play when the standby/Warm Spare server fails. The list offers the recordings configured in the System Recording module.

  • Notification Email: the email address to which notifications will be sent.

...

This finishes the "Quick configuration" part of the Advanced Recovery module. If any further changes to the configuration are needed, please refer to the Advanced Recovery Expert Configuration wiki.

As soon as the "Quick configuration" process is done, the "Advanced Recovery Service" daemon must be started, as described in the section below.

Advanced Recovery service daemon

This service daemon continuously monitors the health of the primary system and, when a failure occurs, executes the steps needed to switch over to the secondary server.
After the "Quick configuration" wizard completes, the status of the Primary and Secondary looks something like the following.

Info

The Advanced Recovery service needs to be started only on the Primary server. The service on the Secondary server starts automatically.


Primary Server

...



Secondary Server

...

As shown below, the dashboard shows that configuration is done but the service has not yet started. The next step is to "Start" the service from the Primary.

Advanced Recovery Dashboard

The dashboard shows the service status and the last sync time.

The "Sync now" option can also be used to force a sync of the configuration to the secondary system.

...

Advanced Recovery Sync Now 

The Sync now option manually syncs the configuration to the secondary server.
This is useful for confirming that syncing works as soon as the initial configuration is finished, and for finding out how long a sync takes on the PBX system.

...

If required, change the "Syncing scheduling" frequency using the "Advanced configuration" option.

...

Modules and Call Recording syncing 

On the primary server settings page, you can choose which modules are included in the sync process. By default, all modules are selected and included; unselect the modules that should be excluded from syncing.

...

By default, the Voicemails folder is added as a directory item, and this folder is synced incrementally.

...

  

Switchover Configuration

The Advanced Recovery module provides configuration options that determine the actions taken during switchover.
All switchover-related configuration is part of the Advanced Configuration.

To open the Advanced Configuration, go to "Advanced Recovery Module → Configuration → Primary Server" as shown in the screenshot below.

...

Trunk Selection Configuration option

As soon as "Auto switch services" is enabled, the list of trunks currently configured in the system is shown.
For every configured trunk, you can select its desired state after a switchover:

...


Bring down Primary server after switchover configuration option

Execute 'fwconsole stop' on the primary server after a failover.

...

If this option is set to YES, the Advanced Recovery module keeps checking the configured Primary server to see if it comes back up, and brings down all services on the primary when that happens.

...

Post Switchover Hook

This is for advanced users who would like to perform some special steps after switchover. 
Please specify the custom script path to execute after switchover.

Info

APPLICATION NOTE

Make sure the script has execute permissions for the Asterisk user.

Info

Post Switchover Hook files

The Post Switchover Hook is executed during both the failover and failback processes, so the script needs to exist and be accessible on both servers in the configured location (directory).

NOTE:

  1. The Primary and the Secondary each run their own copy of the switchover script. In advanced scenarios where the failover and failback processes need to execute different tasks, each server can therefore have its own version of the script.

  2. During the Sync Back To Primary process (failback), the hook is executed right before the 'fwconsole restart' command, so any actions performed by the script may need to take this into account.
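A hook script can branch on the argument it receives. The skeleton below is a hypothetical sketch: this page only confirms that the hook is called with a "START" argument on switchover, so any other argument (including whatever is passed during failback) is simply reported rather than assumed. The function name and the messages are illustrative.

```shell
#!/bin/sh
# Hypothetical post switchover hook skeleton.
# Only the "START" argument is documented; everything else is reported.

handle_switchover() {
    case "$1" in
        START)
            # Place custom post-failover tasks here.
            echo "switchover: running post-failover tasks"
            ;;
        *)
            # Argument not documented on this page; report it and do nothing.
            echo "switchover: unhandled argument '${1:-}'"
            ;;
    esac
}

handle_switchover "$@"
```

Save the script at the path configured in the Post Switchover Hook setting on both servers and make it executable for the Asterisk user, for example with `chmod +x` on that file.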

Advanced Configuration 

Once the Quick configuration wizard has been completed, any further configuration or change must be done on the 'Advanced Configuration' page. Changes such as updating GraphQL API tokens or modifying the Primary/Secondary server IP address require "Advanced configuration", as described in Advanced Recovery Expert Configuration.

Switchover 

Advanced Recovery module decides Primary is down by detecting at least one of the following conditions:

...

  1. Switchover related actions as configured in Switchover Configuration

    1. Enable the trunks on the secondary as configured in Trunk Selection Configuration option

    2. Execute post switchover hooks to run a custom third-party script with a "START" argument.

  2. Notify the admin via a call to the admin extension if Call Notification is enabled.

  3. Notify the admin via email.

...

The Advanced Recovery module will be beneficial during outages by automatically switching services over to a Secondary server when a failure is detected on Primary server. However, it is critical to understand and be aware that there are other network elements such as IP/SIP Phones, SIP Trunking, routers, etc. that need to be configured properly to ensure they start working smoothly after services are switched over. 


SIP Phones Recommendation

Regenerate existing Sangoma's phone configuration 

Advanced recovery module has an option to regenerate the configuration of already connected/configured Sangoma's S and D series phones via Endpoint Manager.

"Advanced Recovery → Endpoint → Regenerate EPM config for S and D series phones" 

This option adds the 'Secondary Server' IP address to the selected template as 'Backup SIP Server'. The 'Update Phones' option may also be selected to force all the phones under the template to pull a new configuration from the server.

...

Manually editing templates for Sangoma's S series phones

The 'Regenerate EPM Config for S and D series phones' option mentioned above takes care of the backup server settings in the Sangoma templates. Phone configurations for any other brand have to be done manually by editing the related template.

Sangoma S and D series phones support the configuration of a "Failover" IP along with the Primary IP. 
The Endpoint Manager module, which is "Free" to use for Sangoma's S and D series phones, can be used to help configure this setup. 

...

Info

Application Note

If the Advanced Recovery option 'Regenerate EPM Config for S and D series phones' was executed, the associated template(s) will be pre-populated with the Secondary server IP address.

...

Manually editing templates for Sangoma's D series phones

The backup destination address is added in the D/P Series phone template, in the Redundancy tab (EPM → Sangoma → D & P series phones).
Please refer to the "Template Creation and Editing (Example with Sangoma Brand)" section of the EPM-Admin User Guide for an example of how to edit templates via EPM.

...


SIP Trunk recommendation

Make sure the SIP Trunk provider allows registration requests from the IP addresses of both the Primary and the Secondary server.
When a failure occurs and the secondary server becomes active, the SIP Trunk provider must accept the registration request from the secondary server so that SIP traffic can come up. This does not apply if both Primary and Secondary servers are behind the same public IP, since both servers will register from the same source IP.

IT admin recommendation

IT staff or PBX administrators are advised to take care of the following responsibilities:

  1. Make any networking changes required to bring up the secondary server (such as router port forwarding in a NAT environment) to make sure that:

    1. Secondary server registration messages reach the SIP Trunking provider.

    2. Messages from the SIP Trunk reach the secondary server as they would reach the primary.

    3. Phones are able to register with the secondary server.

  2. Make sure the Primary and Secondary server IPs do not change. If they do change, the GraphQL configuration (server URL only) must be updated accordingly, because the two servers talk to each other using GraphQL API URIs.
    IP changes might result in a false "server down" event.

  3. Make sure the Primary and Secondary servers are accessible to each other. The Firewall module needs to have both servers' IPs whitelisted accordingly.

  4. Make sure SSH connectivity works between the two servers.

  5. Keep in mind that with the latest Recording Report module (v15.0.4.28+), call recording files are not part of the "full system backup", so make sure the call recordings directory is included in the Primary Server Advanced Recovery settings (by default, recordings are located in /var/spool/asterisk/monitor/).

Switchback to Primary server 

The Advanced Recovery module is mainly designed to provide easy failover to the Secondary system when the Primary server fails.
Once the primary server is back up and running, it is recommended to switch services back to the primary, so that any later disruption will not affect the phone system's availability.

...

  1. Log in to the administrative GUI of the Secondary server, which is currently active.

    1. Stop the Advanced Recovery daemon from the FreePBX/PBXact GUI → Advanced Recovery option to avoid getting notifications.

  2. Repair the Primary server (if possible) or bring up a new Primary server with a fresh installation of FreePBX/PBXact.

  3. Once the Primary server is ready, follow the steps in "Sync back to primary" to sync the data from the secondary to the primary.

  4. Once syncing is complete, switch back to the primary server so that the primary becomes the active node and the secondary becomes the standby server.

Sync back to Primary 

This option will be useful when we want to bring up the Primary server which could be either the same server or new server/installation.

...

Once syncing to the Primary server has finished, the third step offers the option to "Switch back". This reverts the status of the trunks on the secondary server (disabling them) and turns the trunks on the Primary back on. It also updates the Advanced Recovery module status so that the Primary server is again the active server and the health check runs from the Primary → Secondary server.

...

High Level use case scenario using Advanced Recovery module

Advanced Recovery module will help to maintain the below fail over scenario where Secondary server will take over the production when there is a critical failure in Primary server, as illustrated below:

...


Frequently Asked Questions 

 Q: Do we need a floating IP now like in the old HA setup?
 A: No. A floating IP is not a requirement with this module. Each server has its own IP address. SSH communication must be open between the two servers.

...