Watchdog (EN)

Watchdog is an optional Action.NET tool. It monitors the "health" of the computer and of the Action.NET server, and it has a contingency plan for taking action when problems are detected.

 

Architecture

Watchdog is designed to run on every server and can be used in an architecture with one or more servers. Each server runs its own instance of Watchdog, which communicates with the running Action.NET server and collects information from that server and from the local computer. In a hot standby architecture, the Watchdogs on both servers exchange information with each other over TCP port 65001, as request and response messages in JSON format. Figure 1 represents this architecture.
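As an illustration of the request and response exchange described above, here is a minimal Python sketch. The message layout is not documented on this page, so the field names ("type", "payload"), the message kinds ("status_request", "status_response") and the one-JSON-object-per-line framing are all assumptions made for the example. Only the use of JSON and of TCP port 65001 comes from the text. The sketch leaves out the socket layer and shows only how a message could be encoded, decoded and answered.

```python
import json

# Port taken from the text above; everything else here is hypothetical.
WATCHDOG_PORT = 65001


def encode_message(kind, payload):
    """Serialize one message as a newline-terminated UTF-8 JSON line (assumed framing)."""
    return (json.dumps({"type": kind, "payload": payload}) + "\n").encode("utf-8")


def decode_message(raw):
    """Parse the bytes produced by encode_message back into (kind, payload)."""
    message = json.loads(raw.decode("utf-8"))
    return message["type"], message["payload"]


def handle_request(raw, local_status):
    """Build the response that a peer Watchdog might send for a request."""
    kind, _payload = decode_message(raw)
    if kind == "status_request":
        return encode_message("status_response", local_status)
    return encode_message("error", {"reason": "unknown request"})


# One round trip: the request a peer would send, and the reply it would get.
request = encode_message("status_request", {})
response = handle_request(request, {"online": True, "uptime_s": 120})
print(decode_message(response))
```

In a real deployment the encoded bytes would travel over a TCP connection between the two servers; the actual Watchdog message schema may differ from this sketch.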

 

Configuration

The application is located in the directory of the installed Action.NET version. For example, for Action.NET version 9.1, you can find it under "C:\Program Files (x86)\SPIN\Action.Net\an-9.1\".

If you choose to use the tool, Action.NET must be started and shut down through it. Creating Windows shortcuts is up to the user. Before use, the following settings are required in the project and in Watchdog.

Action.NET Project

The project must contain the script that exchanges information between Action.NET and Watchdog. The script must be executed by a Task at a fixed period. The step-by-step project setup for version 9.1 is:

  1. Go to the Run > Build > References menu.

  2. Add a reference to the Newtonsoft.Json DLL with the parameters:
    Domain: Server
    Path: C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Newtonsoft.Json.dll

  3. Add a reference to the Watchdog.Protocol DLL with the parameters:
    Domain: Server
    Path: C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Watchdog.Protocol.dll

  4. Go to the Edit > Script > Class menu.

  5. Create a class with the parameters:
    Name: Watchdog
    Code: CSharp
    Domain: Server
    Script: Copy and paste the script code found in:
    C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Action.NET Watchdog - Script.txt

  6. Go to the Edit > Script > Tasks menu.

  7. Create a task with the parameters:
    Name: Watchdog
    Code: CSharp
    Period: 00:00:05
    Domain: Server
    Script: @Script.Class.Watchdog.Handler();

Watchdog

Watchdog has a configuration screen (see Figure 2), where the user sets the parameters according to the needs of the project and of the computer hardware.
To save any configuration change, click the Save button in the lower right corner of the screen.

Watchdog Configuration Screen

Alarm Tags must be created in the project and associated with an alarm before they are configured here. Action.NET then signals to the operator, on the Alarms or Events screen, that there is a problem with the server. A normalized alarm tag receives the value 0; when there is a problem, it receives a value greater than 0, as detailed in the following table.

Watchdog

| Group | Parameter | Description |
| --- | --- | --- |
| Reboot Process | Shutdown Command Timeout | Maximum wait period, in seconds, for the server shutdown command to finish executing. |
| Reboot Process | Kill Timeout | Maximum wait period, in seconds, for the kill command to finish executing on the Action.NET processes. |
| Reboot Process | Start Timeout | Maximum wait period, in seconds, for Action.NET to start. |
| Tools | Auto Memory Cleanup | Enables automatic RAM cleanup. |
| Tools | Log Computer Data | Enables logging of the data collected from the computer every X minutes. Logs are saved in: C:\Action.NET\Projects\Logs\Computer-DD-MM-YYYY.json |
| Tools | Log Protocol Packages | Shows the protocol packets sent and received in the Log window. |

Computer

| Group | Parameter | Description |
| --- | --- | --- |
| CPU Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the processor usage value, for alert generation. |
| CPU Usage | Max | Maximum percentage of CPU usage for the computer. |
| CPU Usage | Timeout | Maximum wait period, in seconds, for CPU usage to drop back below the maximum. If this period is exceeded, the Alarm Tag receives the processor usage value; otherwise 0. |
| Disk Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the hard disk usage value, for alert generation. |
| Disk Usage | Max | Maximum percentage of disk usage for the computer. |
| Disk Usage | Timeout | Maximum wait period, in seconds, for disk usage to drop back below the maximum. If this period is exceeded, the Alarm Tag receives the hard disk usage value; otherwise 0. |
| Memory Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the RAM usage value, for alert generation. |
| Memory Usage | Max | Maximum percentage of memory usage for the computer. |
| Memory Usage | Timeout | Maximum wait period, in seconds, for memory usage to drop back below the maximum. If this period is exceeded, the Alarm Tag receives the RAM usage value; otherwise 0. |
| Network Offline | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives the value 0 or 1, for alert generation. |
| Network Offline | Timeout | Maximum wait period, in seconds, for the network between the servers to come back online. If this period is exceeded, the Alarm Tag receives the value 1; otherwise 0. |

Action.NET

| Group | Parameter | Description |
| --- | --- | --- |
| Server | Root Path | Action.NET installation directory. |
| Server | Version Path | Installation directory of the Action.NET version. |
| Server | Retentive File | Path of the retentive file to be deleted. Leave blank if the retentive file does not need to be deleted. This file stores the last values of the Tags that have this functionality enabled. |
| Server | Startup Arguments | Startup arguments for the Action.NET server. |
| Primary | Local | Indicates that this computer is the primary Action.NET server. Select this option if the computer is the primary server. |
| Primary | Hostname | Computer name of the primary Action.NET server. |
| Primary | Communication Timeout | Maximum wait period, in seconds, for communication between Watchdog and the primary Action.NET server. |
| Secondary | Local | Indicates that this computer is the secondary Action.NET server. Select this option if the computer is the secondary server. |
| Secondary | Hostname | Computer name of the secondary Action.NET server. If there is none, leave it blank. |
| Secondary | Communication Timeout | Maximum wait period, in seconds, for communication between Watchdog and the secondary Action.NET server. If there is none, leave it at 0. |

Issues

| Issue | Parameter | Description |
| --- | --- | --- |
| Core Modules Offline | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value configured in Alarm Timeout. This lets the user register a customized message indicating how long Watchdog waited before the alarm was generated. |
| Core Modules Offline | Alarm Timeout | Maximum wait period, in seconds, for the number of online core modules to return to normal. If this period is exceeded, the Alarm Tag receives the value of the maximum period; otherwise 0. |
| Core Modules Offline | Timeout | Maximum wait period, in seconds, for the number of online core modules to return to normal. If this period expires, the local Action.NET server is restarted. |
| Hot Standby Communication | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value configured in Alarm Timeout. This lets the user register a customized message indicating how long Watchdog waited before the alarm was generated. |
| Hot Standby Communication | Alarm Timeout | Maximum wait period, in seconds, for the hot standby architecture to return to normal operation, which consists of one active server and the other on standby. If this period is exceeded, the Alarm Tag receives the value of the maximum period; otherwise 0. |
| Hot Standby Communication | Timeout | Maximum wait period, in seconds, for the hot standby architecture to return to normal operation, which consists of one active server and the other on standby. If this period is exceeded, both the local and the remote Action.NET servers are restarted. |
| Memory Overflow | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value configured in Alarm. This lets the user register a customized message indicating how many memory overflows occurred before the alarm was generated. |
| Memory Overflow | Alarm | Number of memory overflows required for the alarm to be generated. The Alarm Tag receives the number of overflows. The count is cumulative and is reset only when Watchdog restarts the local Action.NET server. |
| Memory Overflow | Max | Maximum number of memory overflows. If this value is exceeded, the local Action.NET server is restarted. |
| Memory Usage | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value configured in Alarm. This lets the user register a customized message indicating how many megabytes of memory the local Action.NET server had consumed before the alarm was generated. |
| Memory Usage | Alarm | Memory consumption of the local Action.NET server, in megabytes, at which the alarm is generated. The Alarm Tag receives the memory consumption in megabytes. |
| Memory Usage | Max | Maximum memory consumption, in megabytes, of the local Action.NET server. |
| Memory Usage | Timeout | Maximum wait period, in seconds, for memory consumption to drop back below the maximum. If this period expires, the local Action.NET server is restarted. |
| No Communication | Timeout | Maximum wait period, in seconds, for the local Action.NET server to communicate with Watchdog again. If this period expires, the local Action.NET server is restarted. |
| Offline | Timeout | Maximum wait period, in seconds, for the local Action.NET server to be online. If this period expires, the local Action.NET server is restarted. |
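The Max and Timeout settings of the Computer group (CPU Usage, Disk Usage, Memory Usage) combine into one rule: the Alarm Tag stays at 0 until usage has remained above Max for longer than Timeout, and then receives the usage value. The Python sketch below illustrates that rule only. The class name UsageAlarm, its fields and the choice of strict comparisons are assumptions for the example; Watchdog's internal implementation is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageAlarm:
    """Hypothetical model of one usage check (CPU, disk or memory)."""
    max_percent: float                   # the "Max" setting
    timeout_s: float                     # the "Timeout" setting, in seconds
    above_since: Optional[float] = None  # when usage first went above Max

    def update(self, usage_percent: float, now_s: float) -> float:
        """Return the value to write to the Alarm Tag at time now_s."""
        if usage_percent <= self.max_percent:
            # Back to normal: forget the episode and normalize the tag.
            self.above_since = None
            return 0
        if self.above_since is None:
            self.above_since = now_s
        if now_s - self.above_since > self.timeout_s:
            # Usage stayed too high for longer than the timeout.
            return usage_percent
        return 0


cpu = UsageAlarm(max_percent=90, timeout_s=30)
samples = [(95, 0), (97, 20), (96, 31), (50, 40)]  # (usage %, time in seconds)
tag_values = [cpu.update(usage, now) for usage, now in samples]
print(tag_values)
```

With these sample readings the tag stays at 0 for the first two samples, takes the usage value once the 30-second timeout has passed, and returns to 0 when usage normalizes.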

Features

Monitoring Screen

Watchdog has a monitoring screen (see Figure 3). It provides a real-time log and important information collected from the computer and from the primary and secondary Action.NET servers.

 

Computer

  • OS – Operating system with version and build.

  • Uptime – Time in operation.

  • Disk – Hard drive capacity and usage.

  • Memory – RAM memory capacity and usage.

  • CPU – Number of physical/logical cores and processor usage.

  • Action.NET – Information about the running Action.NET server.

    • Status (Online or Offline)

    • Uptime

    • Memory usage and the maximum limit set in the configuration.

    • Number of memory overflows and the maximum limit set in the configuration.

  • .NET Frameworks – List of installed versions of the Microsoft .NET Framework.

Primary / Secondary Servers

  • IPv4 / IPv6 Addresses – List of IP addresses found for communication with the server.

  • Version – Version of Action.NET running on the server.

  • Alarm Module – Status (Online or Offline) of the Action.NET alarm module on the server.

  • Historian Module – Status (Online or Offline) of the Action.NET historian module on the server.

  • Device Module – Status (Online or Offline) of the Action.NET devices module on the server.

  • Log – Shows real-time log messages colored by type.

    • Debug – Debugging messages.

    • Information – Informational messages.

    • Warning – Warnings.

    • Error – Errors.

  • Watchdog Uptime – Watchdog operating time.

Start – Button to start Action.NET manually.
Shutdown – Button to shut down Action.NET and Watchdog manually.

Icon menu in the system tray

Watchdog has an icon in the system tray, in the lower right corner of the desktop. Double-clicking the icon with the left mouse button shows or hides the Watchdog window. Right-clicking it opens the following menu:

  • Hide / Show - Hides or restores the Watchdog window

  • Memory Cleanup - Manually cleans up the memory

  • Kill - Quits Watchdog manually

Log

Watchdog uses a logging system that writes to text files and to Windows events. There are four types of messages: Debug, Info, Warning, and Error:

  • Debug - Lower-level messaging. Used when it is necessary to identify each process in more detail. Generally used by developers, testers and integrators;

  • Info - Important informational messages for monitoring the operation of the Communication Module;

  • Warning - Warning messages, which may be ignorable or may need to be checked; and

  • Error - Error messages, which cannot be ignored and need immediate attention and correction for correct operation.

The configuration file (Action.NET Watchdog.exe.config) is in the same directory as the executable. It has a LOG section (see Figure 5). This section is configurable; for reference, see the Apache log4net documentation: Apache log4net: Config Examples.

The default configuration stores a maximum of 60 files in C:\Action.NET\Projects\Logs, each with a maximum size of 35 megabytes. A file is created for each daily cycle as long as the maximum size has not been exceeded. If this size is exceeded once, a copy of the day's file ending in .1 is created and a new file is started. If the size is exceeded again on the same day, the new file replaces the copy and another new file is started. Otherwise, a new daily file cycle begins. Thus, each node can store approximately 2152 megabytes on disk.

Contingency plan

The table below describes the measures taken when a problem is detected. Every issue is evaluated against a maximum expected time margin (timeout). Alarm generation and corrective action occur when the configured maximum time is exceeded. If the condition returns to normal, the alarms receive the value 0 and no corrective action is taken. These settings are configured on the Watchdog configuration screen.

Computer

| Problem | Config | Alert | Corrective action |
| --- | --- | --- | --- |
| CPU usage above the configured maximum value | CPU Usage | Alarm is generated on the configured tag | None |
| Hard disk usage above the configured maximum value | Disk Usage | Alarm is generated on the configured tag | None |
| RAM usage above the configured maximum value | Memory Usage | Alarm is generated on the configured tag | None |
| Network offline between the primary and secondary servers: there is no response to the PING command | Network Offline | Alarm is generated on the configured tag | None |

Action.NET

| Problem | Config (Issue) | Alert | Corrective action |
| --- | --- | --- | --- |
| Number of online core modules differs from the configured number after 30 minutes online | Core Modules Offline | Alarm is generated on the configured tag | Restart of the local Action.NET server |
| In a Hot Standby architecture, both servers are online and communicating. | Hot Standby Communication | Alarm is generated on the configured tag | Restart of Action.NET on both servers |
| The memory usage of the Action.NET server exceeds the maximum allowed. A memory overflow is counted. | Memory Usage | Alarm is generated on the configured tag | Restart of the local Action.NET server |
| The number of Action.NET server memory overflows exceeds the maximum allowed | Memory Overflow | Alarm is generated on the configured tag | Restart of the local Action.NET server |
| There is no communication between Watchdog and the Action.NET server | No Communication | None | Restart of the local Action.NET server |
| The local Action.NET server is not running | Offline | None | Restart of the local Action.NET server |
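The Action.NET issues above follow a two-stage pattern: an alarm is raised once Alarm Timeout is exceeded, and a restart is triggered once Timeout (or, for memory overflows, Max) is exceeded. The Python sketch below illustrates that pattern under stated assumptions. The names evaluate_issue and OverflowCounter are hypothetical, and both the strict "greater than" comparison for the timeouts and the "at least" comparison for the overflow alarm threshold are choices made for the example, not documented Watchdog behavior.

```python
def evaluate_issue(abnormal_for_s, alarm_timeout_s, restart_timeout_s):
    """Return (alarm_tag_value, restart_needed) for a timed issue.

    abnormal_for_s: how long the issue has persisted, in seconds.
    The Alarm Tag receives the Alarm Timeout value once that period is
    exceeded, otherwise 0; a restart is due once Timeout is exceeded.
    """
    alarm_value = alarm_timeout_s if abnormal_for_s > alarm_timeout_s else 0
    restart_needed = abnormal_for_s > restart_timeout_s
    return alarm_value, restart_needed


class OverflowCounter:
    """Hypothetical cumulative memory-overflow counter."""

    def __init__(self, alarm_count, max_count):
        self.alarm_count = alarm_count  # the "Alarm" setting
        self.max_count = max_count      # the "Max" setting
        self.count = 0

    def record_overflow(self):
        """Count one overflow and return (alarm_tag_value, restart_needed)."""
        self.count += 1
        alarm_value = self.count if self.count >= self.alarm_count else 0
        restart_needed = self.count > self.max_count
        if restart_needed:
            # The count is reset only when Watchdog restarts the server.
            self.count = 0
        return alarm_value, restart_needed


print(evaluate_issue(90, alarm_timeout_s=60, restart_timeout_s=300))
counter = OverflowCounter(alarm_count=2, max_count=3)
overflow_results = [counter.record_overflow() for _ in range(4)]
print(overflow_results)
```

Keeping the alarm threshold lower than the restart threshold gives the operator a warning window before Watchdog takes corrective action.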

 

 
