Watchdog (EN)
Watchdog is an optional Action.NET tool. It monitors the "health" of the computer and of the Action.NET server. It also has a contingency plan for taking action when problems are detected.
Architecture
Watchdog is designed to run on every server and can be used in an architecture with one or more servers. Each server runs one instance of Watchdog, which communicates with the running Action.NET server and collects information from both the server and the local computer. In a hot standby architecture, the Watchdogs on both servers exchange information with each other over TCP port 65001. This communication runs over TCP/IP and consists of request and response messages in JSON format. Figure 1 shows this architecture.
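The request and response exchange described above can be illustrated with a minimal sketch. The text does not document the message fields or the framing, so everything here is an assumption: the "type", "echo" and "status" fields are hypothetical, and messages are assumed to be one JSON object per line. The demo runs on the loopback interface with an ephemeral port rather than port 65001, so that it is self-contained.

```python
import json
import socket
import threading

WATCHDOG_PORT = 65001  # port named in the text; the loopback demo below does not use it


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read one JSON line, reply with one JSON line."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = json.loads(reader.readline())
        # Hypothetical response fields, for illustration only.
        response = {"type": "response", "echo": request["type"], "status": "online"}
        conn.sendall((json.dumps(response) + "\n").encode("utf-8"))


def send_request(host: str, port: int, request: dict, timeout: float = 5.0) -> dict:
    """Send one JSON request and return the decoded JSON response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((json.dumps(request) + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


def demo() -> dict:
    """Run one loopback exchange on an ephemeral port chosen by the OS."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return send_request("127.0.0.1", port, {"type": "status"})
    finally:
        worker.join()
        listener.close()
```

Calling demo() performs one round trip and returns the decoded response dictionary. The real Watchdog protocol is defined by Watchdog.Protocol.dll and may differ in every detail shown here.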
Configuration
The application is located in the installation directory of the Action.NET version. For example, for Action.NET version 9.1, you can find it under "C:\Program Files (x86)\SPIN\Action.Net\an-9.1\".
Once you choose to use the tool, Action.NET must be started and stopped through it. Creating Windows shortcuts is up to the user. Before use, the following settings are required in the project and in Watchdog.
Action.NET Project
The project needs a script that is responsible for exchanging information between Action.NET and Watchdog. The script must be executed by a Task at a fixed period. The step-by-step project configuration for version 9.1 is as follows:
Go to the Run > Build > References menu.
Add a reference to the Newtonsoft.Json DLL with the parameters:
Domain: Server
Path: C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Newtonsoft.Json.dll
Add a reference to the Watchdog.Protocol DLL with the parameters:
Domain: Server
Path: C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Watchdog.Protocol.dll
Access the Edit > Script > Class menu.
Create a class with the parameters:
Name: Watchdog
Code: CSharp
Domain: Server
Script: Copy and paste the script code found in:
C:\Program Files (x86)\SPIN\Action.NET\an-9.1\Action.NET Watchdog - Script.txt
Access the Edit > Script > Tasks menu.
Create a task with the parameters:
Name: Watchdog
Code: CSharp
Period: 00:00:05
Domain: Server
Script: @Script.Class.Watchdog.Handler();
Watchdog
Watchdog has a configuration screen (see Figure 2). On it, the user must set the parameters according to the needs of the project and the computer hardware.
To save configuration changes, click the Save button in the lower right corner of the screen.
Alarm Tags must be created in the project and associated with an alarm before they are configured here. Action.NET then signals to the operator, on the Alarms or Events screen, that there is a problem with the server. Normalized alarms receive the value 0; when there is a problem, they receive a value greater than 0 (detailed in the following table).
Watchdog
Group | Parameter | Description
Reboot Process | Shutdown Command Timeout | Maximum waiting period in seconds for the server shutdown command to finish executing.
Reboot Process | Kill Timeout | Maximum waiting period in seconds for the kill command to finish executing on the Action.NET processes.
Reboot Process | Start Timeout | Maximum waiting period in seconds for Action.NET to start.
Tools | Auto Memory Cleanup | Enables automatic RAM cleanup.
Tools | Log Computer Data | Enables logging of data collected from the computer every X minutes. Logs are saved in: C:\Action.NET\Projects\Logs\Computer-DD-MM-YYYY.json
Tools | Log Protocol Packages | Shows in the Log window the packets sent and received per protocol.
Computer
Group | Parameter | Description
CPU Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the processor usage value, for alert generation.
CPU Usage | Max | Maximum percentage of the computer's CPU usage.
CPU Usage | Timeout | Maximum waiting period in seconds for CPU usage to fall back below the maximum. If this period is exceeded, the Alarm Tag receives the processor usage value; otherwise it receives 0.
Disk Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the hard disk usage value, for alert generation.
Disk Usage | Max | Maximum percentage of the computer's disk usage.
Disk Usage | Timeout | Maximum waiting period in seconds for disk usage to fall back below the maximum. If this period is exceeded, the Alarm Tag receives the hard disk usage value; otherwise it receives 0.
Memory Usage | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or the RAM usage value, for alert generation.
Memory Usage | Max | Maximum percentage of the computer's memory usage.
Memory Usage | Timeout | Maximum waiting period in seconds for memory usage to fall back below the maximum. If this period is exceeded, the Alarm Tag receives the RAM usage value; otherwise it receives 0.
Network Offline | Alarm Tag | Name of the digital alarm tag registered in Action.NET, which receives either 0 or 1, for alert generation.
Network Offline | Timeout | Maximum waiting period in seconds for the network between the servers to come back online. If this period is exceeded, the Alarm Tag receives 1; otherwise it receives 0.
Action.NET
Group | Parameter | Description
Server | Root Path | Action.NET installation directory.
Server | Version Path | Installation directory of the Action.NET version.
Server | Retentive File | Path of the retentive file to be deleted. Leave blank if the retentive file does not need to be deleted. This file stores the last values of the Tags that have this functionality enabled.
Server | Startup Arguments | Startup arguments of the Action.NET server.
Primary | Local | Indicates that this is the primary Action.NET server. Select this option if this computer is the primary server.
Primary | Hostname | Computer name of the primary Action.NET server.
Primary | Communication Timeout | Maximum waiting period in seconds for communication between Watchdog and the primary Action.NET server.
Secondary | Local | Indicates that this is the secondary Action.NET server. Select this option if this computer is the secondary server.
Secondary | Hostname | Computer name of the secondary Action.NET server. If there is none, leave it blank.
Secondary | Communication Timeout | Maximum waiting period in seconds for communication between Watchdog and the secondary Action.NET server. If there is none, leave it at 0.
Issues
Group | Parameter | Description
Core Modules Offline | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value set in Alarm Timeout. This lets the user register a customized message indicating how long Watchdog waited before the alarm was generated.
Core Modules Offline | Alarm Timeout | Maximum waiting period in seconds for the number of online core modules to return to normal. If this period is exceeded, the Alarm Tag receives the value of this period; otherwise it receives 0.
Core Modules Offline | Timeout | Maximum waiting period in seconds for the number of online core modules to return to normal. If this period expires, the local Action.NET server is restarted.
Hot Standby Communication | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value set in Alarm Timeout. This lets the user register a customized message indicating how long Watchdog waited before the alarm was generated.
Hot Standby Communication | Alarm Timeout | Maximum waiting period in seconds for the hot standby architecture to return to normal operation, which consists of one active server and the other on standby. If this period is exceeded, the Alarm Tag receives the value of this period; otherwise it receives 0.
Hot Standby Communication | Timeout | Maximum waiting period in seconds for the hot standby architecture to return to normal operation, which consists of one active server and the other on standby. If this period is exceeded, both the local and the remote Action.NET servers are restarted.
Memory Overflow | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value set in Alarm. This lets the user register a customized message indicating how many memory overflows occurred before the alarm was generated.
Memory Overflow | Alarm | Number of memory overflows required for the alarm to be generated. The Alarm Tag receives the number of overflows. The count is cumulative and is reset only when Watchdog restarts the local Action.NET server.
Memory Overflow | Max | Maximum number of memory overflows. If this value is exceeded, the local Action.NET server is restarted.
Memory Usage | Alarm Tag | Name of the integer alarm tag registered in Action.NET, which receives the value set in Alarm. This lets the user register a customized message indicating how many megabytes of memory the local Action.NET server had consumed before the alarm was generated.
Memory Usage | Alarm | Memory consumption of the local Action.NET server, in megabytes, at which the alarm is generated. The Alarm Tag receives the memory consumption in megabytes.
Memory Usage | Max | Maximum memory consumption, in megabytes, of the local Action.NET server.
Memory Usage | Timeout | Maximum waiting period in seconds for memory consumption to fall back below the maximum. If this period expires, the local Action.NET server is restarted.
No Communication | Timeout | Maximum waiting period in seconds for the local Action.NET server to communicate with Watchdog again. If this period expires, the local Action.NET server is restarted.
Offline | Timeout | Maximum waiting period in seconds for the local Action.NET server to come online. If this period expires, the local Action.NET server is restarted.
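The Max and Timeout parameters in the Computer group of the table above combine into a simple rule: the Alarm Tag stays at 0 until usage has remained above Max for longer than Timeout, and then receives the usage value. A minimal sketch of that rule follows. The class and method names are hypothetical, and the choice of strict comparisons ("above Max", "longer than Timeout") is an assumption, since the table does not say how the boundary cases are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageAlarm:
    """Tracks one usage metric against a Max percentage and a Timeout in seconds."""

    max_percent: float
    timeout_s: float
    _over_since: Optional[float] = None  # time at which usage first went above Max

    def update(self, usage_percent: float, now_s: float) -> float:
        """Return the value to write to the Alarm Tag for this sample."""
        if usage_percent <= self.max_percent:
            self._over_since = None  # back to normal: the alarm is normalized
            return 0.0
        if self._over_since is None:
            self._over_since = now_s  # start timing the excursion
        if now_s - self._over_since > self.timeout_s:
            return usage_percent  # stayed above Max for longer than Timeout
        return 0.0
```

For example, with max_percent=90.0 and timeout_s=30.0, samples of 95.0, 96.0 and 97.0 taken at 0, 20 and 40 seconds yield tag values of 0.0, 0.0 and 97.0. A later sample of 50.0 returns the tag to 0.0.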
Features
Monitoring Screen
Watchdog has a monitoring screen (see Figure 3). It provides a real-time log and key information collected from the computer and from the primary and secondary Action.NET servers.
Computer
OS – Operating system with version and build.
Uptime – Time in operation.
Disk – Hard drive capacity and usage.
Memory – RAM memory capacity and usage.
CPU – Number of physical/logical cores and processor usage.
Action.NET – Information about the running Action.NET server.
Status (Online or Offline)
Uptime
Memory usage and the maximum limit set in the configuration.
Number of memory overflows and the maximum limit set in the configuration.
.NET Frameworks – List of installed versions of the Microsoft .NET Framework.
Primary / Secondary Servers
IPv4 / IPv6 Addresses: List of IP addresses found for communication with the server.
Version: Version of Action.NET running on the server.
Alarm Module - Status (Online or Offline) of the Action.NET alarm module on the server.
Historian Module - Status (Online or Offline) of the Action.NET historian module on the server.
Device Module - Status (Online or Offline) of the Action.NET device module on the server.
Log - Shows real-time log messages colored by type.
Debug – Debugging messages.
Information – Informational messages.
Warning – Warnings.
Error – Errors.
Watchdog Uptime – Watchdog operating time.
Start – Button to start Action.NET manually.
Shutdown – Button to shut down Action.NET and Watchdog manually.
Icon menu in the system tray
Watchdog has an icon in the system tray, which is located in the lower right corner of the computer's desktop. Double-clicking the icon with the left mouse button shows or hides the Watchdog window. Right-clicking it opens the menu below.
Hide / Show - Hides/restores the Watchdog window on Windows
Memory Cleanup - Manually cleans up the memory
Kill - Quits Watchdog manually
Log
Watchdog uses a logging system based on text files and Windows events. There are four types of messages: Debug, Info, Warning, and Error:
Debug - Lowest-level messages. Used when each process needs to be identified in more detail. Generally used by developers, testers and integrators;
Info - Important informational messages for monitoring the operation of the Communication Module;
Warning - Warning messages, which may be ignored or may need to be checked; and
Error - Error messages, which cannot be ignored and need immediate attention and correction for correct operation.
The configuration file (Action.NET Watchdog.exe.config) is in the same directory as the executable. It has a LOG section (see Figure 5). This section is configurable; for reference, see the Apache log4net documentation: Apache log4net: Config Examples.
The default configuration stores a maximum of 60 files in C:\Action.NET\Projects\Logs, each with a maximum size of 35 Megabytes. A new file is created each day, provided the current one has not exceeded the maximum size. If the size is exceeded once, the day's file is copied to a file ending in .1 and a new file is started. If the size is exceeded again on the same day, the current file replaces that copy and a new file is started. Otherwise, a new daily file cycle begins. Thus, each node will be able to store approximately 2152 Megabytes on disk.
Contingency plan
The table below describes the measures taken when a problem is detected. Every issue is evaluated against a maximum expected waiting time (timeout). Alarm generation and corrective action take place when the configured maximum time is exceeded. If the condition returns to normal, the alarms receive the value 0 and no corrective action is taken. These settings are configured on the Watchdog configuration screen.
Computer
Problem | Config | Alert | Corrective action
CPU usage above the configured maximum value | CPU Usage | Alarm is generated on the configured tag | None
Hard disk usage above the configured maximum value | Disk Usage | Alarm is generated on the configured tag | None
RAM usage above the configured maximum value | Memory Usage | Alarm is generated on the configured tag | None
Network between the primary and secondary servers is offline. There is no response to the PING command. | Network Offline | Alarm is generated on the configured tag | None
Action.NET
Problem | Config (Issue) | Alert | Corrective action
Number of online core modules differs from the number set after 30 minutes online | Core Modules Offline | Alarm is generated on the configured tag | Restart of the local Action.NET server
In a hot standby architecture, both servers are online and communicating, but they are not in normal operation (one active server and the other on standby) | Hot Standby Communication | Alarm is generated on the configured tag | Restart of Action.NET on both servers
The memory usage of the Action.NET server exceeds the maximum allowed. A memory overflow is counted. | Memory Usage | Alarm is generated on the configured tag | Restart of the local Action.NET server
The number of Action.NET server memory overflows exceeds the maximum allowed | Memory Overflow | Alarm is generated on the configured tag | Restart of the local Action.NET server
There is no communication between Watchdog and the Action.NET server | No Communication | None | Restart of the local Action.NET server
The local Action.NET server is not running | Offline | None | Restart of the local Action.NET server
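The Memory Overflow issue in the table above relies on a cumulative counter with two thresholds: the Alarm value, at which the tag starts receiving the overflow count, and the Max value, beyond which the local Action.NET server is restarted and the count is reset. A minimal sketch of that bookkeeping follows. The names are hypothetical, and two details are assumptions: that the alarm is raised once the count reaches the Alarm value, and that the restart happens once the count goes above Max.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class OverflowCounter:
    """Cumulative memory overflow counter with an alarm and a restart threshold."""

    alarm_at: int       # "Alarm": overflow count at which the alarm is raised
    max_overflows: int  # "Max": a restart is requested once the count goes above this
    count: int = 0

    def record_overflow(self) -> Tuple[int, bool]:
        """Count one overflow and return (alarm tag value, restart requested)."""
        self.count += 1
        tag_value = self.count if self.count >= self.alarm_at else 0
        restart = self.count > self.max_overflows
        if restart:
            self.count = 0  # the count is reset when Watchdog restarts the server
        return tag_value, restart
```

With alarm_at=2 and max_overflows=3, four consecutive overflows return (0, False), (2, False), (3, False) and (4, True), after which the counter is back at 0.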