A healthy infrastructure depends on good monitoring: the team needs tooling that speeds up the detection and verification of problems across prevention, maintenance, and correction. Librato was created to help monitoring teams keep their infrastructure under control, with particular emphasis on Linux monitoring.
It is critical that a technology team be prepared for any situation that occurs in its environment. The purpose of monitoring is to be aware of changes in the environment as they happen so that problems can be acted on immediately. With a good monitoring history and careful interpretation, you can propose improvements based on what the monitoring charts show. For example, if a server shows high memory usage over a sustained period, you can purchase more memory or investigate the cause of the abnormal behavior before the environment becomes unavailable.
Monitoring metrics can serve various purposes, such as tracking application availability for a given number of users, tracking tool deployments, observing operating system update behavior, and supporting purchase requests and hardware replacements or upgrades. Which ones matter depends on the purpose of your deployment.
Linux servers have historically been difficult to monitor because many tools on the market target other platforms. In addition, some information technology professionals struggle to get monitoring working properly on these servers, so when a disaster occurs, it is difficult to identify what happened.
Constant monitoring of servers and services used in production is critical for company environments. Server failures in virtualization, backup, firewalls, and proxies can directly impact availability and quality of service.
The Linux operating system has basic built-in monitoring tools that experienced administrators can use; however, a monitoring team needs real-time reports so that it can act immediately. You cannot count on an experienced system administrator always being available to log in to the servers, or assume that he or she can cover every monitoring task.
In the current job market, it is important to remember that Linux specialists are rare, and their availability is limited. There are cases where an expert administrator can only get to a server after a problem has persisted for a long time. Training teams to become Linux experts is expensive and time-consuming, with potentially low returns.
Metrics used for monitoring
1. CPU - The CPU is one of the most important components of a computer system, so it is crucial to monitor it for high utilization and temperature. A CPU can have multiple cores, and an application may be bound to only one of them, so a single overloaded core can point to dangerous behavior that the overall average hides.
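2. Load - The load average indicates how busy the system is: it is the average number of processes running or waiting to run (on Linux, this also counts processes in uninterruptible IO wait) over the last 1, 5, and 15 minutes. A load that stays above the number of CPU cores means that work is queuing.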
3. Disk Capacity and IO - Disk capacity is of the utmost importance, especially for image servers, file servers, and VM hosts, because a full disk can bring the system down, corrupt the operating system, or cause extreme IO slowness. Disk monitoring also makes it possible to plan the replacement or addition of a disk, and to watch a disk that shows signs of hardware failure.
4. Network - For servers such as DNS, DHCP, firewall, file, and proxy servers, it is extremely important to monitor network performance in terms of inbound and outbound data packets. With network performance logs, you can measure the utilization of the network card and plan capacity to suit how the application uses the network.
5. Memory - Memory monitoring takes priority over other components because memory exhaustion, or a single application consuming a disproportionate share of memory, can bring a system to an immediate stop.
6. Swap - Swap is virtual memory that the system allocates on disk and uses when physical memory runs short. It needs to be monitored because high swap utilization can indicate that the server does not have enough memory.
With this set of information taken from Linux systems, it is possible to have good monitoring and a team that can act immediately on downtime that could paralyze critical systems.
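To make the metrics above concrete, here is a minimal Python sketch of how load, memory, swap, and disk capacity might be read on a Linux host using only the standard library. It is an illustration, not how the Librato agent collects data: the function names (parse_meminfo, used_percent, memory_and_swap_percent, load_per_core, collect_snapshot) are hypothetical, and it assumes a kernel that exposes /proc/meminfo.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_percent(total, free):
    """Return the used share of a resource as a percentage (0.0 if total is 0)."""
    if total <= 0:
        return 0.0
    return 100.0 * (total - free) / total


def memory_and_swap_percent(meminfo):
    """Return (memory used %, swap used %) from a parsed meminfo dict."""
    # MemAvailable exists on modern kernels; fall back to MemFree if it is absent.
    mem_free = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    mem = used_percent(meminfo.get("MemTotal", 0), mem_free)
    swap = used_percent(meminfo.get("SwapTotal", 0), meminfo.get("SwapFree", 0))
    return mem, swap


def load_per_core(load_1min, cores):
    """Normalize the 1-minute load average by core count; above 1.0 means queuing."""
    return load_1min / max(cores, 1)


def collect_snapshot(meminfo_path="/proc/meminfo", disk_path="/"):
    """Collect one snapshot of the metrics on a Linux host (paths are assumptions)."""
    with open(meminfo_path) as handle:
        meminfo = parse_meminfo(handle.read())
    mem, swap = memory_and_swap_percent(meminfo)
    load_1min, _, _ = os.getloadavg()
    disk = shutil.disk_usage(disk_path)
    return {
        "load_per_core": load_per_core(load_1min, os.cpu_count() or 1),
        "memory_used_percent": mem,
        "swap_used_percent": swap,
        "disk_used_percent": used_percent(disk.total, disk.free),
    }


if __name__ == "__main__":
    for name, value in collect_snapshot().items():
        print(f"{name}: {value:.2f}")
```

Keeping the parsing separate from the file access means the calculations can be checked against sample text without a live server. A monitoring agent does this kind of collection continuously and ships the values to a central service, which is what makes history and alerting possible.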
Monitoring with Librato
Librato is a real-time, web-based monitoring tool. It lets you set up a real-time monitoring environment, create e-mail alerts, and work with thresholds and monitoring history.
Among other options, you can create monitoring levels with profiles for the equipment being monitored, so that staff who simply watch the dashboards can call in a specialist or open a ticket for immediate action when needed.
This tool can also be an ally of an ITIL/COBIT team, which can use the reports to justify scheduled and unscheduled outages, and to identify systems with a history of problems. It can also be used to justify the purchase of new equipment, software upgrades, or the migration of a system that no longer meets the needs of a company.
Librato can be installed in major Linux distributions such as RedHat, CentOS, Ubuntu, Debian, Fedora, and Amazon Linux. Its deployment is easy, fast, and practical. You can follow a step-by-step guide on the Integration page, where there are Easy Install and Advanced options for users. After installation, you can start configuring the dashboard for monitoring on the server.
Space - You can create spaces to represent the location you want to monitor, for example, Datacenter01, SRV_Corp2017, SRV_FW2B4, etc. You can choose the type of monitoring display (Line, Stacked, or Big Number). Then you can choose what you want to monitor, such as CPU Percent, Swap, or Load. In addition, within the dashboard, you can select how long you want to monitor a group of equipment or set it to be monitored indefinitely.
Metrics - You can select existing metrics to create new composite metrics according to what you want to be monitored in the operating system.
Alert - This is where you create alerts for the operating system, setting the conditions that trigger an alert and the time that must pass before a new alert is issued.
Integration - This is where you select the Linux distributions that will be part of your monitoring and find the installation steps for each one.
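The alert behavior described above, a condition that triggers the alert plus a waiting time before a new alert is issued, can be sketched in Python as follows. This is a generic illustration under stated assumptions, not Librato's API or its actual evaluation logic: the ThresholdAlert class and its threshold, duration_s, and rearm_s parameters are hypothetical names chosen for the example.

```python
class ThresholdAlert:
    """Fire when a metric stays above a threshold; repeat only after a rearm interval."""

    def __init__(self, threshold, duration_s, rearm_s):
        self.threshold = threshold    # value the metric must exceed
        self.duration_s = duration_s  # how long the breach must last before firing
        self.rearm_s = rearm_s        # minimum time between two notifications
        self._breach_started = None
        self._last_fired = None

    def observe(self, timestamp, value):
        """Record one sample; return True if a notification should be sent now."""
        if value <= self.threshold:
            # Condition cleared: forget the breach and any previous notification.
            self._breach_started = None
            self._last_fired = None
            return False
        if self._breach_started is None:
            self._breach_started = timestamp
        if timestamp - self._breach_started < self.duration_s:
            return False  # breach has not lasted long enough yet
        if self._last_fired is not None and timestamp - self._last_fired < self.rearm_s:
            return False  # already notified recently
        self._last_fired = timestamp
        return True


if __name__ == "__main__":
    # Hypothetical rule: swap above 90% for 2 minutes, re-notify every 10 minutes.
    alert = ThresholdAlert(threshold=90.0, duration_s=120, rearm_s=600)
    for ts, swap_percent in [(0, 95.0), (60, 96.0), (120, 97.0), (180, 98.0)]:
        print(ts, alert.observe(ts, swap_percent))
```

Requiring the breach to last for a minimum duration filters out short spikes, and the rearm interval keeps a persistent problem from flooding the team's inbox while still reminding them that it is unresolved.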
Librato is a great monitoring tool when it comes to Linux operating systems, making it easier to implement and monitor equipment. Librato has a range of ready-made tools, customizable monitoring panels, and reports that are critical for investigating chronic infrastructure problems.