11. Monitoring System
11.1. Overview
The Monitoring system is used to monitor various components of DIRAC. Currently, we have three monitoring types:
- WMSHistory: for monitoring the DIRAC WMS.
- Component Monitoring: for monitoring DIRAC components such as services, agents, etc.
- RMS Monitoring: for monitoring the DIRAC RequestManagement System (mostly the Request Executing Agent).
It is based on Elasticsearch, a distributed search and analytics NoSQL database. To use it, you have to install the Monitoring service and connect it to an Elasticsearch instance.
11.2. Install Elasticsearch
This is not covered here, as the installation and administration of ES are not part of the DIRAC guide. A note on the supported ES versions: ES7 and ES6 are supported, support for ES5 is not assured, and support for ES6 will be dropped in a future release.
11.3. Configure the MonitoringSystem
You can run your Elasticsearch cluster without authentication, or with a user name and password. You have to add the following parameters:
- User
- Password
- Host
- Port
The User and Password must be added to the local cfg file, while the others can be added to the CS using the Configuration web application. Handle the ES secret information in the same way as for the supported SQL databases, e.g. MySQL.
For example:
Systems
{
  NoSQLDatabases
  {
    User = test
    Password = password
  }
}
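The remaining connection parameters, Host and Port, can be kept in the CS. As a rough sketch, and assuming they sit in the same NoSQLDatabases section as the credentials (this page does not state their exact location), such an entry could look like the following. The host name is a placeholder, and 9200 is the usual Elasticsearch HTTP port:

```cfg
Systems
{
  NoSQLDatabases
  {
    # Placeholder values; the location of these options is an assumption
    Host = elastic.example.org
    Port = 9200
  }
}
```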
The following option can be set in Systems/Monitoring/<Setup>/Databases/MonitoringDB:
- IndexPrefix: prefix prepended to the names of the indices created in the ES instance. If this option is not present in the CS, the indices are prefixed with the setup name.
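As an illustration, the option could be set as sketched here. The section path is the one given in the text, with <Setup> left as a placeholder; the value mydirac is a made-up example:

```cfg
Systems
{
  Monitoring
  {
    <Setup>
    {
      Databases
      {
        MonitoringDB
        {
          # Hypothetical prefix, used instead of the setup name
          IndexPrefix = mydirac
        }
      }
    }
  }
}
```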
For each managed monitoring type, the Period (how often a new index is created) can be defined with:
MonitoringTypes
{
  ComponentMonitoring
  {
    # Indexing strategy. Possible values: day, week, month, year, null
    Period = month
  }
  RMSMonitoring
  {
    # Indexing strategy. Possible values: day, week, month, year, null
    Period = month
  }
  WMSHistory
  {
    # Indexing strategy. Possible values: day, week, month, year, null
    Period = day
  }
}
The periods given above are also the defaults in the code.
11.4. Enable WMSHistory monitoring
You have to add Monitoring to the Backends option of WorkloadManagement/StatesAccountingAgent.
If you do so, this agent will collect information using the JobDB and send it to the Elasticsearch database.
This same agent can also report to the MySQL backend of the Accounting system (which is in fact the default).
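A possible configuration is sketched here. It assumes that the agent section follows the same Systems/<System>/<instance>/Agents layout used for the RequestExecutingAgent elsewhere in this guide, that Backends accepts a comma-separated list, and that the default MySQL backend is named Accounting. Check these assumptions against your installation:

```cfg
Systems
{
  WorkloadManagement
  {
    <instance>
    {
      Agents
      {
        StatesAccountingAgent
        {
          # Report to the default backend (assumed name) and to Elasticsearch
          Backends = Accounting, Monitoring
        }
      }
    }
  }
}
```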
Optionally, you can use an MQ system (like RabbitMQ) for failover, even though the agent already has a simple failover mechanism. You can configure the MQ in the local dirac.cfg file where the agent is running:
Resources
{
  MQServices
  {
    hostname.some.where
    {
      MQType = Stomp
      Port = 61613
      User = monitoring
      Password = seecret
      Queues
      {
        WMSHistory
        {
          Acknowledgement = True
        }
      }
    }
  }
}
11.5. Enable Component monitoring
You have to set EnableActivityMonitoring=True in the CS. This can be done globally, in the Operations section, or for a single component.
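The two placements might look like the sketch that follows. The Defaults subsection of Operations and the per-component path (shown for an agent, with <System>, <instance> and <AgentName> as placeholders) are assumptions about the usual CS layout, not something this page specifies. Only one of the two is needed:

```cfg
# Global: applies to all components
Operations
{
  Defaults
  {
    EnableActivityMonitoring = True
  }
}

# Per component: enable for a single agent only
Systems
{
  <System>
  {
    <instance>
    {
      Agents
      {
        <AgentName>
        {
          EnableActivityMonitoring = True
        }
      }
    }
  }
}
```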
11.6. Enable RMS Monitoring
To enable RMS monitoring, set the EnableRMSMonitoring flag to yes/true in the CS:
Systems
{
  RequestManagement
  {
    <instance>
    {
      Agents
      {
        RequestExecutingAgent
        {
          ...
          EnableRMSMonitoring = True
        }
      }
    }
  }
}
11.7. Accessing the Monitoring information
After you have installed and configured the Monitoring system, you can use the Monitoring web application.