Monitoring and Healthcheck
EJBCA contains a health check service that can be used for health monitoring remotely. This should be considered essential if running EJBCA in a clustered installation, as it can be used to determine whether a node is able to remain in the cluster or needs to be taken out.
The purpose of a Healthcheck is to report when something is not as expected, so that a monitoring system can raise alarms and cluster nodes can be taken offline.
A common source of misunderstanding is that taking a CA offline (CA Service State) is expected and does not result in Healthcheck warnings. If you have deactivated a CA, you have done so deliberately, so everything is as it should be and the Healthcheck should not warn. In contrast, if the CA is activated but its Crypto Token goes offline, the CA is expected to be online but cannot be, and the Healthcheck should therefore warn.
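The rule described above can be modelled as a small decision function. This is only an illustrative sketch in Python, not EJBCA's implementation; the `CaState` class, its field names, and `should_warn` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class CaState:
    """Hypothetical model of the state the health check looks at for one CA."""
    name: str
    service_active: bool    # CA Service State: True if the CA is activated
    token_online: bool      # True if the CA's Crypto Token is online
    monitored: bool = True  # True if the CA is included in the health check


def should_warn(ca: CaState) -> bool:
    """Warn only when a monitored CA is meant to be online but its token is not."""
    return ca.monitored and ca.service_active and not ca.token_online
```

Under this model a deliberately deactivated CA never triggers a warning, regardless of its token state, while an activated CA with an offline token does.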
Note that a configuration as in the following example will thus not result in Healthcheck warnings.
The servlet is located in the URL:
The CAs checked by the health check service can be configured in the Admin Web on the CA Activation page as well as on the Edit CA page.
Common Criteria Compliance
To be fully Common Criteria compliant, a different key should be used for signature tests than for certificate signing in the CA's HSM token configuration (the "testKey" alias should point to a key with no other uses).
The following configuration parameters may be set to configure authorization and what the service checks:
|healthcheck.amountfreemem||1||The amount of memory that must be free on the server, in megabytes.|
|healthcheck.dbquery||select 1||Parameter indicating the string that should be used to do a minimal check that the database is working.|
|healthcheck.authorizedips||127.0.0.1||Specifies which remote IPs may call this healthcheck servlet. Multiple IPs may be separated by a semicolon.|
|healthcheck.catokensigntest||false||Set to true to perform a test signature on each CA token during the check. Otherwise just checks that the token status is active.|
|healthcheck.publisherconnections||false||Set to true to perform a health test on all active publisher connections.|
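As an illustration of the semicolon-separated `healthcheck.authorizedips` format, the following Python sketch shows how such a value could be split and matched against a caller's address. It is not EJBCA code; `parse_authorized_ips` and `is_authorized` are hypothetical helpers, and exact string matching of addresses is an assumption.

```python
def parse_authorized_ips(value: str) -> set[str]:
    """Split a semicolon-separated IP list, ignoring blanks and surrounding spaces."""
    return {ip.strip() for ip in value.split(";") if ip.strip()}


def is_authorized(remote_ip: str, value: str = "127.0.0.1") -> bool:
    """Return True if remote_ip appears in the configured list (exact match assumed)."""
    return remote_ip in parse_authorized_ips(value)
```

With the default value only `127.0.0.1` is accepted, so a remote monitoring host has to be added to the list before it can poll the servlet.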
Maintenance File Properties
|healthcheck.maintenancefile||Location of file containing information about maintenance.|
|healthcheck.maintenancepropertyname||DOWN_FOR_MAINTENANCE||The key of the property in the maintenance file. The property should be in the following format: |
The following parameters configure what message or HTTP error code the health service returns.
|healthcheck.okmessage||ALLOK||Text string used to say that everything is ok with this node. Any defined property value can be used here by inserting it as a property, e.g.: |
|healthcheck.sendservererror||true||Set to true if the HTTP error code 500 should be sent in case of error.|
|healthcheck.customerrormessage||null||Allows for a custom error message to be configured.|
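A monitoring client can combine the HTTP status code and the response body to decide whether a node is healthy. The Python sketch below assumes the default `healthcheck.okmessage` of `ALLOK` and assumes that the body of a healthy response is exactly that message; the health check URL is passed in by the caller, and `is_healthy` and `check_node` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def is_healthy(status_code: int, body: str, ok_message: str = "ALLOK") -> bool:
    """Healthy only if the status is 200 and the body equals the OK message."""
    return status_code == 200 and body.strip() == ok_message


def check_node(url: str, timeout: float = 5.0) -> bool:
    """Poll the health check URL once and report whether the node looks healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # Raised for error statuses, e.g. HTTP 500 when healthcheck.sendservererror is true.
        return is_healthy(err.code, err.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: treat the node as unhealthy.
        return False
```

Checking the body as well as the status code keeps the client correct if `healthcheck.sendservererror` is set to false, in which case an error may not be signalled through the status code.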
If an error is detected, one or more of the following error messages are reported. By default, all errors are sent with a response code of 500.
|MEM: Error Virtual Memory is about to run out, currently free memory : number||The JVM is about to run out of memory|
|DB: Error creating connection to database||The JDBC connection to the database failed. This might occur if the database crashes or the network is down.|
|CA: Error CA Token is disconnected: CAName||This is a sign of hardware problems with one or several of the hard CA tokens in the node.|
|MAINT: DOWN_FOR_MAINTENANCE||This is reported when the healthcheck.maintenancefile is used and the node is set to be offline.|
|Error when testing the connection with publisher: PublisherName||This is reported when a test connection to one of the publishers failed.|
|Could not perform a test signature on the audit log.||Reported when signing of the audit log failed (if database protection is enabled).|
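The error messages listed above can be mapped to coarse categories so that a monitoring system can route alarms. The following Python sketch is a heuristic, not part of EJBCA: the category names and `classify_errors` are hypothetical, and because the separator between multiple messages is not specified here, it relies on simple substring matching.

```python
# Marker text taken from the error messages listed above, paired with
# hypothetical category names chosen for this example.
KNOWN_ERRORS = [
    ("MEM:", "memory"),
    ("DB:", "database"),
    ("CA:", "ca_token"),
    ("MAINT:", "maintenance"),
    ("Error when testing the connection with publisher:", "publisher"),
    ("Could not perform a test signature on the audit log", "audit_log"),
]


def classify_errors(body: str) -> list[str]:
    """Return the categories whose marker text occurs in the response body."""
    found = []
    for marker, category in KNOWN_ERRORS:
        if marker in body and category not in found:
            found.append(category)
    return found
```

Substring matching can misfire if a CA or publisher name happens to contain one of the markers, so the result is best treated as a hint for alarm routing rather than a definitive diagnosis.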