HealthCheck Service Asadmin Command Reference
The following is a detailed list of the administration commands that can be used to correctly configure the HealthCheck Service.
set-healthcheck-configuration
- Usage
set-healthcheck-configuration --enabled=true|false --dynamic=true|false --historic-trace-enabled=true|false --historic-trace-store-size=20 --historic-trace-store-timeout=<integer.value>s|m|h|d --set-notifiers=<notifier.name> --enable-notifiers=<notifier.name> --disable-notifiers=<notifier.name>
- Aim
Enables or disables the HealthCheck service and configures the tracing of historic health check events for later inspection.
Command Options
Option | Type | Description | Default | Mandatory |
---|---|---|---|---|
--target | String | The instance or cluster that will enable or disable its service | server | no |
--dynamic | Boolean | Whether to apply the changes directly to the server without a restart | false | no |
--enabled | Boolean | Whether to enable or disable the service | N/A | yes |
--historic-trace-enabled | Boolean | Enables storing traces in a rolling store for later inspection | false | no |
--historic-trace-store-size | Integer | Sets the maximum number of health checks to store | 20 | no |
--historic-trace-store-timeout | String | Sets the time period after which a historic health check event entry is removed from the visible history. The time expression should consist of a number followed by a time unit: s (seconds), m (minutes), h (hours), or d (days) | - | no |
--set-notifiers | String | Sets the notifiers of the HealthCheck Service, replacing all notifiers that have already been set. A comma-separated list can be used to represent multiple notifiers. | - | no |
--enable-notifiers | String | Enables a notifier. A comma-separated list can be used to represent multiple notifiers. | - | no |
--disable-notifiers | String | Disables a notifier. A comma-separated list can be used to represent multiple notifiers. | - | no |
Enabling or disabling the HealthCheck service implicitly also enables or disables the log notifier, which is the default notifier.
You can find the list of available notifiers using the list-notifiers command.
Example
The following example enables the HealthCheck service so that it only becomes active the next time the server is restarted. It sets the log and JMS notifiers and configures the historical trace store to retain 20 health checks.
asadmin> set-healthcheck-configuration
--enabled=true
--dynamic=false
--historic-trace-enabled=true
--historic-trace-store-size=20
--set-notifiers=log-notifier,jms-notifier
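The example above does not set a trace timeout. The following is a minimal sketch of how a timeout expression could be checked before it is passed to --historic-trace-store-timeout. The helper names `is_valid_trace_timeout` and `build_trace_command` are hypothetical, the check assumes that only whole numbers followed by s, m, h, or d are accepted (as suggested by the usage line), and the command is only printed, not executed.

```shell
# Hypothetical helper: succeeds when the argument is a whole number followed
# by one of the units shown in the usage line (s, m, h, d).
is_valid_trace_timeout() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhd]$'
}

# Hypothetical helper: prints (does not run) an asadmin command line that
# enables historic tracing with the given store size ($1) and timeout ($2).
build_trace_command() {
    if ! is_valid_trace_timeout "$2"; then
        echo "invalid timeout expression: $2" >&2
        return 1
    fi
    echo "asadmin set-healthcheck-configuration --enabled=true --dynamic=true" \
         "--historic-trace-enabled=true --historic-trace-store-size=$1" \
         "--historic-trace-store-timeout=$2"
}

# Print a command that keeps at most 20 entries, each for 30 minutes.
build_trace_command 20 30m
```

Printing the command first makes it easy to review the options before running it against a real domain.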
list-healthcheck-services
- Usage
asadmin> list-healthcheck-services
- Aim
Lists the names of all available metric checker services.
Example
Running the command will show output similar to the example below:
Available Health Check Services:
Name                    Description
healthcheck-cpool       Provides ratio on connection usage for a given pool name with severity according to defined threshold values
healthcheck-mp          Checks that all instances are responding to Microprofile Healthcheck requests with an UP response
healthcheck-stuck       Provides thread name, id and stack trace for requests which reach over defined threshold values
healthcheck-cpu         Provides ratio on cpu usage time with severity according to defined threshold values
healthcheck-gc          Provides ratio on garbage collection count with severity according to defined threshold values
healthcheck-heap        Provides ratio on used heap memory with severity according to defined threshold values
healthcheck-threads     Lists hogging threads with their id when given thresholds exceed
healthcheck-machinemem  Provides ratio on used machine memory with severity according to defined threshold values
healthcheck-mpmetrics   Provides a way to monitor and log the values of metrics exposed by MicroProfile Metrics
Command list-healthcheck-services executed successfully.
set-healthcheck-service-configuration
- Usage
set-healthcheck-service-configuration --enabled=true|false --dynamic=true|false --service=<service.name> --checker-name=<string.value> --add-to-microprofile-health=true|false --time=<integer.value> --time-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS --threshold-critical=80 --threshold-warning=50 --threshold-good=0 --hogging-threads-threshold=<integer.value> --hogging-threads-retry-count=<integer.value> --stuck-threads-threshold=<integer.value> --stuck-threads-threshold-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS --add-metric=<metric.name> --delete-metric=<metric.name>
- Aim
Enables or disables the monitoring of a specific metric. The command also configures the frequency of monitoring for that metric, as well as metric-specific properties.
Command Options
Option | Type | Description | Default | Mandatory |
---|---|---|---|---|
--target | String | The instance or cluster that will enable or disable its metric configuration | server | no |
--dynamic | Boolean | Whether to apply the changes directly to the server/instance without a restart | false | no |
--enabled | Boolean | Whether to enable or disable the metric monitoring | N/A | yes |
--service | String | The service metric name, for example gc, cp, stuck-thread, or mp-metrics | - | yes |
--checker-name | String | A user-determined name for easy identification of the checker. It should be unique among the services you have configured, to avoid confusion in the notification messages. | Depends on the service checker | no |
--add-to-microprofile-health | Boolean | When enabled, the checker is added to MicroProfile Health and all health check results for the checker are displayed on the MicroProfile Health REST endpoints. | false | no |
--time | Integer | The number of time units between periodic checks of the metric | 5 | no |
--time-unit | TimeUnit | The time unit for the frequency of the metric monitoring. Must be one of DAYS, HOURS, MINUTES, SECONDS, or MILLISECONDS | MINUTES | no |
--threshold-critical | Integer | The threshold value that this metric must surpass to generate a CRITICAL event | 80 | no |
--threshold-warning | Integer | The threshold value that this metric must surpass to generate a WARNING event | 50 | no |
--threshold-good | Integer | The threshold value that this metric must surpass to generate a GOOD event | 0 | no |
--hogging-threads-threshold | Integer | The threshold value that this metric is compared to in order to mark threads as hogging the CPU. Only available for the hogging threads checker | 95 | no |
--hogging-threads-retry-count | Integer | The number of retries that the checker service will execute in order to identify a hogging thread. Only available for the hogging threads checker | 3 | no |
--stuck-threads-threshold | Integer | The threshold above which a thread is considered stuck. Must be 1 or greater. Only available for the stuck threads checker | - | no |
--stuck-threads-threshold-unit | TimeUnit | The unit for the threshold for when a thread should be considered stuck. Only available for the stuck threads checker | - | no |
--add-metric | String | Adds a metric exposed by MicroProfile Metrics to monitor. Takes a string such as metricName=base_thread_max_count | - | no |
--delete-metric | String | Removes a metric exposed by MicroProfile Metrics that has been added to monitor. Takes a string such as metricName=base_thread_max_count | - | no |
If this command is executed before the set-healthcheck-configuration command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.
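The ordering described in the note above can be sketched as a small script that enables the HealthCheck service first and only then configures a checker. The `ASADMIN` variable and the `enable_service_then_checker` function are hypothetical names introduced for illustration. By default the script only prints the commands; setting `ASADMIN=asadmin` would run them.

```shell
# Hypothetical variable: defaults to "echo asadmin" so that the commands are
# printed rather than executed. Set ASADMIN=asadmin to run them for real.
ASADMIN="${ASADMIN:-echo asadmin}"

# Enables the HealthCheck service first, then enables the checker named in $1.
# The second command only runs if the first one succeeds.
enable_service_then_checker() {
    $ASADMIN set-healthcheck-configuration --enabled=true --dynamic=true &&
    $ASADMIN set-healthcheck-service-configuration --enabled=true --dynamic=true --service="$1"
}

enable_service_then_checker gc
```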
Examples
The following basic example enables the GC checker and activates it without needing a restart:
asadmin> set-healthcheck-service-configuration
--enabled=true
--service=gc
--dynamic=true
Monitoring the health of JDBC connection pools is a common need. In that scenario, it is very unlikely that on-the-fly configuration changes would be made, so a very high CRITICAL threshold can be set. Likewise, a nonzero GOOD threshold is needed because an empty or unused connection pool may not be healthy either.
The following command would apply these settings to the connection pool checker:
asadmin> set-healthcheck-service-configuration
--enabled=true
--service=cp
--dynamic=true
--threshold-critical=95
--threshold-warning=70
--threshold-good=30
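To make the effect of the three thresholds more concrete, the following sketch classifies a measured usage percentage against them. It is not the checker's actual implementation: the function name `classify_usage` is hypothetical, "surpass" is assumed to mean strictly greater than, and a value at or below the GOOD threshold is assumed to produce no event (shown here as NONE).

```shell
# Hypothetical helper: classifies a measured percentage against thresholds.
# $1 = measured value, $2 = critical, $3 = warning, $4 = good threshold.
# Assumption: a threshold must be strictly exceeded to trigger its level.
classify_usage() {
    if [ "$1" -gt "$2" ]; then
        echo CRITICAL
    elif [ "$1" -gt "$3" ]; then
        echo WARNING
    elif [ "$1" -gt "$4" ]; then
        echo GOOD
    else
        echo NONE   # assumption: at or below the GOOD threshold, no event
    fi
}

# With the thresholds from the example above (95, 70, 30):
classify_usage 80 95 70 30   # prints WARNING
```

Under these assumptions, a pool at 10% usage would fall below the GOOD threshold of 30, which matches the remark that an empty or unused pool may not be healthy.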
Monitoring which threads hog the CPU is important, since hogging threads can lead to performance degradation, deadlocks, and severe bottlenecks in web applications. In some cases the defaults are all that is needed, but suppose that in a critical system you want to set the threshold percentage to 90% and have the health check service confirm the state of such threads with a retry count of 5. Additionally, you want this check to run every 20 seconds.
The following command would apply these settings to the hogging threads checker:
asadmin> set-healthcheck-service-configuration
--enabled=true
--service=hogging-threads
--dynamic=true
--hogging-threads-threshold=90
--hogging-threads-retry-count=5
--time=20
--time-unit=SECONDS
The following example configures the stuck threads checker to check every 30 seconds for any threads which have been stuck for more than 5 minutes and applies the configuration change without needing a restart:
asadmin> set-healthcheck-service-configuration
--service=stuck-thread
--enabled=true
--dynamic=true
--time=30
--time-unit=SECONDS
--stuck-threads-threshold=5
--stuck-threads-threshold-unit=MINUTES
The following example configures the MicroProfile Metrics checker to add the base_thread_max_count metric for monitoring, adds the checker to MicroProfile Health so that its result is displayed on the MicroProfile Health REST endpoints, and applies the configuration change without needing a restart:
asadmin> set-healthcheck-service-configuration
--service=mp-metrics
--enabled=true
--dynamic=true
--add-to-microprofile-health=true
--add-metric='metricName=base_thread_max_count'
get-healthcheck-configuration
- Usage
asadmin> get-healthcheck-configuration
- Aim
Lists the current configuration for the health check service, configured checkers and enabled notifiers.
Example
A sample output is as follows:
Health Check Service Configuration is enabled?: true
Historical Tracing Enabled?: true
Historical Tracing Store Size: 20

Name                  Notifier Enabled
log-notifier          true
jms-notifier          false
cdieventbus-notifier  false
eventbus-notifier     false

Below are the list of configuration details of each checker listed by its name.

Name  Enabled  Time  Unit     Add to MicroProfile Health  Critical Threshold  Warning Threshold  Good Threshold
CPUC  true     5     MINUTES  true                        80                  50                 0
HEAP  true     5     MINUTES  false                       80                  50                 0

Name   Enabled  Time  Unit     Add to MicroProfile Health  Threshold Time  Threshold Unit
STUCK  true     5     MINUTES  false                       5               MINUTES

Name  Enabled  Time  Unit     Add to MicroProfile Health
MPM   true     5     MINUTES  false

Monitored Metric Name  Description
base_thread_max_count  Displays the peak live thread count since the Java virtual machine started or peak was reset. This includes daemon and non-daemon threads.
base_gc_total_total    Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.

Command get-healthcheck-configuration executed successfully.
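When this output is consumed by a script, the notifier rows can be filtered with standard text tools. The following sketch prints the names of the enabled notifiers. The function name `enabled_notifiers` is hypothetical, and the filter assumes that each notifier appears on its own line as a name ending in -notifier followed by true or false, which may differ between versions.

```shell
# Hypothetical helper: reads get-healthcheck-configuration output on stdin and
# prints the name of every notifier whose enabled column is "true".
enabled_notifiers() {
    awk '$1 ~ /-notifier$/ && $2 == "true" { print $1 }'
}

# Sample input based on the notifier rows of the sample output.
enabled_notifiers <<'EOF'
Name                  Notifier Enabled
log-notifier          true
jms-notifier          false
cdieventbus-notifier  false
eventbus-notifier     false
EOF
```

Against a running server, the same filter could be applied as `asadmin get-healthcheck-configuration | enabled_notifiers`.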