HealthCheck Service Asadmin Command Reference

The following is a detailed list of the administration commands that can be used to correctly configure the HealthCheck Service.

`set-healthcheck-configuration`

Usage

set-healthcheck-configuration
 --enabled=true|false
 --dynamic=true|false
 --historic-trace-enabled=true|false
 --historic-trace-store-size=20
 --historic-trace-store-timeout=<integer.value>s|m|h|d
 --set-notifiers=<notifier.name>
 --enable-notifiers=<notifier.name>
 --disable-notifiers=<notifier.name>

Aim: Enables and disables the HealthCheck service. This includes configuration for tracing historic health check events for later inspection.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its service	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the service	N/A	yes
`--historic-trace-enabled`	Boolean	Enables storing traces in a rolling store for later inspection	false	no
`--historic-trace-store-size`	Integer	Sets the maximum number of health checks to store	20	no
`--historic-trace-store-timeout`	String	Sets the time period after which a historic health check event entry is removed from visible history. The time expression should consist of a number followed by a time unit: `s` for seconds, `m` for minutes, `h` for hours or `d` for days. If no time unit is given, the number specifies seconds. If the parameter is zero or unspecified, there is no timeout for entries.	-	no
`--set-notifiers`	String	Sets the notifiers for the HealthCheck Service, replacing all notifiers that have already been set. A comma-separated list can be used to represent multiple notifiers.	The notifiers available by default are: `log-notifier`, `jms-notifier`, `cdieventbus-notifier`, `eventbus-notifier`	no
`--enable-notifiers`	String	Use the option to enable a notifier. A comma-separated list can be used to represent multiple notifiers.	-	no
`--disable-notifiers`	String	Use the option to disable a notifier. A comma-separated list can be used to represent multiple notifiers.	-	no

Enabling or disabling the health check service implicitly also enables or disables the log notifier which is the default notifier.

You can find the list of available notifiers using the list-notifiers command.
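The time expression accepted by `--historic-trace-store-timeout` can be illustrated with a small shell sketch. The `timeout_to_seconds` function below is a hypothetical helper, not part of asadmin; it only mirrors the rules stated in the option description (a number followed by `s`, `m`, `h` or `d`, with a bare number meaning seconds).

```shell
# Hypothetical helper (not an asadmin feature): converts a timeout expression
# such as 30s, 5m, 2h or 1d into a number of seconds.
timeout_to_seconds() {
    expr_value=$1
    case $expr_value in
        ''|*[!0-9smhd]*) echo "invalid timeout: $expr_value" >&2; return 1 ;;
    esac
    number=${expr_value%[smhd]}     # strip one trailing unit letter, if any
    unit=${expr_value#"$number"}    # what remains is the unit (may be empty)
    case $number in
        ''|*[!0-9]*) echo "invalid timeout: $expr_value" >&2; return 1 ;;
    esac
    # Drop leading zeros so the shell does not read the number as octal.
    while [ "${#number}" -gt 1 ] && [ "${number#0}" != "$number" ]; do
        number=${number#0}
    done
    case $unit in
        ''|s) echo "$number" ;;            # no unit means seconds
        m)    echo $((number * 60)) ;;
        h)    echo $((number * 3600)) ;;
        d)    echo $((number * 86400)) ;;
    esac
}

timeout_to_seconds 90    # prints 90
timeout_to_seconds 5m    # prints 300
timeout_to_seconds 1d    # prints 86400
```

A result of 0 corresponds to the "no timeout" case described for the option.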

Example

The following example enables the HealthCheck service so that it only activates the next time the server is restarted. It sets the log and JMS notifiers and configures the historical trace store to retain 20 health checks.

asadmin> set-healthcheck-configuration
    --enabled=true
    --dynamic=false
    --historic-trace-enabled=true
    --historic-trace-store-size=20
    --set-notifiers=log-notifier,jms-notifier

`list-healthcheck-services`

Usage: asadmin> list-healthcheck-services
Aim: Lists the names of all available metric checker services.

Command Options

There are no options available.

Example

Running the command will show output similar to the example below:

Available Health Check Services:
        Name                    Description
        healthcheck-cpool       Provides ratio on connection usage for a given pool name with severity according to defined threshold values
        healthcheck-mp          Checks that all instances are responding to Microprofile Healthcheck requests with an UP response
        healthcheck-stuck       Provides thread name, id and stack trace for requests which reach over defined threshold values
        healthcheck-cpu         Provides ratio on cpu usage time with severity according to defined threshold values
        healthcheck-gc          Provides ratio on garbage collection count with severity according to defined threshold values
        healthcheck-heap        Provides ratio on used heap memory with severity according to defined threshold values
        healthcheck-threads     Lists hogging threads with their id when given thresholds exceed
        healthcheck-machinemem  Provides ratio on used machine memory with severity according to defined threshold values
        healthcheck-mpmetrics   Provides a way to monitor and log the values of metrics exposed by MicroProfile Metrics

Command list-healthcheck-services executed successfully.
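When scripting against this command, the service names can be picked out of the listing. The sketch below works on an abbreviated copy of the sample output above rather than calling asadmin, and it assumes that every service name starts with `healthcheck-` as it does in that sample; `service_names` is a hypothetical helper name.

```shell
# Abbreviated copy of the sample output shown above. In practice this text
# would come from running list-healthcheck-services.
sample_output='Available Health Check Services:
        Name                    Description
        healthcheck-cpu         Provides ratio on cpu usage time with severity according to defined threshold values
        healthcheck-gc          Provides ratio on garbage collection count with severity according to defined threshold values

Command list-healthcheck-services executed successfully.'

# Print the first column of every row whose name starts with "healthcheck-".
service_names() {
    awk '$1 ~ /^healthcheck-/ { print $1 }'
}

printf '%s\n' "$sample_output" | service_names
```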

`set-healthcheck-service-configuration`

Usage

set-healthcheck-service-configuration
 --enabled=true|false
 --dynamic=true|false
 --service=<service.name>
 --checker-name=<string.value>
 --add-to-microprofile-health=true|false
 --time=<integer.value>
 --time-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --threshold-critical=80
 --threshold-warning=50
 --threshold-good=0
 --hogging-threads-threshold=<integer.value>
 --hogging-threads-retry-count=<integer.value>
 --stuck-threads-threshold=<integer.value>
 --stuck-threads-threshold-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --add-metric=<metric.name>
 --delete-metric=<metric.name>

Aim: Enables or disables the monitoring of a specific metric. The command also configures how frequently that metric is monitored, along with any metric-specific properties.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its metric configuration	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the metric monitoring	N/A	yes
`--service`	String	The service metric name. One of: `connection-pool` or `cp` `cpu-usage` or `cu` `garbage-collector` or `gc` `heap-memory-usage` or `hmu` `hogging-threads` or `ht` `machine-memory-usage` or `mmu` `stuck-thread` or `st` `mp-health` or `mh` `mp-metrics` or `mm`	-	yes
`--checker-name`	String	A user determined name for easy identification of the checker. This should be unique among the services you have configured, to avoid confusion on the notification messages.	Depends on the service checker. One of: `CONP` `CPUC` `GBGC` `HEAP` `HOGT` `MEMM` `MP` `MPM`	no
`--add-to-microprofile-health`	String	When enabled, the checker is added to MicroProfile Health and all health check results for the checker are displayed on the MicroProfile Health REST endpoints.	false	no
`--time`	Integer	The amount of time units that the service will use to periodically monitor the metric	5	no
`--time-unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`MINUTES`	no
`--threshold-critical`	Integer	The threshold value that this metric must surpass to generate a `CRITICAL` event. A value between WARNING VALUE and 100 must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	90	no
`--threshold-warning`	Integer	The threshold value that this metric must surpass to generate a `WARNING` event. A value between GOOD VALUE and CRITICAL VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	50	no
`--threshold-good`	Integer	The threshold value that this metric must surpass to generate a `GOOD` event. A value between 0 and WARNING VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	0	no
`--hogging-threads-threshold`	Integer	The threshold value that this metric will be compared to mark threads as hogging the CPU. Only available for `ht` service.	95	no
`--hogging-threads-retry-count`	Integer	The number of retries that the checker service will execute in order to identify a hogging thread. Only available for `ht` service.	3	no
`--stuck-threads-threshold`	Integer	The threshold above which a thread is considered stuck. Must be 1 or greater. Only available for `st` service.	-	no
`--stuck-threads-threshold-unit`	`TimeUnit`	The unit for the threshold for when a thread should be considered stuck. Only available for `st` service.	-	no
`--add-metric`	String	Adds a metric exposed by MicroProfile Metrics to monitor. Takes a string of the format `'metricName=MetricName description=Description'`, where `metricName` is required.	-	no
`--delete-metric`	String	Removes a metric exposed by MicroProfile Metrics that has been added to monitor. Takes a string of the format `'metricName=MetricName'`, where `metricName` is required.	-	no

If this command gets executed before running the set-healthcheck-configuration command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.
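Because `--service` accepts either a long name or a short alias, scripts that wrap this command may want to normalise the value first. The following sketch uses a hypothetical `service_long_name` function (not part of asadmin) that encodes only the name and alias pairs listed in the options table.

```shell
# Hypothetical helper: maps a --service value (long name or alias) to its
# long name, using the pairs listed in the options table.
service_long_name() {
    case $1 in
        cp|connection-pool)       echo connection-pool ;;
        cu|cpu-usage)             echo cpu-usage ;;
        gc|garbage-collector)     echo garbage-collector ;;
        hmu|heap-memory-usage)    echo heap-memory-usage ;;
        ht|hogging-threads)       echo hogging-threads ;;
        mmu|machine-memory-usage) echo machine-memory-usage ;;
        st|stuck-thread)          echo stuck-thread ;;
        mh|mp-health)             echo mp-health ;;
        mm|mp-metrics)            echo mp-metrics ;;
        *) echo "unknown service: $1" >&2; return 1 ;;
    esac
}

service_long_name ht    # prints hogging-threads
```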

Examples

A very basic example command to simply enable the GC checker and activate it without needing a restart would be as follows:

asadmin> set-healthcheck-service-configuration
 --enabled=true
 --service=gc
 --dynamic=true

Monitoring the health of JDBC connection pools is a common need. In that scenario, it is very unlikely that on-the-fly configuration changes would be made, so a very high CRITICAL threshold can be set. Likewise, a nonzero GOOD threshold is needed because an empty or unused connection pool may not be healthy either.

The following command would apply these settings to the connection pool checker:

asadmin> set-healthcheck-service-configuration
 --enabled=true
 --service=cp
 --dynamic=true
 --threshold-critical=95
 --threshold-warning=70
 --threshold-good=30
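The three threshold options constrain one another: GOOD lies between 0 and WARNING, WARNING between GOOD and CRITICAL, and CRITICAL between WARNING and 100. A script could check that ordering before invoking the command. The `thresholds_valid` function below is a hypothetical pre-flight check, not an asadmin feature, and it assumes the boundaries are inclusive, which the option descriptions do not state explicitly.

```shell
# Hypothetical pre-flight check: succeeds only when
# 0 <= good <= warning <= critical <= 100 (inclusive bounds are an assumption).
thresholds_valid() {
    good=$1
    warning=$2
    critical=$3
    [ "$good" -ge 0 ] &&
    [ "$good" -le "$warning" ] &&
    [ "$warning" -le "$critical" ] &&
    [ "$critical" -le 100 ]
}

# Arguments are given in the order good, warning, critical.
if thresholds_valid 30 70 95; then
    echo "thresholds are consistent"
fi
```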

Monitoring which threads hog the CPU is important, since hogging threads can lead to performance degradation, deadlocks and severe bottlenecks in web applications. In some cases the defaults are all that is needed, but imagine that in a critical system you want to set the threshold percentage to 90% and have the health check service confirm the state of such threads with a retry count of 5. Additionally, you want this check to run every 20 seconds.

The following command would apply these settings to the hogging threads checker:

asadmin> set-healthcheck-service-configuration
 --enabled=true
 --service=ht
 --dynamic=true
 --hogging-threads-threshold=90
 --hogging-threads-retry-count=5
 --time=20
 --time-unit=SECONDS

The following example configures the stuck threads checker to check every 30 seconds for any threads which have been stuck for more than 5 minutes and applies the configuration change without needing a restart:

asadmin> set-healthcheck-service-configuration
 --service=stuck-thread
 --enabled=true
 --dynamic=true
 --time=30
 --time-unit=SECONDS
 --stuck-threads-threshold=5
 --stuck-threads-threshold-unit=MINUTES

The following example configures the MicroProfile Metrics checker to monitor the base_thread_max_count metric, adds the checker to MicroProfile Health so that its result is displayed on the MicroProfile Health REST endpoints, and applies the configuration change without needing a restart:

asadmin> set-healthcheck-service-configuration
 --service=mp-metrics
 --enabled=true
 --dynamic=true
 --add-to-microprofile-health=true
 --add-metric='metricName=base_thread_max_count'

`get-healthcheck-configuration`

Usage: asadmin> get-healthcheck-configuration
Aim: Lists the current configuration for the health check service, configured checkers and enabled notifiers.

Command Options

There are no options available.

Example

A sample output is as follows:

Health Check Service Configuration is enabled?: true
Historical Tracing Enabled?: true
Historical Tracing Store Size: 20
Name                  Notifier Enabled
log-notifier          true
jms-notifier          false
cdieventbus-notifier  false
eventbus-notifier     false
Below are the list of configuration details of each checker listed by its name.

Name  Enabled  Time  Unit     Add to MicroProfile Health  Critical Threshold  Warning Threshold  Good Threshold
CPUC  true     5     MINUTES  true                        80                  50                 0
HEAP  true     5     MINUTES  false                       80                  50                 0

Name   Enabled  Time  Unit     Add to MicroProfile Health  Threshold Time  Threshold Unit
STUCK  true     5     MINUTES  false                       5               MINUTES

Name  Enabled  Time  Unit     Add to MicroProfile Health
MPM   true     5     MINUTES  false

Monitored Metric Name  Description
base_thread_max_count  Displays the peak live thread count since the Java virtual machine started or peak was reset. This includes daemon and non-daemon threads.
base_gc_total_total    Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.

Command get-healthcheck-configuration executed successfully.
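The notifier table in this output can be filtered to find which notifiers are currently enabled. The sketch below operates on the first lines of the sample output above instead of calling asadmin, and assumes the real output keeps the same two-column layout; `enabled_notifiers` is a hypothetical helper name.

```shell
# First lines of the sample output shown above. In practice this text would
# come from running get-healthcheck-configuration.
sample_output='Health Check Service Configuration is enabled?: true
Historical Tracing Enabled?: true
Historical Tracing Store Size: 20
Name                  Notifier Enabled
log-notifier          true
jms-notifier          false
cdieventbus-notifier  false
eventbus-notifier     false'

# Print the name of every notifier row whose second column is "true".
enabled_notifiers() {
    awk '$1 ~ /-notifier$/ && $2 == "true" { print $1 }'
}

printf '%s\n' "$sample_output" | enabled_notifiers    # prints log-notifier
```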