HealthCheck Service

Payara Server includes a HealthCheck Service that is disabled by default. When enabled it can periodically check the following performance metrics:

Host CPU Usage
Host Memory Usage
Payara Server’s JVM Garbage Collections
Payara Server’s JVM Heap Usage
CPU Usage of individual threads
Stuck threads
Metrics exposed by MicroProfile Metrics

If any of these metrics crosses a configurable threshold, a GOOD, WARNING or CRITICAL event notification is sent to the Notification Service. Notifications can then be sent to one or more notifiers, such as a log file.

This allows operations teams to rapidly detect problems or work out what happened after these problems have occurred.

Metrics produced by the HealthCheck service can also be displayed on the MicroProfile Health REST endpoints. For details, see the --add-to-microprofile-health option of the set-healthcheck-service-configuration command.

The Host Memory Usage check currently works only on Linux and BSD derivatives.

If the Log Notifier is enabled, such events appear in the server’s log file as in the following sample:

[2016-05-24T03:52:28.690+0000] [Payara 4.1] [INFO] [fish.payara.nucleus.healthcheck.HealthCheckService] [tid: _ThreadID=72 _ThreadName=healthcheck-service-3] [timeMillis: 1464061948690] [levelValue: 800] [[ CPUC:Health Check Result:[[status=WARNING, message='CPU%: 75.6, Time CPU used: 267 milliseconds'']']]]
[2016-05-24T21:11:36.579+0000] [Payara 4.1] [SEVERE] [fish.payara.nucleus.healthcheck.HealthCheckService] [tid: _ThreadID=71 _ThreadName=healthcheck-service-3] [timeMillis: 1464124296579] [levelValue: 1000] [[ HOGT:Health Check Result:[[status=CRITICAL, message='Thread with <id-name>: 145-testing-thread-1 is a hogging thread for the last 59 seconds 999 milliseconds'']']]]
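
For quick triage, entries like these can be filtered from the command line. The following is a minimal sketch, assuming the Log Notifier layout shown in the sample above. The function name healthcheck_events is illustrative and is not part of Payara Server.

```shell
# Print "<checker> <status>" for each HealthCheck entry read from standard input.
# Assumes the "NAME:Health Check Result:[[status=STATUS, ..." layout of the sample;
# lines that do not match that layout are skipped.
healthcheck_events() {
  sed -n 's/.*\[\[ *\([^:]*\):Health Check Result:\[\[status=\([A-Z]*\),.*/\1 \2/p'
}
```

Piping the server log through this function and then through `sort | uniq -c` would give a rough count of events per checker and status.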

HealthCheck Service Configuration

As with any other service available in Payara Server, the HealthCheck service can be configured by using the web console, by running administration commands or by editing the domain.xml configuration file directly.

Using the Web Administration Console

To configure the HealthCheck Service in the Administration Console, go to Configuration → [instance-configuration (like server-config)] → Healthcheck:

HealthCheck Configuration in the Admin Console

Check the Enabled box (and the Dynamic box too if you don’t want to restart the domain) to switch the HealthCheck service on.

The general settings of the service are:

Threshold Unit: This defines the time duration per unit. The accepted options are any valid java.util.concurrent.TimeUnit values. The default value is SECONDS.
Threshold Value: This defines the number of units beyond which a request will be traced.
Store Historic Traces: When present, enables the storage of the slowest historical request trace events.
Historic Trace Store Size: Determines the number of historical trace events that can be stored in memory when historical storing is enabled. The default value is 20 records.

Aside from these configuration settings, you can also define which notifiers will be used to relay the HealthCheck events by moving them to the Active Notifiers box.

Keep in mind that for HealthCheck events to be relayed to the active notifiers, both the Notification Service and each respective notifier must be enabled first.

You don’t need to manually add each notifier on this screen. When enabling a notifier on its configuration screen, the server will automatically add it to the list of active notifiers for the HealthCheck service. This same result occurs when enabling the notifier using the appropriate asadmin command.

Configuring the Available Checkers

Each of the available checkers that are used to determine the server’s health can also be configured separately from each other on the admin console. The list of the available checkers is as follows:

CPU Usage: Calculates the CPU usage and prints out the percentage along with the usage time.
Connection Pool: Calculates the ratio of free to used connections for all JDBC connection pools and prints the percentage of used connections for each active pool.
Heap Memory Usage: Calculates the heap memory usage and prints out the percentage along with initial and committed heap sizes.
Machine Memory Usage: Calculates the machine memory usage and prints out the percentage along with the total and used physical memory size.
Hogging Threads: Identifies the threads that are hogging the CPU.
Stuck Threads: Identifies the threads that are stuck for a specified period of time.
Garbage Collector: Calculates and prints out how many times GC is executed with its elapsed time.

You can configure the settings for each checker on the respective tab in the web console. Here’s a sample image with the current configuration for the CPU Usage checker:

CPU Usage Checker Configuration in the Admin Console

From the Command Line

You can configure the HealthCheck Service by using the asadmin commands. The following is a detailed list of the administration commands that can be used to correctly configure the HealthCheck Service.

`set-healthcheck-configuration`

Usage

set-healthcheck-configuration
 --enabled=true|false
 --dynamic=true|false
 --historic-trace-enabled=true|false
 --historic-trace-store-size=20
 --historic-trace-store-timeout=<integer.value>s|m|h|d

Aim: Enables and disables the HealthCheck service. This includes configuration for tracing historic health check events for later inspection.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its service	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the service	N/A	no
`--historic-trace-enabled`	Boolean	Enables storing traces in a rolling store for later inspection	false	no
`--historic-trace-store-size`	Integer	Sets the maximum number of health checks to store	20	no
`--historic-trace-store-timeout`	String	Sets the time period after which a historic health check event entry is removed from visible history. The time expression should consist of a number followed by a time unit; `s` for seconds, `m` for minutes, `h` for hours or `d` for days. If no time unit is given the number specifies seconds. If the parameter is zero or unspecified there is no timeout for entries.	-	no
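
To make the timeout expression format concrete, here is a small sketch that converts such an expression to seconds. It only illustrates the documented format (a number with an optional s, m, h or d suffix) and is not the server's own parser. The function name timeout_to_seconds is hypothetical.

```shell
# Convert a timeout expression (<number>[s|m|h|d]) to seconds.
# Illustration of the documented format only, not Payara's implementation.
timeout_to_seconds() (
  value=$1
  case $value in
    *s) number=${value%s}; factor=1 ;;
    *m) number=${value%m}; factor=60 ;;
    *h) number=${value%h}; factor=3600 ;;
    *d) number=${value%d}; factor=86400 ;;
    *)  number=$value;     factor=1 ;;   # no unit given: seconds
  esac
  # Reject anything that is not a plain non-negative integer.
  case $number in
    ''|*[!0-9]*) echo "invalid timeout expression: $value" >&2; exit 1 ;;
  esac
  echo $((number * factor))
)

timeout_to_seconds 90    # prints 90
timeout_to_seconds 15m   # prints 900
timeout_to_seconds 2d    # prints 172800
```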

Enabling or disabling the health check service implicitly also enables or disables the log notifier which is the default notifier. This behaviour is similar to the replaced healthcheck-configure command.

Example

The following example will enable the Healthcheck service such that it will only activate from the next time the server is restarted. It enables the log notifier and sets the historical trace store to retain 20 health checks.

asadmin> set-healthcheck-configuration
    --enabled=true
    --dynamic=false
    --historic-trace-enabled=true
    --historic-trace-store-size=20

`healthcheck-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-configuration command.

Usage: asadmin> healthcheck-configure --enabled=true|false --dynamic=true|false --historicaltraceenabled --historicaltracestoresize=20
Aim: Enables and disables the HealthCheck service. Also allows configuration of the store of historical health checks.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its service	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the service	N/A	yes
`--notifierenabled`	Boolean	Whether or not to enable the default notifier	false	no
`--historicaltraceenabled`	Boolean	Enables historic checks if present	false	no
`--historicaltracestoresize`	Integer	Sets the maximum number of health checks to store	20	no

Starting from release 4.1.1.171, the --notifierenabled argument is used to enable or disable the Log Notifier, which is considered the default notifier. Use the healthcheck-[NOTIFIER_NAME]-notifier-configure command to enable or disable other available notifiers.

Example

asadmin> healthcheck-configure
    --enabled=true
    --dynamic=false
    --notifierenabled=true
    --historicaltraceenabled=true
    --historicaltracestoresize=20

`list-healthcheck-services`

Usage: asadmin> list-healthcheck-services
Aim: Lists the names of all available metric checker services.

Command Options

There are no options available.

Example

Running the command will show output similar to the example below:

Available Health Check Services:
Name                    Description
healthcheck-mp          Checks that all instances are responding to Microprofile Healthcheck requests with an UP response
healthcheck-cpu         Provides ratio on cpu usage time with severity according to defined threshold values
healthcheck-gc          Provides ratio on garbage collection count with severity according to defined threshold values
healthcheck-heap        Provides ratio on used heap memory with severity according to defined threshold values
healthcheck-threads     Lists hogging threads with their id when given thresholds exceed
healthcheck-machinemem  Provides ratio on used machine memory with severity according to defined threshold values
healthcheck-cpool       Provides ratio on connection usage for a given pool name with severity according to defined threshold values
healthcheck-stuck       Provides thread name, id and stack trace for requests which reach over defined threshold values
healthcheck-mpmetrics   Provides a way to monitor and log the values of metrics exposed by MicroProfile Metrics
Command list-healthcheck-services executed successfully.
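
When scripting against this listing, it can help to reduce it to the bare service names. The following is a minimal sketch, assuming the two-column layout shown above with every name starting with healthcheck-. The function name healthcheck_service_names is illustrative and is not an asadmin command.

```shell
# Print only the service names from list-healthcheck-services output on stdin.
# Header, description and status lines are dropped because their first field
# does not start with "healthcheck-".
healthcheck_service_names() {
  awk '$1 ~ /^healthcheck-/ { print $1 }'
}
```

The output of `asadmin list-healthcheck-services` can be piped into this function to obtain one name per line.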

`healthcheck-list-services`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the list-healthcheck-services command.

Usage: asadmin> healthcheck-list-services
Aim: Exactly the same as the list-healthcheck-services command.

`set-healthcheck-service-configuration`

Usage

set-healthcheck-service-configuration
 --enabled=true|false
 --dynamic=true|false
 --service=<service.name>
 --checker-name=<string.value>
 --add-to-microprofile-health=true|false
 --time=<integer.value>
 --time-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --threshold-critical=80
 --threshold-warning=50
 --threshold-good=0
 --hogging-threads-threshold=<integer.value>
 --hogging-threads-retry-count=<integer.value>
 --stuck-threads-threshold=<integer.value>
 --stuck-threads-threshold-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --add-metric=<metric.name>
 --delete-metric=<metric.name>

Aim: Enables or disables the monitoring of a specific metric. The command also configures the frequency of monitoring for that metric. Furthermore it configures metric specific properties.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its metric configuration	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the metric monitoring	N/A	yes
`--service`	String	The service metric name. One of: `connection-pool` or `cp`, `cpu-usage` or `cu`, `garbage-collector` or `gc`, `heap-memory-usage` or `hmu`, `hogging-threads` or `ht`, `machine-memory-usage` or `mmu`, `stuck-thread` or `st`, `mp-health` or `mh`, `mp-metrics` or `mm`	-	yes
`--checker-name`	String	A user-defined name for easy identification of the checker. This should be unique among the services you have configured, to avoid confusion in the notification messages.	Depends on the service checker. One of: `CONP`, `CPUC`, `GBGC`, `HEAP`, `HOGT`, `MEMM`, `MP`, `MPM`	no
`--add-to-microprofile-health`	Boolean	When enabled, the checker is added to MicroProfile Health and all health check results for the checker are displayed on the MicroProfile Health REST endpoints.	false	no
`--time`	Integer	The amount of time units that the service will use to periodically monitor the metric	5	no
`--time-unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`MINUTES`	no
`--threshold-critical`	Integer	The threshold value that this metric must surpass to generate a `CRITICAL` event. A value between WARNING VALUE and 100 must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	90	no
`--threshold-warning`	Integer	The threshold value that this metric must surpass to generate a `WARNING` event. A value between GOOD VALUE and CRITICAL VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	50	no
`--threshold-good`	Integer	The threshold value that this metric must surpass to generate a `GOOD` event. A value between 0 and WARNING VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	0	no
`--hogging-threads-threshold`	Integer	The threshold value that this metric will be compared to mark threads as hogging the CPU. Only available for `ht` service.	95	no
`--hogging-threads-retry-count`	Integer	The number of retries that the checker service will execute in order to identify a hogging thread. Only available for `ht` service.	3	no
`--stuck-threads-threshold`	Integer	The threshold above which a thread is considered stuck. Must be 1 or greater. Only available for `st` service.	-	no
`--stuck-threads-threshold-unit`	`TimeUnit`	The unit for the threshold for when a thread should be considered stuck. Only available for `st` service.	-	no
`--add-metric`	String	Adds a metric exposed by MicroProfile Metrics to monitor. Takes a string of the format `'metricName=MetricName description=Description'`, where `metricName` is required.	-	no
`--delete-metric`	String	Removes a metric exposed by MicroProfile Metrics that has been added to monitor. Takes a string of the format `'metricName=MetricName'`, where `metricName` is required.	-	no
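
The ordering rules for the three threshold options can be checked before calling the command. The sketch below assumes that the bounds 0 and 100 are inclusive and that the three values must be strictly increasing. The table only says "between", so treat those details as assumptions. The function name valid_thresholds is hypothetical.

```shell
# valid_thresholds GOOD WARNING CRITICAL
# Succeeds when 0 <= GOOD < WARNING < CRITICAL <= 100.
# Assumption: inclusive outer bounds and strict ordering between the values.
valid_thresholds() (
  good=$1 warning=$2 critical=$3
  # All three arguments must be plain non-negative integers.
  for v in "$good" "$warning" "$critical"; do
    case $v in ''|*[!0-9]*) exit 1 ;; esac
  done
  [ "$good" -ge 0 ] && [ "$good" -lt "$warning" ] &&
    [ "$warning" -lt "$critical" ] && [ "$critical" -le 100 ]
)

valid_thresholds 30 70 95 && echo "30/70/95 accepted"
valid_thresholds 50 40 95 || echo "50/40/95 rejected"
```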

If this command gets executed before running the set-healthcheck-configuration command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.
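
One way to avoid that situation is to script both commands together, enabling the service first and the checker second. The sketch below runs in dry-run mode by default and only prints the commands. The ASADMIN variable and the function name enable_gc_checker are conventions of this example, not of Payara Server.

```shell
# Dry run by default: the commands are printed, not executed.
# Set ASADMIN=asadmin (or a full path) to run them against a real domain.
ASADMIN=${ASADMIN:-echo asadmin}

enable_gc_checker() {
  # 1. Switch the HealthCheck service itself on.
  $ASADMIN set-healthcheck-configuration --enabled=true --dynamic=true
  # 2. Then enable the individual checker.
  $ASADMIN set-healthcheck-service-configuration --service=gc --enabled=true --dynamic=true
}

enable_gc_checker
```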

Examples

A very basic example command to simply enable the GC checker and activate it without needing a restart would be as follows:

asadmin> set-healthcheck-service-configuration
 --enabled=true
 --service=gc
 --dynamic=true

Monitoring the health of JDBC connection pools is a common need. In that scenario, it is very unlikely that on-the-fly configuration changes would be made, so a very high CRITICAL threshold can be set. Likewise, a nonzero GOOD threshold is needed because an empty or unused connection pool may not be healthy either.

The following command would apply these settings to the connection pool checker:

asadmin> set-healthcheck-service-configuration
 --service=cp
 --dynamic=true
 --threshold-critical=95
 --threshold-warning=70
 --threshold-good=30

Monitoring which threads hog the CPU is important, since hogging threads can lead to performance degradation, deadlocks and severe bottlenecks in web applications. In some cases the defaults are all that is needed, but suppose that in a critical system you want to set the threshold percentage to 90% and have the health check service confirm the state of such threads with a retry count of 5. Additionally, you want this check to run every 20 seconds.

The following command would apply these settings to the hogging threads checker:

asadmin> set-healthcheck-service-configuration
 --service=ht
 --dynamic=true
 --hogging-threads-threshold=90
 --hogging-threads-retry-count=5
 --time=20
 --time-unit=SECONDS

The following example configures the stuck threads checker to check every 30 seconds for any threads which have been stuck for more than 5 minutes and applies the configuration change without needing a restart:

asadmin> set-healthcheck-service-configuration
 --service=stuck-thread
 --enabled=true
 --dynamic=true
 --time=30
 --time-unit=SECONDS
 --stuck-threads-threshold=5
 --stuck-threads-threshold-unit=MINUTES

The following example configures the MicroProfile Metrics checker to monitor the base_thread_max_count metric, adds the checker to MicroProfile Health so that its result is displayed on the MicroProfile Health REST endpoints, and applies the configuration change without needing a restart:

asadmin> set-healthcheck-service-configuration
 --service=mp-metrics
 --enabled=true
 --dynamic=true
 --add-to-microprofile-health=true
 --add-metric='metricName=base_thread_max_count'

`healthcheck-configure-service`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-configure-service --serviceName=<service.name> --checkerName=<name> --enabled=true|false --dynamic=true|false --time=<integer.value> --unit=MICROSECONDS|MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS
Aim: Enables or disables the monitoring of a specific checker. The command also configures the frequency of monitoring for that metric.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its metric configuration	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the metric monitoring	N/A	yes
`--serviceName`	String	The metric service name. Must correspond to one of the values listed before	-	yes
`--checkerName`	String	A user-defined name for easy identification of the checker. This should be unique among the services you have configured, to avoid confusion in the notification messages.	Depends on the service checker. One of: `CONP`, `CPUC`, `GBGC`, `HEAP`, `HOGT`, `MEMM`	no
`--time`	Integer	The amount of time units that the service will use to periodically monitor the metric	5	no
`--unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`MINUTES`	no

If this command gets executed before running the healthcheck-configure command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.

Example

A very basic example command to simply enable the GC checker and activate it without needing a restart would be as follows:

asadmin> healthcheck-configure-service
      --enabled=true
      --serviceName=healthcheck-gc
      --checkerName=MYAPP-GC
      --dynamic=true

`healthcheck-configure-service-threshold`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage

asadmin> healthcheck-configure-service-threshold --serviceName=<service.name> --dynamic=true|false --thresholdCritical=90 --thresholdWarning=50 --thresholdGood=0

Aim

Configures CRITICAL, WARNING and GOOD threshold range values for a service checker. The dynamic attribute should be set to true in order to apply the changes directly.

This command only configures thresholds for the following checkers:

CPU Usage
Connection Pool
Heap Memory Usage
Machine Memory Usage

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will be configured	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--serviceName`	String	The metric service name. Must correspond to one of the values listed before	-	yes
`--thresholdCritical`	Integer	The threshold value that this metric must surpass to generate a `CRITICAL` event. A value between WARNING VALUE and 100 must be used	90	no
`--thresholdWarning`	Integer	The threshold value that this metric must surpass to generate a `WARNING` event. A value between GOOD VALUE and CRITICAL VALUE must be used	50	no
`--thresholdGood`	Integer	The threshold value that this metric must surpass to generate a `GOOD` event. A value between 0 and WARNING VALUE must be used	0	no

In order to execute this command for a specific metric, the healthcheck-configure-service command needs to be executed first.

Example

The following command would apply these settings to the connection pool checker:

asadmin> healthcheck-configure-service-threshold
 --serviceName=healthcheck-cpool
 --dynamic=true
 --thresholdCritical=95
 --thresholdWarning=70
 --thresholdGood=30

`healthcheck-hoggingthreads-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-hoggingthreads-configure --dynamic=true|false --threshold-percentage=50 --retry-count=3
Aim: Configures the Hogging Threads checker service settings. The checker will determine which running threads are hogging the CPU by calculating a percentage of usage with the ratio of elapsed time to the checker service execution interval and verifying if this percentage exceeds the threshold-percentage.

You can also use this command to enable the checker and configure the monitoring frequency as you would do with the healthcheck-configure-service command.
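
The percentage comparison described above can be worked through with a little arithmetic. This is only a sketch of the stated ratio (elapsed time over the checker interval, compared with the threshold percentage). It uses integer arithmetic, ignores the retry count, and is not taken from the server's code. The function name is_hogging is hypothetical.

```shell
# is_hogging ELAPSED_MS INTERVAL_MS THRESHOLD_PERCENT
# Succeeds when elapsed/interval, as an integer percentage, exceeds the threshold.
is_hogging() (
  elapsed_ms=$1 interval_ms=$2 threshold=$3
  percentage=$((elapsed_ms * 100 / interval_ms))
  [ "$percentage" -gt "$threshold" ]
)

# 19.5 s within a 20 s interval is 97%, above a 90% threshold.
is_hogging 19500 20000 90 && echo "hogging"
# 12 s within the same interval is 60%, below the threshold.
is_hogging 12000 20000 90 || echo "not hogging"
```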

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will be configured	server	no
`--enabled`	Boolean	Whether to enable or disable the checker	true	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--threshold-percentage`	Integer	The threshold value that this metric will be compared to mark threads as hogging the CPU	95	no
`--retry-count`	Integer	The number of retries that the checker service will execute in order to identify a hogging thread	3	no
`--time`	Integer	The periodic amount of time units the checker service will use to monitor hogging threads	1	no
`--unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`SECONDS`	no

Example

The following command would apply these settings to the hogging threads checker:

asadmin> healthcheck-hoggingthreads-configure
 --dynamic=true
 --threshold-percentage=90
 --retry-count=5
 --time=20
 --unit=SECONDS

`healthcheck-stuckthreads-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-stuckthreads-configure --enabled true|false --dynamic true|false --time=<integer.value> --unit=MICROSECONDS|MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS --threshold=<integer.value> --thresholdUnit=MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS
Aim: Configures the Stuck Threads checker. The Stuck Threads checker is comparable to the request tracing service in that it is triggered by exceeding a configured threshold, but in this case it reports on all threads that, when the health check runs, have taken longer than the threshold time.

Command Options

Option	Type	Description	Default	Mandatory
`--enabled`	Boolean	Enables or disables the checker	-	yes
`--dynamic`	Boolean	Whether or not to apply the changes dynamically (without a restart)	false	no
`--time`	Integer	The time between checks, must be 1 or greater	-	no
`--unit`	`TimeUnit`	The unit for the time between healthchecks	-	no
`--threshold`	Integer	The threshold above which a thread is considered stuck. Must be 1 or greater.	-	no
`--thresholdUnit`	`TimeUnit`	The unit for the threshold for when a thread should be considered stuck	-	no
`--target`	String	The target to enable the checker on	`server` (the DAS)	no

Example

The following example configures the stuckthreads checker to check every 30 seconds for any threads which have been stuck for more than 5 minutes and applies the configuration change without needing a restart:

asadmin> healthcheck-stuckthreads-configure
    --enabled=true
    --dynamic=true
    --time=30
    --unit=SECONDS
    --threshold=5
    --thresholdUnit=MINUTES

`set-healthcheck-service-notifier-configuration`

Usage

asadmin> set-healthcheck-service-notifier-configuration
 --notifier=<string.value>
 --enabled=true|false
 --dynamic=true|false
 --noisy=true|false

Aim: This command can be used to enable or disable a specific notifier or to change its noisy setting.

Command Options

Option	Type	Description	Default	Mandatory
`--notifier`	String	The notifier to configure. One of (case-insensitive): `LOG` `HIPCHAT` `SLACK` `JMS` `EMAIL` `XMPP` `SNMP` `EVENTBUS` `NEWRELIC` `DATADOG` `CDIEVENTBUS`	-	yes
`--enabled`	Boolean	Enables or disables the notifier	false	yes
`--noisy`	Boolean	Sets the notifier to noisy (a.k.a. verbose) or not noisy. A noisy notifier includes more detailed logging information in the notifier's output.	-	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--target`	String	The instance or cluster that will be configured	server	no

Examples

To enable the log notifier for the HealthCheck Service without having to restart the server, use the following command:

asadmin> set-healthcheck-service-notifier-configuration
 --notifier=log
 --enabled=true
 --dynamic=true

`healthcheck-[NOTIFIER_NAME]-notifier-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-notifier-configuration command.

Usage: asadmin> healthcheck-[NOTIFIER_NAME]-notifier-configure --enabled=true --dynamic=true
Aim: This command can be used to enable or disable the notifier represented by the [NOTIFIER_NAME] placeholder.

Command Options

Option	Type	Description	Default	Mandatory
`--enabled`	Boolean	Enables or disables the notifier	false	yes
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no

Examples

To enable the log notifier for the HealthCheck Service without having to restart the server, use the following command:
```
asadmin> healthcheck-log-notifier-configure
    --enabled=true
    --dynamic=true
```
`get-healthcheck-configuration`

Usage: asadmin> get-healthcheck-configuration
Aim: Lists the current configuration for the health check service, configured checkers and enabled notifiers.

Command Options

There are no options available.

Example

A sample output is as follows:

Health Check Service Configuration is enabled?: true
Historical Tracing Enabled?: false
Name      Notifier Enabled
XMPP      false
DATADOG   true
EMAIL     false
SLACK     true
EVENTBUS  false
HIPCHAT   false
NEWRELIC  true
SNMP      false
LOG       true
JMS       false

Below are the list of configuration details of each checker listed by its name.

Name  Enabled  Time  Unit     Add to MicroProfile Health  Critical Threshold  Warning Threshold  Good Threshold
CPUC  true     5     MINUTES  true                        80                  50                 0
HEAP  true     5     MINUTES  false                       80                  50                 0
Name   Enabled  Time  Unit     Add to MicroProfile Health  Threshold Time  Threshold Unit
STUCK  true     5     MINUTES  false                       5               MINUTES
Name  Enabled  Time  Unit     Add to MicroProfile Health
MPM   true     5     MINUTES  false

Monitored Metric Name  Description
base_thread_max_count Displays the peak live thread count since the Java virtual machine started or peak was reset. This includes daemon and non-daemon threads.
base_gc_total_total    Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.

Command get-healthcheck-configuration executed successfully.

HealthCheck Service Asadmin Command Reference

The following is a detailed list of the administration commands that can be used to correctly configure the HealthCheck Service.

`set-healthcheck-configuration`

Usage

set-healthcheck-configuration
 --enabled=true|false
 --dynamic=true|false
 --historic-trace-enabled=true|false
 --historic-trace-store-size=20
 --historic-trace-store-timeout=<integer.value>s|m|h|d

Aim: Enables and disables the HealthCheck service. This includes configuration for tracing historic health check events for later inspection.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its service	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the service	N/A	no
`--historic-trace-enabled`	Boolean	Enables storing traces in a rolling store for later inspection	false	no
`--historic-trace-store-size`	Integer	Sets the maximum number of health checks to store	20	no
`--historic-trace-store-timeout`	String	Sets the time period after which a historic health check event entry is removed from visible history. The time expression should consist of a number followed by a time unit; `s` for seconds, `m` for minutes, `h` for hours or `d` for days. If no time unit is given the number specifies seconds. If the parameter is zero or unspecified there is no timeout for entries.	-	no
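
The time expression accepted by `--historic-trace-store-timeout` can be illustrated with a small Java sketch. It follows only the format described in the table (a number, optionally followed by `s`, `m`, `h` or `d`, with seconds as the default unit and zero or no value meaning no timeout). The class `TraceStoreTimeout` is a hypothetical name, and the way Payara itself parses the value may differ.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical parser, not Payara's code: mirrors the documented format only.
public class TraceStoreTimeout {

    // Converts an expression such as "90", "30s", "5m", "2h" or "1d" to seconds.
    // Returns 0 (meaning "no timeout") for null, blank or zero input.
    static long toSeconds(String expression) {
        if (expression == null || expression.trim().isEmpty()) {
            return 0L;
        }
        String value = expression.trim();
        char last = value.charAt(value.length() - 1);
        if (Character.isDigit(last)) {
            return Long.parseLong(value); // no unit given: the number is in seconds
        }
        long amount = Long.parseLong(value.substring(0, value.length() - 1));
        switch (last) {
            case 's': return amount;
            case 'm': return TimeUnit.MINUTES.toSeconds(amount);
            case 'h': return TimeUnit.HOURS.toSeconds(amount);
            case 'd': return TimeUnit.DAYS.toSeconds(amount);
            default: throw new IllegalArgumentException("Unknown time unit: " + last);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSeconds("5m")); // 5 minutes expressed in seconds
        System.out.println(toSeconds("90")); // no unit, so already seconds
    }
}
```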

Example

asadmin> set-healthcheck-configuration
    --enabled=true
    --dynamic=false
    --historic-trace-enabled=true
    --historic-trace-store-size=20

`healthcheck-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-configuration command.

Usage: asadmin> healthcheck-configure --enabled=true|false --dynamic=true|false --historicaltraceenabled --historicaltracestoresize=20
Aim: Enables and disables the HealthCheck service. Also allows configuration of the store of historical health checks.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its service	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the service	N/A	yes
`--notifierenabled`	Boolean	Whether or not to enable the default notifier	false	no
`--historicaltraceenabled`	Boolean	Enables historic checks if present	false	no
`--historicaltracestoresize`	Integer	Sets the maximum number of health checks to store	20	no

Example

asadmin> healthcheck-configure
    --enabled=true
    --dynamic=false
    --notifierenabled=true
    --historicaltraceenabled=true
    --historicaltracestoresize=20

`list-healthcheck-services`

Usage: asadmin> list-healthcheck-services
Aim: Lists the names of all available metric checker services.

Command Options

There are no options available.

Example

Running the command will show output similar to the example below:

Available Health Check Services:
Name                    Description
healthcheck-mp          Checks that all instances are responding to Microprofile Healthcheck requests with an UP response
healthcheck-cpu         Provides ratio on cpu usage time with severity according to defined threshold values
healthcheck-gc          Provides ratio on garbage collection count with severity according to defined threshold values
healthcheck-heap        Provides ratio on used heap memory with severity according to defined threshold values
healthcheck-threads     Lists hogging threads with their id when given thresholds exceed
healthcheck-machinemem  Provides ratio on used machine memory with severity according to defined threshold values
healthcheck-cpool       Provides ratio on connection usage for a given pool name with severity according to defined threshold values
healthcheck-stuck       Provides thread name, id and stack trace for requests which reach over defined threshold values
healthcheck-mpmetrics   Provides a way to monitor and log the values of metrics exposed by MicroProfile Metrics
Command list-healthcheck-services executed successfully.

`healthcheck-list-services`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the list-healthcheck-services command.

Usage: asadmin> healthcheck-list-services
Aim: Exactly the same as the list-healthcheck-services command.

`set-healthcheck-service-configuration`

Usage

set-healthcheck-service-configuration
 --enabled=true|false
 --dynamic=true|false
 --service=<service.name>
 --checker-name=<string.value>
 --add-to-microprofile-health=true|false
 --time=<integer.value>
 --time-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --threshold-critical=80
 --threshold-warning=50
 --threshold-good=0
 --hogging-threads-threshold=<integer.value>
 --hogging-threads-retry-count=<integer.value>
 --stuck-threads-threshold=<integer.value>
 --stuck-threads-threshold-unit=DAYS|HOURS|MINUTES|SECONDS|MILLISECONDS
 --add-metric=<metric.name>
 --delete-metric=<metric.name>

Aim: Enables or disables the monitoring of a specific metric. The command also configures how often that metric is monitored, as well as any metric-specific properties.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its metric configuration	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the metric monitoring	N/A	yes
`--service`	String	The service metric name. One of: `connection-pool` or `cp` `cpu-usage` or `cu` `garbage-collector` or `gc` `heap-memory-usage` or `hmu` `hogging-threads` or `ht` `machine-memory-usage` or `mmu` `stuck-thread` or `st` `mp-health` or `mh` `mp-metrics` or `mm`	-	yes
`--checker-name`	String	A user determined name for easy identification of the checker. This should be unique among the services you have configured, to avoid confusion on the notification messages.	Depends on the service checker. One of: `CONP` `CPUC` `GBGC` `HEAP` `HOGT` `MEMM` `MP` `MPM`	no
`--add-to-microprofile-health`	String	When enabled, the checker is added to MicroProfile Health and all health check results for the checker are displayed on the MicroProfile Health REST endpoints.	false	no
`--time`	Integer	The amount of time units that the service will use to periodically monitor the metric	5	no
`--time-unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`MINUTES`	no
`--threshold-critical`	Integer	The threshold value that this metric must surpass to generate a `CRITICAL` event. A value between WARNING VALUE and 100 must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	90	no
`--threshold-warning`	Integer	The threshold value that this metric must surpass to generate a `WARNING` event. A value between GOOD VALUE and CRITICAL VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	50	no
`--threshold-good`	Integer	The threshold value that this metric must surpass to generate a `GOOD` event. A value between 0 and WARNING VALUE must be used. Available for services `cp`, `cu`, `gc`, `hmu` and `mmu`.	0	no
`--hogging-threads-threshold`	Integer	The threshold value that this metric will be compared to mark threads as hogging the CPU. Only available for `ht` service.	95	no
`--hogging-threads-retry-count`	Integer	The number of retries that the checker service will execute in order to identify a hogging thread. Only available for `ht` service.	3	no
`--stuck-threads-threshold`	Integer	The threshold above which a thread is considered stuck. Must be 1 or greater. Only available for `st` service.	-	no
`--stuck-threads-threshold-unit`	`TimeUnit`	The unit for the threshold for when a thread should be considered stuck. Only available for `st` service.	-	no
`--add-metric`	String	Adds a metric exposed by MicroProfile Metrics to monitor. Takes a string of the format `'metricName=MetricName description=Description'`, where `metricName` is required.	-	no
`--delete-metric`	String	Removes a metric exposed by MicroProfile Metrics that has been added to monitor. Takes a string of the format `'metricName=MetricName'`, where `metricName` is required.	-	no
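
To make the relationship between `--threshold-good`, `--threshold-warning` and `--threshold-critical` concrete, here is a minimal Java sketch. It assumes the wording in the table is literal: a value must strictly surpass a threshold to produce the corresponding event, and the thresholds are ordered between 0 and 100 (treated here as inclusive bounds). `ThresholdRule` and `Status` are hypothetical names; the exact boundary handling inside Payara is not confirmed by this document.

```java
import java.util.Optional;

// Hypothetical sketch, not Payara's code: follows the table's wording only.
public class ThresholdRule {

    enum Status { GOOD, WARNING, CRITICAL }

    // Returns the event a measured percentage would generate, if any.
    static Optional<Status> classify(double value, int good, int warning, int critical) {
        // Assumption: the documented ranges are inclusive at both ends.
        if (!(0 <= good && good <= warning && warning <= critical && critical <= 100)) {
            throw new IllegalArgumentException("expected 0 <= good <= warning <= critical <= 100");
        }
        if (value > critical) {
            return Optional.of(Status.CRITICAL);
        }
        if (value > warning) {
            return Optional.of(Status.WARNING);
        }
        if (value > good) {
            return Optional.of(Status.GOOD);
        }
        return Optional.empty(); // at or below the GOOD threshold: no event in this sketch
    }

    public static void main(String[] args) {
        // Thresholds taken from the usage line: good=0, warning=50, critical=80.
        System.out.println(classify(75.6, 0, 50, 80));
        System.out.println(classify(92.0, 0, 50, 80));
    }
}
```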

If this command gets executed before running the set-healthcheck-configuration command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.

Examples

A very basic example command to simply enable the GC checker and activate it without needing a restart would be as follows:

asadmin> set-healthcheck-service-configuration
 --enabled=true
 --service=gc
 --dynamic=true

The following command would apply these settings to the connection pool checker:

asadmin> set-healthcheck-service-configuration
 --service=cp
 --dynamic=true
 --threshold-critical=95
 --threshold-warning=70
 --threshold-good=30

The following command would apply these settings to the hogging threads checker:

asadmin> set-healthcheck-service-configuration
 --service=ht
 --dynamic=true
 --hogging-threads-threshold=90
 --hogging-threads-retry-count=5
 --time=20
 --time-unit=SECONDS

asadmin> set-healthcheck-service-configuration
 --service=stuck-thread
 --enabled=true
 --dynamic=true
 --time=30
 --time-unit=SECONDS
 --stuck-threads-threshold=5
 --stuck-threads-threshold-unit=MINUTES

asadmin> set-healthcheck-service-configuration
 --service=mp-metrics
 --enabled=true
 --dynamic=true
 --add-to-microprofile-health=true
 --add-metric='metricName=base_thread_max_count'

`healthcheck-configure-service`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-configure-service --serviceName=<service.name> --checkerName=<name> --enabled=true|false --dynamic=true|false --time=<integer.value> --unit=MICROSECONDS|MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS
Aim: Enables or disables the monitoring of a specific checker. The command also configures the frequency of monitoring for that metric.

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will enable or disable its metric configuration	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--enabled`	Boolean	Whether to enable or disable the metric monitoring	N/A	yes
`--serviceName`	String	The metric service name. Must correspond to one of the values listed before	-	yes
`--checkerName`	String	A user determined name for easy identification of the checker. This should be unique among the services you have configured, to avoid confusion on the notification messages.	Depends on the service checker. One of: `CONP` `CPUC` `GBGC` `HEAP` `HOGT` `MEMM`	no
`--time`	Integer	The amount of time units that the service will use to periodically monitor the metric	5	no
`--unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`MINUTES`	no

If this command gets executed before running the healthcheck-configure command, it will succeed and the configuration will be saved, but the HealthCheck service will not be enabled.

Example

A very basic example command to simply enable the GC checker and activate it without needing a restart would be as follows:

asadmin> healthcheck-configure-service --enabled=true
      --serviceName=healthcheck-gc
      --checkerName=MYAPP-GC
      --dynamic=true

`healthcheck-configure-service-threshold`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage

asadmin> healthcheck-configure-service-threshold --serviceName=<service.name> --dynamic=true|false --thresholdCritical=90 --thresholdWarning=50 --thresholdGood=0

Aim

Configures CRITICAL, WARNING and GOOD threshold range values for a service checker. The dynamic attribute should be set to true in order to apply the changes directly.

This command only configures thresholds for the following checkers:

CPU Usage
Connection Pool
Heap Memory Usage
Machine Memory Usage

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will be configured	server	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--serviceName`	String	The metric service name. Must correspond to one of the values listed before	-	yes
`--thresholdCritical`	Integer	The threshold value that this metric must surpass to generate a `CRITICAL` event. A value between WARNING VALUE and 100 must be used	90	no
`--thresholdWarning`	Integer	The threshold value that this metric must surpass to generate a `WARNING` event. A value between GOOD VALUE and CRITICAL VALUE must be used	50	no
`--thresholdGood`	Integer	The threshold value that this metric must surpass to generate a `GOOD` event. A value between 0 and WARNING VALUE must be used	0	no

In order to execute this command for a specific metric, the healthcheck-configure-service command needs to be executed first.

Example

The following command would apply these settings to the connection pool checker:

asadmin> healthcheck-configure-service-threshold
 --serviceName=healthcheck-cpool
 --dynamic=true
 --thresholdCritical=95
 --thresholdWarning=70
 --thresholdGood=30

`healthcheck-hoggingthreads-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-hoggingthreads-configure --dynamic=true|false --threshold-percentage=50 --retry-count=3
Aim: Configures the Hogging Threads checker service settings. The checker will determine which running threads are hogging the CPU by calculating a percentage of usage with the ratio of elapsed time to the checker service execution interval and verifying if this percentage exceeds the threshold-percentage.

You can also use this command to enable the checker and configure the monitoring frequency as you would do with the healthcheck-configure-service command.
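
The ratio described in the Aim above can be sketched in Java as follows. This is an illustration under stated assumptions rather than Payara's implementation: "elapsed time" is taken to mean the CPU time a thread consumed since the previous check, the retry count is not modelled, and `HoggingThreadRule` is a hypothetical name.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Payara's code: shows the percentage calculation only.
public class HoggingThreadRule {

    // Percentage of the checker interval that the thread spent on the CPU.
    // cpuTimeDeltaNanos is the CPU time consumed since the previous check (assumption).
    static double usagePercentage(long cpuTimeDeltaNanos, long interval, TimeUnit unit) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be 1 or greater");
        }
        long intervalNanos = unit.toNanos(interval);
        return 100.0 * cpuTimeDeltaNanos / intervalNanos;
    }

    // A thread is flagged when its usage percentage exceeds the threshold percentage.
    static boolean isHogging(long cpuTimeDeltaNanos, long interval, TimeUnit unit,
                             int thresholdPercentage) {
        return usagePercentage(cpuTimeDeltaNanos, interval, unit) > thresholdPercentage;
    }

    public static void main(String[] args) {
        // A thread that used 19 s of CPU during a 20 s interval, checked against 90%.
        long used = TimeUnit.SECONDS.toNanos(19);
        System.out.println(usagePercentage(used, 20, TimeUnit.SECONDS));
        System.out.println(isHogging(used, 20, TimeUnit.SECONDS, 90));
    }
}
```

In a running JVM, per-thread CPU time could be sampled with `java.lang.management.ThreadMXBean#getThreadCpuTime(long)`, which reports nanoseconds; whether the checker uses that exact call is not stated here.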

Command Options

Option	Type	Description	Default	Mandatory
`--target`	String	The instance or cluster that will be configured	server	no
`--enabled`	Boolean	Whether to enable or disable the checker	true	no
`--dynamic`	Boolean	Whether to apply the changes directly to the server/instance without a restart	false	no
`--threshold-percentage`	Integer	The threshold value that this metric will be compared to mark threads as hogging the CPU	95	no
`--retry-count`	Integer	The number of retries that the checker service will execute in order to identify a hogging thread	3	no
`--time`	Integer	The periodic amount of time units the checker service will use to monitor hogging threads	1	no
`--unit`	TimeUnit	The time unit to set the frequency of the metric monitoring. Must correspond to a valid `java.util.concurrent.TimeUnit` value	`SECONDS`	no

Example

The following command would apply these settings to the hogging threads checker:

asadmin> healthcheck-hoggingthreads-configure
 --dynamic=true
 --threshold-percentage=90
 --retry-count=5
 --time=20
 --unit=SECONDS

`healthcheck-stuckthreads-configure`

This is deprecated in 5.191 and will be removed in the future as it is replaced with the set-healthcheck-service-configuration command.

Usage: asadmin> healthcheck-stuckthreads-configure --enabled true|false --dynamic true|false --time=<integer.value> --unit=MICROSECONDS|MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS --threshold=<integer.value> --thresholdUnit=MILLISECONDS|SECONDS|MINUTES|HOURS|DAYS
Aim: Configures the Stuck Threads checker. This checker is comparable to the Request Tracing service in that it is triggered when a configured threshold is exceeded, but in this case it reports all threads that, when the health check runs, have been running for longer than the threshold time.

Command Options

Option	Type	Description	Default	Mandatory
`--enabled`	Boolean	Enables or disables the checker	-	yes
`--dynamic`	Boolean	Whether or not to apply the changes dynamically (without a restart)	false	no
`--time`	Integer	The time between checks, must be 1 or greater	-	no
`--unit`	`TimeUnit`	The unit for the time between healthchecks	-	no
`--threshold`	Integer	The threshold above which a thread is considered stuck. Must be 1 or greater.	-	no
`--thresholdUnit`	`TimeUnit`	The unit for the threshold for when a thread should be considered stuck	-	no
`--target`	String	The target to enable the checker on	`server` (the DAS)	no

Example

The following example configures the stuckthreads checker to check every 30 seconds for any threads which have been stuck for more than 5 minutes and applies the configuration change without needing a restart:

asadmin> healthcheck-stuckthreads-configure
    --enabled=true
    --dynamic=true
    --time=30
    --unit=SECONDS
    --threshold=5
    --thresholdUnit=MINUTES

Common HealthCheck Service Checker Configuration

The following are the configurable attributes available to ALL the HealthCheck Service checkers:

Enabled

Determines whether or not the checker is enabled.

Dynamic

Determines whether changes made to the checker’s configuration are applied immediately or only after the server/instance is restarted.

Name

The name or label that the checker uses to identify itself in notification events. The default names for all checkers are the following:

Checker	Default name
CPU Usage	`CPUC`
Connection Pool	`CONP`
Heap Memory Usage	`HEAP`
Machine Memory Usage	`MEMM`
Hogging Threads	`HOGT`
Stuck Threads	`STUCK`
Garbage Collector	`GBGC`

Time

The interval (a valid Integer, expressed in the configured unit) at which the checker is executed for its metric. The default value is 5.

Unit

The time unit in which the Time value is expressed. The accepted options are any valid java.util.concurrent.TimeUnit values. The default value is MINUTES.
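
As an illustration of how the Time and Unit attributes combine into a single checking interval, here is a minimal sketch that uses only the standard java.util.concurrent.TimeUnit API. The class and method names are hypothetical and are not part of Payara Server.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Payara Server.
class CheckerIntervalSketch {

    // Converts a checker's Time and Unit attributes into milliseconds.
    // TimeUnit.valueOf throws IllegalArgumentException when the unit
    // is not the name of a TimeUnit constant.
    static long intervalMillis(int time, String unit) {
        return TimeUnit.valueOf(unit).toMillis(time);
    }

    public static void main(String[] args) {
        // Default values described above: Time = 5, Unit = MINUTES
        System.out.println(intervalMillis(5, "MINUTES")); // 300000
    }
}
```

With the default values, the checker would run once every 300000 milliseconds.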

Threshold Range Configuration for HealthCheck Service Checkers

The following attributes are available to the CPU Usage, Connection Pool, Heap Memory Usage, Machine Memory Usage and Garbage Collector checkers:

Threshold Good: The lower numeric boundary (valid Integer) that the metric must exceed for the notification event to be classified as GOOD. Its default value is 0.
Threshold Warning: The lower numeric boundary (valid Integer) that the metric must exceed for the notification event to be classified as WARNING. Its default value is 50.
Threshold Critical: The lower numeric boundary (valid Integer) that the metric must exceed for the notification event to be classified as CRITICAL. Its default value is 80.

The threshold range (GOOD - WARNING - CRITICAL) is used to report the health of a specific metric according to the value measured each time the checker runs. For example, if the CPU Usage checker is configured with the default threshold values and the CPU is running at 76.8% when it is measured, this notification event would be generated:

Health Check notification with severity level: WARNING - CPUC:Health Check Result:[[status=WARNING, message='CPU%: 76.8, Time CPU used: 171 milliseconds'']']
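
The classification in this example can be sketched in a few lines of Java. This is only an illustration of the idea, not the actual Payara Server implementation: the class name, the method name and the NONE label are hypothetical, and the sketch assumes that a value must strictly exceed a threshold to be given the corresponding status.

```java
// Hypothetical illustration, not Payara Server code.
class ThresholdSketch {

    // Assumption: the measured value must be strictly greater than a
    // threshold to be classified with that threshold's status.
    static String classify(double value, int good, int warning, int critical) {
        if (value > critical) {
            return "CRITICAL";
        }
        if (value > warning) {
            return "WARNING";
        }
        if (value > good) {
            return "GOOD";
        }
        return "NONE"; // hypothetical label: no threshold was exceeded
    }

    public static void main(String[] args) {
        // Default thresholds (0, 50, 80) and the 76.8% reading from the example
        System.out.println(classify(76.8, 0, 50, 80)); // WARNING
    }
}
```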

Special HealthCheck Service Checkers Configuration

The Hogging Threads and the Stuck Threads checkers are configured differently from the others. They do not have a threshold range configuration and use their own attributes instead.

Here’s a configuration sample of the Hogging Threads checker:

Hogging Threads Checker Configuration in the Admin Console

The following are the attributes used to configure this checker:

Threshold Percentage: Defines the minimum percentage needed to decide that a thread is hogging the CPU. The percentage is calculated as the ratio of the thread’s elapsed CPU time to the checker’s execution interval. Its default value is 95.
Retry Count: Defines the number of times a thread must be detected as hogging before the service sends notifications. Its default value is 3.
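
The two attributes above can be illustrated with a short sketch. It is not the checker’s actual code: the class and method names are hypothetical, and it assumes that "minimum percentage" means the ratio must be greater than or equal to the threshold, and that the retry count is reached once the number of detections equals it.

```java
// Hypothetical illustration, not Payara Server code.
class HoggingThreadSketch {

    // Ratio of the thread's elapsed CPU time to the checker's execution
    // interval, expressed as a percentage.
    static double cpuPercentage(long cpuTimeMillis, long intervalMillis) {
        return cpuTimeMillis * 100.0 / intervalMillis;
    }

    // Assumption: a thread counts as hogging when its percentage is at
    // least the configured Threshold Percentage.
    static boolean isHogging(long cpuTimeMillis, long intervalMillis, int thresholdPercentage) {
        return cpuPercentage(cpuTimeMillis, intervalMillis) >= thresholdPercentage;
    }

    // Assumption: notifications are sent once the number of hogging
    // detections reaches the configured Retry Count.
    static boolean shouldNotify(int hoggingDetections, int retryCount) {
        return hoggingDetections >= retryCount;
    }

    public static void main(String[] args) {
        // 291 seconds of CPU time during a 5 minute (300 second) interval
        System.out.println(cpuPercentage(291_000, 300_000)); // 97.0
        System.out.println(isHogging(291_000, 300_000, 95)); // true
        System.out.println(shouldNotify(3, 3));               // true
    }
}
```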

And here’s a configuration sample for the Stuck Threads checker:

Stuck Threads Checker Configuration in the Admin Console

The following are the attributes used to configure this checker:

Threshold Time: Defines the time value for which a thread can be non-responsive before it is considered stuck. Its default value is 5.
Threshold Unit: Defines the time unit for the value of the Threshold Time field. Its default value is Minutes.
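
To show how Threshold Time and Threshold Unit work together, here is a minimal sketch based on the standard java.util.concurrent.TimeUnit API. The class and method names are hypothetical, and the sketch assumes that a thread is considered stuck only once its non-responsive time is strictly greater than the configured threshold.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Payara Server code.
class StuckThreadSketch {

    // Assumption: a thread is stuck when it has been non-responsive for
    // longer than Threshold Time, expressed in Threshold Unit.
    static boolean isStuck(long unresponsiveMillis, long thresholdTime, TimeUnit thresholdUnit) {
        return unresponsiveMillis > thresholdUnit.toMillis(thresholdTime);
    }

    public static void main(String[] args) {
        // Default values: Threshold Time = 5, Threshold Unit = MINUTES
        System.out.println(isStuck(360_000, 5, TimeUnit.MINUTES)); // true  (6 minutes)
        System.out.println(isStuck(240_000, 5, TimeUnit.MINUTES)); // false (4 minutes)
    }
}
```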