CountSum statistics

In the output generated by ad-hoc or snapshot-diff mode, the second group of statistics shown is the 'countsum' statistics. They are named this way because, for the purposes of yb_stats, the count (total_count) and sum (total_sum) fields are the only usable statistical values. An event in the code that is tracked by a 'countsum' statistic keeps a count of the number of times the event was triggered, and a sum of what it measures. In many cases the unit of the sum is time (to capture the latency of the event), but it can also be bytes (to capture the size of, for example, an IO), or something else.

The count and sum statistics are counters, so the value that is used is the difference between the values in the end and begin snapshots. For the count value difference, the following rules apply:

If the value is zero in both the begin and end snapshots, the statistic is skipped.
If the statistic exists with a non-zero value in the end snapshot, but does not exist in the begin snapshot, the end snapshot value is taken as the value.
If the statistic exists with a non-zero value in the begin snapshot, but does not exist in the end snapshot, the statistic is skipped.
If the value is non-zero in both the begin and end snapshots, but subtracting them yields zero, the statistic is not printed. Nothing happened between the two snapshots, although the event did occur earlier.
If the value is non-zero in both the begin and end snapshots, but the end value is lower than the begin value, the situation is suspicious. Currently the resulting negative value is shown.
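The rules above can be sketched as a small function. This is an illustration only, not the yb_stats implementation: the name `count_difference` is hypothetical, `None` stands for a statistic that is absent from a snapshot, and the handling of a statistic that is absent from the begin snapshot and zero in the end snapshot is an assumption, because the rules do not cover that case.

```python
from typing import Optional


def count_difference(begin: Optional[int], end: Optional[int]) -> Optional[int]:
    """Return the count to report, or None when the statistic is skipped.

    A None argument means the statistic is absent from that snapshot.
    """
    if end is None:
        # Absent from the end snapshot: skipped.
        return None
    if begin is None:
        # Only present in the end snapshot: report the end value.
        # Assumption: a zero end value is skipped, like zero in both snapshots.
        return end if end != 0 else None
    difference = end - begin
    if difference == 0:
        # Covers zero in both snapshots and equal non-zero values.
        return None
    # A negative difference is suspicious, but it is reported as-is.
    return difference


print(count_difference(None, 7))  # 7
print(count_difference(10, 3))    # -7
print(count_difference(5, 5))     # None
```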

This is what countsum statistic output looks like:

192.168.66.80:7000   server   handler_latency_outbound_call_queue_time                                             3                  2.899 /s avg:         0 tot:               0 us
192.168.66.80:7000   server   handler_latency_outbound_call_send_time                                              3                  2.899 /s avg:         0 tot:               0 us
192.168.66.80:7000   server   handler_latency_outbound_call_time_to_response                                       3                  2.899 /s avg:      2666 tot:            8000 us
192.168.66.80:7000   server   handler_latency_yb_master_MasterHeartbeat_TSHeartbeat                                3                  2.899 /s avg:       128 tot:             386 us
192.168.66.80:7000   server   rpc_incoming_queue_time                                                              3                  2.899 /s avg:       146 tot:             439 us

Explanation:

hostname:port	metric_type	statistic name	count	count / snapshot time (s)	sum / count	sum total	sum unit
192.168.66.80:7000	server	handler_latency_outbound_call_queue_time	3	2.899 /s	avg: 0	tot: 0	us
192.168.66.80:7000	server	handler_latency_outbound_call_send_time	3	2.899 /s	avg: 0	tot: 0	us
192.168.66.80:7000	server	handler_latency_outbound_call_time_to_response	3	2.899 /s	avg: 2666	tot: 8000	us
192.168.66.80:7000	server	handler_latency_yb_master_MasterHeartbeat_TSHeartbeat	3	2.899 /s	avg: 128	tot: 386	us
192.168.66.80:7000	server	rpc_incoming_queue_time	3	2.899 /s	avg: 146	tot: 439	us
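The two derived columns follow from the count, the sum, and the time between the snapshots. The sketch below shows the arithmetic; the function name `derived_columns` is hypothetical, and the use of integer division for the average is an assumption that happens to match the sample output (386 / 3 is shown as 128), not a statement about the yb_stats code.

```python
from typing import Tuple


def derived_columns(count: int, total: int, elapsed_seconds: float) -> Tuple[float, int]:
    """Return (count per second, average sum per count)."""
    rate = count / elapsed_seconds
    # Assumption: the average is truncated to a whole number.
    average = total // count if count != 0 else 0
    return rate, average


# handler_latency_yb_master_MasterHeartbeat_TSHeartbeat from the sample:
# count 3, tot 386 us. The snapshot interval itself is not printed, so
# 3 / 2.899 seconds is used here as a stand-in for it.
rate, average = derived_columns(3, 386, 3 / 2.899)
print(f"{rate:.3f} /s avg: {average}")  # 2.899 /s avg: 128
```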

gauges

There is no gauge-like statistic type in 'countsum' statistics.

details enable

192.168.66.80:9000   server   -               -               -                              rpc_incoming_queue_time                                                            143                 13.877 /s avg:       103 tot:           14807 us
192.168.66.80:9000   server   -               -               -                              transaction_pool_cache                                                               1                  0.097 /s avg:         0 tot:               0 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_append_latency                                                                   4                  0.388 /s avg:        45 tot:             182 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_entry_batches_per_group                                                          3                  0.291 /s avg:         1 tot:               4 requests
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_group_commit_latency                                                             3                  0.291 /s avg:      2319 tot:            6958 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_sync_latency                                                                     1                  0.097 /s avg:      6706 tot:            6706 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              rocksdb_bytes_per_write                                                              3                  0.291 /s avg:        12 tot:              36 bytes
192.168.66.80:9000   table    000000000004000 yugabyte        t                              rocksdb_db_write_micros                                                              3                  0.291 /s avg:        11 tot:              34 us

Explanation:

hostname:port	metric_type	object_id	namespace	object name	statistic name	count	count / snapshot time (s)	sum / count	sum total	sum unit
192.168.66.80:9000	server	-	-	-	rpc_incoming_queue_time	143	13.877 /s	avg: 103	tot: 14807	us
192.168.66.80:9000	server	-	-	-	transaction_pool_cache	1	0.097 /s	avg: 0	tot: 0	us
192.168.66.80:9000	table	000000000004000	yugabyte	t	log_append_latency	4	0.388 /s	avg: 45	tot: 182	us
192.168.66.80:9000	table	000000000004000	yugabyte	t	log_entry_batches_per_group	3	0.291 /s	avg: 1	tot: 4	requests
192.168.66.80:9000	table	000000000004000	yugabyte	t	log_group_commit_latency	3	0.291 /s	avg: 2319	tot: 6958	us
192.168.66.80:9000	table	000000000004000	yugabyte	t	log_sync_latency	1	0.097 /s	avg: 6706	tot: 6706	us
192.168.66.80:9000	table	000000000004000	yugabyte	t	rocksdb_bytes_per_write	3	0.291 /s	avg: 12	tot: 36	bytes
192.168.66.80:9000	table	000000000004000	yugabyte	t	rocksdb_db_write_micros	3	0.291 /s	avg: 11	tot: 34	us

Countsum statistics are called 'coarse_histograms' in the YugabyteDB source code. They have the count and sum fields in common with 'summaries' in Prometheus, but quantile items are not available. YugabyteDB adds the fields min, mean, max, percentile_75, percentile_95, percentile_99, percentile_99_9, and percentile_99_99 to its metrics. These fields are reset when the metrics are read.
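Because the extra fields are reset on every read, only total_count and total_sum can be compared meaningfully between two snapshots. The sketch below picks those two fields out of a single metric record. The record is illustrative: the field names come from the text above, the "name" key and the JSON layout are assumptions, the values are made up, and the helper `count_and_sum` is hypothetical.

```python
import json
from typing import Tuple

# Illustrative record only, not captured from a real server.
raw = """
{"name": "log_sync_latency", "total_count": 1, "min": 6706, "mean": 6706.0,
 "percentile_75": 6706, "percentile_95": 6706, "percentile_99": 6706,
 "percentile_99_9": 6706, "percentile_99_99": 6706, "max": 6706,
 "total_sum": 6706}
"""


def count_and_sum(record: dict) -> Tuple[int, int]:
    """Keep the two cumulative fields and ignore the fields that reset on read."""
    return record["total_count"], record["total_sum"]


print(count_and_sum(json.loads(raw)))  # (1, 6706)
```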
