CountSum statistics

In the output generated by ad-hoc or snapshot-diff mode, the second group of statistics shown are the 'countsum' statistics. They are named this way because, for yb_stats, the count (total_count) and sum (total_sum) fields are the only usable statistical values. An event in the code that is tracked by a 'countsum' statistic keeps a count of the number of times the event was triggered and a sum of what it measures. In many cases the unit of the sum is time (to capture the latency of the event), but it can also be bytes (to capture the size of, for example, an IO) or something else.

The count and sum statistics are counters, so the value that is used is the difference between the end and begin values. For the count value difference:

  • If the value is zero during both the begin and end snapshot, the statistic is skipped.
  • If the statistic exists with a non-zero value in the end snapshot and does not exist in the begin snapshot, the end snapshot value is taken as the value.
  • If the statistic exists with a non-zero value in the begin snapshot and does not exist in the end snapshot, the statistic is skipped.
  • If the value is non-zero in both the begin and end snapshots, but the difference is zero, the statistic is not printed: something happened before the begin snapshot, but nothing happened between the snapshots.
  • If the value is non-zero in both the begin and end snapshots, but the end value is lower than the begin value, this is a suspicious situation. Currently the resulting negative value is shown.
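The rules above can be sketched as a small function. This is a minimal illustration and not the yb_stats implementation: the name `count_diff` and the use of `None` for "the statistic does not exist in this snapshot" are assumptions made for the example. The case of a statistic that is new in the end snapshot with a value of zero is not covered by the rules, so skipping it here is also an assumption.

```python
from typing import Optional


def count_diff(begin: Optional[int], end: Optional[int]) -> Optional[int]:
    """Return the count value to report, or None if the statistic is skipped.

    None as an input means the statistic does not exist in that snapshot
    (a convention chosen for this sketch).
    """
    if end is None:
        # Not present in the end snapshot: skipped.
        return None
    if begin is None:
        # Only present in the end snapshot: the end value is taken as the value.
        # Assumption: a zero value is skipped, like other zero results.
        return end if end != 0 else None
    diff = end - begin
    if diff == 0:
        # Zero in both snapshots, or an unchanged non-zero counter: not printed.
        return None
    # A negative diff (end lower than begin) is suspicious but still returned.
    return diff


print(count_diff(0, 0))      # skipped
print(count_diff(None, 5))   # end value is used
print(count_diff(7, 7))      # skipped, nothing happened between snapshots
print(count_diff(10, 4))     # negative value is shown
```

Returning the negative value instead of hiding it mirrors the last rule: the situation is suspicious, so it stays visible in the output.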

This is what countsum statistics output looks like:

192.168.66.80:7000   server   handler_latency_outbound_call_queue_time                                             3                  2.899 /s avg:         0 tot:               0 us
192.168.66.80:7000   server   handler_latency_outbound_call_send_time                                              3                  2.899 /s avg:         0 tot:               0 us
192.168.66.80:7000   server   handler_latency_outbound_call_time_to_response                                       3                  2.899 /s avg:      2666 tot:            8000 us
192.168.66.80:7000   server   handler_latency_yb_master_MasterHeartbeat_TSHeartbeat                                3                  2.899 /s avg:       128 tot:             386 us
192.168.66.80:7000   server   rpc_incoming_queue_time                                                              3                  2.899 /s avg:       146 tot:             439 us

Explanation:

| hostname:port | metric_type | statistic name | count | count / snapshot time (s) | sum / count | sum total | sum unit |
|---|---|---|---|---|---|---|---|
| 192.168.66.80:7000 | server | handler_latency_outbound_call_queue_time | 3 | 2.899 /s | avg: 0 | tot: 0 | us |
| 192.168.66.80:7000 | server | handler_latency_outbound_call_send_time | 3 | 2.899 /s | avg: 0 | tot: 0 | us |
| 192.168.66.80:7000 | server | handler_latency_outbound_call_time_to_response | 3 | 2.899 /s | avg: 2666 | tot: 8000 | us |
| 192.168.66.80:7000 | server | handler_latency_yb_master_MasterHeartbeat_TSHeartbeat | 3 | 2.899 /s | avg: 128 | tot: 386 | us |
| 192.168.66.80:7000 | server | rpc_incoming_queue_time | 3 | 2.899 /s | avg: 146 | tot: 439 | us |
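The two derived columns can be reproduced from the count and sum differences. The sketch below is an illustration, not the yb_stats code: the function name `derived_columns` is made up, the snapshot time of 1.035 seconds is inferred from the sample (3 events at 2.899 /s), and the use of integer division for the average is an assumption that happens to match the sample values (8000 / 3 is shown as 2666).

```python
def derived_columns(count_diff: int, sum_diff: int, snapshot_seconds: float):
    """Compute the 'count / snapshot time' and 'sum / count' columns."""
    # Events per second over the time between the begin and end snapshots.
    rate = count_diff / snapshot_seconds
    # Average amount per event; integer division is an assumption based on
    # the sample output. A zero count is guarded to avoid dividing by zero.
    avg = sum_diff // count_diff if count_diff != 0 else 0
    return rate, avg


# handler_latency_outbound_call_time_to_response from the sample above.
rate, avg = derived_columns(count_diff=3, sum_diff=8000, snapshot_seconds=1.035)
print(f"{rate:.3f} /s avg: {avg} tot: 8000 us")
```

Because the average is derived from two counter differences, it describes only the events that took place between the two snapshots.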

gauges

There is no gauges-like statistic type in 'countsum' statistics.

details enable

192.168.66.80:9000   server   -               -               -                              rpc_incoming_queue_time                                                            143                 13.877 /s avg:       103 tot:           14807 us
192.168.66.80:9000   server   -               -               -                              transaction_pool_cache                                                               1                  0.097 /s avg:         0 tot:               0 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_append_latency                                                                   4                  0.388 /s avg:        45 tot:             182 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_entry_batches_per_group                                                          3                  0.291 /s avg:         1 tot:               4 requests
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_group_commit_latency                                                             3                  0.291 /s avg:      2319 tot:            6958 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              log_sync_latency                                                                     1                  0.097 /s avg:      6706 tot:            6706 us
192.168.66.80:9000   table    000000000004000 yugabyte        t                              rocksdb_bytes_per_write                                                              3                  0.291 /s avg:        12 tot:              36 bytes
192.168.66.80:9000   table    000000000004000 yugabyte        t                              rocksdb_db_write_micros                                                              3                  0.291 /s avg:        11 tot:              34 us

Explanation:

| hostname:port | metric_type | object_id | namespace | object name | statistic name | count | count / snapshot time (s) | sum / count | sum total | sum unit |
|---|---|---|---|---|---|---|---|---|---|---|
| 192.168.66.80:9000 | server | - | - | - | rpc_incoming_queue_time | 143 | 13.877 /s | avg: 103 | tot: 14807 | us |
| 192.168.66.80:9000 | server | - | - | - | transaction_pool_cache | 1 | 0.097 /s | avg: 0 | tot: 0 | us |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | log_append_latency | 4 | 0.388 /s | avg: 45 | tot: 182 | us |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | log_entry_batches_per_group | 3 | 0.291 /s | avg: 1 | tot: 4 | requests |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | log_group_commit_latency | 3 | 0.291 /s | avg: 2319 | tot: 6958 | us |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | log_sync_latency | 1 | 0.097 /s | avg: 6706 | tot: 6706 | us |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | rocksdb_bytes_per_write | 3 | 0.291 /s | avg: 12 | tot: 36 | bytes |
| 192.168.66.80:9000 | table | 000000000004000 | yugabyte | t | rocksdb_db_write_micros | 3 | 0.291 /s | avg: 11 | tot: 34 | us |

Countsum statistics are called 'coarse_histograms' in the YugabyteDB source code. They have the count and sum fields in common with 'summaries' in Prometheus, but the quantile items are not available. YugabyteDB adds the fields min, mean, max, percentile_75, percentile_95, percentile_99, percentile_99_9, and percentile_99_99 to its metrics. These fields are reset when the metrics are read.
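As an illustration of why only two fields are kept, the sketch below picks total_count and total_sum out of a single metric entry and ignores the fields that are reset on read. The dictionary layout, the helper name `usable_values`, and all values in `sample` are hypothetical and chosen for the example; only the field names come from the text above.

```python
# Fields that behave as counters and can be diffed between snapshots.
USABLE_FIELDS = ("total_count", "total_sum")


def usable_values(metric: dict) -> dict:
    """Keep only the counter fields of one countsum metric entry."""
    return {name: metric[name] for name in USABLE_FIELDS}


# Hypothetical entry: values are made up for this example.
sample = {
    "name": "rpc_incoming_queue_time",
    "total_count": 143,
    "total_sum": 14807,
    "min": 0,
    "mean": 0.0,
    "max": 0,
    "percentile_75": 0,
    "percentile_95": 0,
    "percentile_99": 0,
    "percentile_99_9": 0,
    "percentile_99_99": 0,
}

print(usable_values(sample))
```

The min, mean, max, and percentile fields are left out because a value that is reset on every read cannot be compared meaningfully between a begin and an end snapshot.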