Monitoring

Kitex does not ship a monitoring implementation of its own, but it has monitoring hooks built in and can be extended through an interface.

Custom monitoring management

The framework provides a Tracer interface. Users can implement it and inject it with the WithTracer option.

// Tracer is executed at the start and finish of an RPC.
type Tracer interface {
    Start(ctx context.Context) context.Context
    Finish(ctx context.Context)
}

For detailed documentation, refer to the Monitoring Extension section.

Using the extension repositories

kitex-contrib also provides two monitoring extensions, monitor-prometheus and obs-opentelemetry, which integrate Prometheus and OpenTelemetry respectively. The former is more closely aligned with the Prometheus ecosystem and is easier to use, while the latter offers more flexibility.

Prometheus

The extension repository monitor-prometheus provides a Prometheus monitoring extension.

Usage example:

Client

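import (
    "log"

    kClient "github.com/cloudwego/kitex/client"
    prometheus "github.com/kitex-contrib/monitor-prometheus"
)

...

// Client metrics are served on :9091 under the /kitexclient path.
client, err := testClient.NewClient(
    "DestServiceName",
    kClient.WithTracer(prometheus.NewClientTracer(":9091", "/kitexclient")))
if err != nil {
    log.Fatal(err)
}

resp, err := client.Send(ctx, req)
if err != nil {
    log.Fatal(err)
}

...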

Server

import (
    "log"

    kServer "github.com/cloudwego/kitex/server"
    prometheus "github.com/kitex-contrib/monitor-prometheus"
)

func main() {
    ...
    // Server metrics are served on :9092 under the /kitexserver path.
    svr := xxxservice.NewServer(
        &myServiceImpl{},
        kServer.WithTracer(prometheus.NewServerTracer(":9092", "/kitexserver")))
    if err := svr.Run(); err != nil {
        log.Fatal(err)
    }
    ...
}

Metrics

Client

| Name | Unit | Tags | Description |
| --- | --- | --- | --- |
| kitex_client_throughput | - | type, caller, callee, method, status | Total number of requests handled by the Client |
| kitex_client_latency_us | us | type, caller, callee, method, status | Latency of request handling at the Client (Response received time - Request initiation time, in microseconds) |

Server

| Name | Unit | Tags | Description |
| --- | --- | --- | --- |
| kitex_server_throughput | - | type, caller, callee, method, status | Total number of requests handled by the Server |
| kitex_server_latency_us | us | type, caller, callee, method, status | Latency of request handling at the Server (Processing completion time - Request received time, in microseconds) |

More complex data monitoring can be implemented based on the above metrics. Examples can be found in the Useful Examples section.
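
As one illustration of building on these metrics, the sketch below derives a request rate and an error ratio from two scrapes of a throughput counter split by the status tag. It is not taken from the Useful Examples section: the sample type, its field names, and the idea of reading the two status values as "success" and "error" are assumptions made for the example, and counter resets are simply treated as "no data".

```go
package main

import "fmt"

// sample is one scrape of a throughput counter, split by the status tag.
// The field names are hypothetical; they are not part of monitor-prometheus.
type sample struct {
	unixSec float64 // scrape time in seconds
	succeed float64 // counter value for successful requests
	failed  float64 // counter value for failed requests
}

// rateAndErrorRatio returns requests per second and the fraction of failed
// requests between two scrapes. It returns zeros when the interval is empty
// or the counters did not grow (for example after a reset).
func rateAndErrorRatio(prev, cur sample) (qps, errRatio float64) {
	dt := cur.unixSec - prev.unixSec
	if dt <= 0 {
		return 0, 0
	}
	dOK := cur.succeed - prev.succeed
	dErr := cur.failed - prev.failed
	total := dOK + dErr
	if dOK < 0 || dErr < 0 || total <= 0 {
		return 0, 0
	}
	return total / dt, dErr / total
}

func main() {
	prev := sample{unixSec: 0, succeed: 1000, failed: 10}
	cur := sample{unixSec: 60, succeed: 1570, failed: 40}

	qps, errRatio := rateAndErrorRatio(prev, cur)
	fmt.Printf("qps=%.1f error_ratio=%.2f\n", qps, errRatio) // prints "qps=10.0 error_ratio=0.05"
}
```

In a Prometheus deployment the same arithmetic would normally be done by the query layer over the scraped counters; the Go version only makes the calculation explicit.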

Runtime Metrics

This repository relies on prometheus/client_golang and supports its built-in runtime metrics. For more details, please refer to WithGoCollectorRuntimeMetrics.

OpenTelemetry

The extension repository obs-opentelemetry provides an OpenTelemetry monitoring extension.

Usage example:

For information on how to use obs-opentelemetry, please refer to the tracing section.

Metrics

Server

| Name | Metric Data Model | Unit | Unit (UCUM) | Description |
| --- | --- | --- | --- | --- |
| rpc.server.duration | Histogram | milliseconds | ms | Measures duration of inbound RPC |

Client

| Name | Metric Data Model | Unit | Unit (UCUM) | Description |
| --- | --- | --- | --- | --- |
| rpc.client.duration | Histogram | milliseconds | ms | Measures duration of outbound RPC |

Additional service metrics can be calculated using rpc.server.duration, such as R.E.D (Rate, Errors, Duration). Examples can be found here.
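
To make the "Duration" part of R.E.D concrete, the following sketch estimates a latency quantile from cumulative histogram buckets by linear interpolation inside the bucket that contains the target rank. This is a common way to read a histogram, not the method used by obs-opentelemetry or by the linked examples; the bucket type, the quantile function, and the bucket boundaries are hypothetical.

```go
package main

import "fmt"

// bucket is one cumulative histogram bucket: count is the number of
// observations with a duration less than or equal to upperMs.
type bucket struct {
	upperMs float64
	count   float64
}

// quantile estimates the q-quantile (0 <= q <= 1) in milliseconds.
// Buckets must be sorted by upperMs and carry cumulative counts.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) == 0 {
		return 0
	}
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return 0
	}
	rank := q * total
	lower, prevCount := 0.0, 0.0
	for _, b := range buckets {
		if b.count >= rank {
			inBucket := b.count - prevCount
			if inBucket == 0 {
				return b.upperMs
			}
			// Assume observations are spread evenly inside the bucket.
			return lower + (b.upperMs-lower)*(rank-prevCount)/inBucket
		}
		lower, prevCount = b.upperMs, b.count
	}
	return buckets[len(buckets)-1].upperMs
}

func main() {
	// 50 calls took <= 10 ms, 90 took <= 50 ms, all 100 took <= 100 ms.
	buckets := []bucket{{10, 50}, {50, 90}, {100, 100}}

	fmt.Printf("p50=%.1fms p95=%.1fms\n", quantile(0.5, buckets), quantile(0.95, buckets))
}
```

The estimate is only as fine as the bucket boundaries, so the boundaries should be chosen around the latencies that matter for the service.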

Runtime Metrics

Based on opentelemetry-go, it supports the following runtime metrics:

| Name | Instrument | Unit | Unit (UCUM) | Description |
| --- | --- | --- | --- | --- |
| process.runtime.go.cgo.calls | Sum | - | - | Number of cgo calls made by the current process. |
| process.runtime.go.gc.count | Sum | - | - | Number of completed garbage collection cycles. |
| process.runtime.go.gc.pause_ns | Histogram | nanosecond | ns | Amount of nanoseconds in GC stop-the-world pauses. |
| process.runtime.go.gc.pause_total_ns | Histogram | nanosecond | ns | Cumulative nanoseconds in GC stop-the-world pauses since the program started. |
| process.runtime.go.goroutines | Gauge | - | - | Number of goroutines that currently exist. |
| process.runtime.go.lookups | Sum | - | - | Number of pointer lookups performed by the runtime. |
| process.runtime.go.mem.heap_alloc | Gauge | bytes | bytes | Bytes of allocated heap objects. |
| process.runtime.go.mem.heap_idle | Gauge | bytes | bytes | Bytes in idle (unused) spans. |
| process.runtime.go.mem.heap_inuse | Gauge | bytes | bytes | Bytes in in-use spans. |
| process.runtime.go.mem.heap_objects | Gauge | - | - | Number of allocated heap objects. |
| process.runtime.go.mem.live_objects | Gauge | - | - | Number of live objects, computed as cumulative Mallocs - Frees. |
| process.runtime.go.mem.heap_released | Gauge | bytes | bytes | Bytes of idle spans whose physical memory has been returned to the OS. |
| process.runtime.go.mem.heap_sys | Gauge | bytes | bytes | Bytes of heap memory obtained from the OS. |
| runtime.uptime | Sum | ms | ms | Milliseconds since application was initialized. |

Last modified July 24, 2024 : docs: fix error in render (#1110) (34e4f87)