Semantic Conventions for OS Process Metrics

Status: Experimental

This document describes instruments and attributes for common OS process-level metrics in OpenTelemetry. Also consider the general metric semantic conventions when creating instruments not explicitly defined in this document. OS process metrics are not related to the runtime environment of the program, and should take measurements from the operating system. For runtime environment metrics, see the semantic conventions for runtime environment metrics.

Warning: Existing instrumentations and collectors that use v1.21.0 of this document (or prior):

  • SHOULD NOT adopt any breaking changes from this document until the system semantic conventions are marked stable. Conventions include, but are not limited to, attributes, metric names, and units of measure.
  • SHOULD introduce a control mechanism to allow users to opt-in to the new conventions once the migration plan is finalized.
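
One possible shape for such a control mechanism is an environment variable that is read at start-up. The sketch below is only an illustration: the variable name `EXAMPLE_PROCESS_SEMCONV_OPT_IN` and the helper `use_new_conventions` are hypothetical and are not defined by this document.

```python
import os

# Hypothetical variable name, used only for illustration.
OPT_IN_ENV_VAR = "EXAMPLE_PROCESS_SEMCONV_OPT_IN"


def use_new_conventions(environ=None):
    """Return True only when the user has explicitly opted in."""
    env = os.environ if environ is None else environ
    value = env.get(OPT_IN_ENV_VAR, "")
    # Anything other than an explicit "yes" keeps the existing conventions.
    return value.strip().lower() in {"1", "true", "yes"}
```

Defaulting to the existing conventions keeps current users unaffected until they choose to migrate.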

Process Metrics

Metric: process.cpu.time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.cpu.time | Counter | s | Total CPU seconds broken down by different states. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| cpu.mode | string | A process SHOULD be characterized either by data points with no mode labels, or only data points with mode labels. [1] | user; system | Recommended | Experimental |

[1]: The following states SHOULD be used: user, system, wait

cpu.mode has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

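| Value | Description | Stability |
| --- | --- | --- |
| idle | idle | Experimental |
| interrupt | interrupt | Experimental |
| iowait | iowait | Experimental |
| kernel | kernel | Experimental |
| nice | nice | Experimental |
| steal | steal | Experimental |
| system | system | Experimental |
| user | user | Experimental |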
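
As a minimal sketch of how process.cpu.time data points might be gathered, the example below reads the cumulative user and system CPU seconds of the current process with Python's `os.times()`. The helper name `cpu_time_points` and the representation of a data point as an `(attributes, value)` tuple are assumptions made for illustration, not part of any OpenTelemetry API.

```python
import os


def cpu_time_points():
    """Return cumulative CPU seconds for this process, one point per cpu.mode."""
    t = os.times()
    # os.times() reports only user and system CPU time for the current
    # process, in seconds; other modes are not available from this call.
    return [
        ({"cpu.mode": "user"}, t.user),
        ({"cpu.mode": "system"}, t.system),
    ]
```

Every point carries a cpu.mode value, so the process is characterized only by data points with mode labels.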

Metric: process.cpu.utilization

This metric is opt-in.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.cpu.utilization | Gauge | 1 | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| cpu.mode | string | A process SHOULD be characterized either by data points with no mode labels, or only data points with mode labels. [1] | user; system | Recommended | Experimental |

[1]: The following states SHOULD be used: user, system, wait

cpu.mode has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

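| Value | Description | Stability |
| --- | --- | --- |
| idle | idle | Experimental |
| interrupt | interrupt | Experimental |
| iowait | iowait | Experimental |
| kernel | kernel | Experimental |
| nice | nice | Experimental |
| steal | steal | Experimental |
| system | system | Experimental |
| user | user | Experimental |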
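
The description of process.cpu.utilization amounts to a small calculation over two consecutive process.cpu.time readings. The sketch below spells it out; the function name `cpu_utilization` and its parameter names are illustrative assumptions.

```python
def cpu_utilization(prev_cpu_s, curr_cpu_s, elapsed_s, cpu_count):
    """Fraction of the available CPU capacity used since the last measurement.

    prev_cpu_s and curr_cpu_s are cumulative process.cpu.time readings in
    seconds, elapsed_s is the wall-clock time between them, and cpu_count is
    the number of CPUs available to the process.
    """
    if elapsed_s <= 0 or cpu_count <= 0:
        raise ValueError("elapsed_s and cpu_count must be positive")
    return (curr_cpu_s - prev_cpu_s) / (elapsed_s * cpu_count)
```

For example, 1.5 CPU seconds consumed over a 1 second interval on 4 CPUs, `cpu_utilization(10.0, 11.5, 1.0, 4)`, gives 0.375.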

Metric: process.memory.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.memory.usage | UpDownCounter | By | The amount of physical memory in use. | Experimental |
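
As one illustration of where such a value could come from, the sketch below assumes a Linux host and converts the resident page count found in the second field of `/proc/<pid>/statm` into bytes. The helper name `rss_bytes` is hypothetical, and other platforms need a different source.

```python
def rss_bytes(statm_text, page_size):
    """Convert the resident page count from /proc/<pid>/statm into bytes."""
    # Fields in statm are counted in pages; the second one is the resident set.
    resident_pages = int(statm_text.split()[1])
    return resident_pages * page_size
```

On Linux the inputs could be obtained by reading `/proc/self/statm` and calling `os.sysconf("SC_PAGE_SIZE")`.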

Metric: process.memory.virtual

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.memory.virtual | UpDownCounter | By | The amount of committed virtual memory. | Experimental |

Metric: process.disk.io

This metric is recommended.

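| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.disk.io | Counter | By | Disk bytes transferred. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| disk.io.direction | string | The disk IO operation direction. | read | Recommended | Experimental |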

disk.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| read | read | Experimental |
| write | write | Experimental |
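
The following sketch shows how the two well-known disk.io.direction values might be attached to byte counts. It assumes a Linux host, where `/proc/<pid>/io` exposes `read_bytes` and `write_bytes` fields; the helper name `disk_io_points` and the `(attributes, value)` tuple shape are illustrative assumptions.

```python
def disk_io_points(proc_io_text):
    """Build process.disk.io points from the text of /proc/<pid>/io."""
    fields = {}
    for line in proc_io_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = int(value)
    # One cumulative byte count per direction.
    return [
        ({"disk.io.direction": "read"}, fields["read_bytes"]),
        ({"disk.io.direction": "write"}, fields["write_bytes"]),
    ]
```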

Metric: process.network.io

This metric is recommended.

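| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.network.io | Counter | By | Network bytes transferred. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | Experimental |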

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| receive | receive | Experimental |
| transmit | transmit | Experimental |

Metric: process.thread.count

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.thread.count | UpDownCounter | {thread} | Process threads count. | Experimental |

Metric: process.open_file_descriptor.count

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.open_file_descriptor.count | UpDownCounter | {count} | Number of file descriptors in use by the process. | Experimental |

Metric: process.context_switches

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.context_switches | Counter | {count} | Number of times the process has been context switched. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| process.context_switch_type | string | Specifies whether the context switches for this data point were voluntary or involuntary. | voluntary; involuntary | Recommended | Experimental |

process.context_switch_type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| involuntary | involuntary | Experimental |
| voluntary | voluntary | Experimental |
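
On Unix-like systems, one possible source for these counts is `getrusage`, which Python exposes through the standard `resource` module. The sketch below is an illustration under that assumption; the helper name `context_switch_points` and the tuple representation of a data point are not defined by this document.

```python
import resource


def context_switch_points():
    """Return cumulative context switch counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return [
        ({"process.context_switch_type": "voluntary"}, usage.ru_nvcsw),
        ({"process.context_switch_type": "involuntary"}, usage.ru_nivcsw),
    ]
```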

Metric: process.paging.faults

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.paging.faults | Counter | {fault} | Number of page faults the process has made. | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| process.paging.fault_type | string | The type of page fault for this data point. Type major is for major/hard page faults, and minor is for minor/soft page faults. | major; minor | Recommended | Experimental |

process.paging.fault_type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
| --- | --- | --- |
| major | major | Experimental |
| minor | minor | Experimental |
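
A minimal sketch of the major/minor split, assuming a Unix-like system where `getrusage` reports `ru_majflt` (faults that required I/O) and `ru_minflt` (faults served without I/O). The helper name `page_fault_points` is hypothetical.

```python
import resource


def page_fault_points():
    """Return cumulative page fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return [
        ({"process.paging.fault_type": "major"}, usage.ru_majflt),
        ({"process.paging.fault_type": "minor"}, usage.ru_minflt),
    ]
```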

Metric: process.uptime

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| --- | --- | --- | --- | --- |
| process.uptime | Counter | s | The time the process has been running. [1] | Experimental |

[1]: Instrumentations SHOULD use a counter with type double and measure uptime with at least millisecond precision.
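
The sketch below illustrates reporting uptime as a floating-point number of seconds, which keeps sub-millisecond detail. It is a simplification: the hypothetical `UptimeClock` class measures from the moment it is created rather than from the true process start, so it would need to be constructed as early as possible or seeded with the OS-reported start time.

```python
import time


class UptimeClock:
    """Tracks elapsed seconds since construction as a float."""

    def __init__(self, now=time.monotonic):
        # A monotonic clock is unaffected by wall-clock adjustments.
        self._now = now
        self._start = now()

    def uptime_seconds(self):
        return float(self._now() - self._start)
```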