apachespark

Default Metrics

The following metrics are emitted by default. Each of them can be disabled by applying the following configuration:

metrics:
  <metric_name>:
    enabled: false
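
As an illustration, the fragment below sketches how the setting above might sit inside a full receiver configuration to disable one default metric. It assumes the receiver is registered under the key `apachespark` in the `receivers` section; that key is inferred from this document's title and is not confirmed here. The metric name is one of the metrics documented below.

```yaml
receivers:
  apachespark:  # assumed receiver key; adjust to match your collector configuration
    metrics:
      # Disable a single default metric by its full name.
      spark.driver.block_manager.disk.usage:
        enabled: false
```

Metrics that are not listed under `metrics` keep their default setting.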

spark.driver.block_manager.disk.usage

Disk space used by the BlockManager.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| mb | Sum | Int | Cumulative | false |

spark.driver.block_manager.memory.usage

Memory usage for the driver's BlockManager.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| mb | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| location | The location of the memory for which the metric was recorded. | Str: on_heap, off_heap |
| state | The state of the memory for which the metric was recorded. | Str: used, free |

spark.driver.code_generator.compilation.average_time

Average time spent during CodeGenerator source code compilation operations.

| Unit | Metric Type | Value Type |
| --- | --- | --- |
| ms | Gauge | Double |

spark.driver.code_generator.compilation.count

Number of source code compilation operations performed by the CodeGenerator.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { compilation } | Sum | Int | Cumulative | true |

spark.driver.code_generator.generated_class.average_size

Average class size of the classes generated by the CodeGenerator.

| Unit | Metric Type | Value Type |
| --- | --- | --- |
| bytes | Gauge | Double |

spark.driver.code_generator.generated_class.count

Number of classes generated by the CodeGenerator.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { class } | Sum | Int | Cumulative | true |

spark.driver.code_generator.generated_method.average_size

Average method size of the classes generated by the CodeGenerator.

| Unit | Metric Type | Value Type |
| --- | --- | --- |
| bytes | Gauge | Double |

spark.driver.code_generator.generated_method.count

Number of methods generated by the CodeGenerator.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { method } | Sum | Int | Cumulative | true |

spark.driver.code_generator.source_code.average_size

Average size of the source code generated by a CodeGenerator code generation operation.

| Unit | Metric Type | Value Type |
| --- | --- | --- |
| bytes | Gauge | Double |

spark.driver.code_generator.source_code.operations

Number of source code generation operations performed by the CodeGenerator.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { operation } | Sum | Int | Cumulative | true |

spark.driver.dag_scheduler.job.active

Number of active jobs currently being processed by the DAGScheduler.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { job } | Sum | Int | Cumulative | false |

spark.driver.dag_scheduler.job.count

Number of jobs that have been submitted to the DAGScheduler.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { job } | Sum | Int | Cumulative | true |

spark.driver.dag_scheduler.stage.count

Number of stages the DAGScheduler is either running or needs to run.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { stage } | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| status | The status of the DAGScheduler stages for which the metric was recorded. | Str: waiting, running |

spark.driver.dag_scheduler.stage.failed

Number of failed stages run by the DAGScheduler.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { stage } | Sum | Int | Cumulative | true |

spark.driver.executor.gc.operations

Number of garbage collection operations performed by the driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { gc_operation } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| gc_type | The type of the garbage collection performed for the metric. | Str: major, minor |

spark.driver.executor.gc.time

Total elapsed time during garbage collection operations performed by the driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| gc_type | The type of the garbage collection performed for the metric. | Str: major, minor |

spark.driver.executor.memory.execution

Amount of execution memory currently used by the driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| location | The location of the memory for which the metric was recorded. | Str: on_heap, off_heap |

spark.driver.executor.memory.jvm

Amount of memory used by the driver's JVM.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| location | The location of the memory for which the metric was recorded. | Str: on_heap, off_heap |

spark.driver.executor.memory.pool

Amount of pool memory currently used by the driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| type | The type of pool memory for which the metric was recorded. | Str: direct, mapped |

spark.driver.executor.memory.storage

Amount of storage memory currently used by the driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| location | The location of the memory for which the metric was recorded. | Str: on_heap, off_heap |

spark.driver.hive_external_catalog.file_cache_hits

Number of file cache hits on the HiveExternalCatalog.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { hit } | Sum | Int | Cumulative | true |

spark.driver.hive_external_catalog.files_discovered

Number of files discovered while listing the partitions of a table in the Hive metastore.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { file } | Sum | Int | Cumulative | true |

spark.driver.hive_external_catalog.hive_client_calls

Number of calls to the underlying Hive Metastore client made by the Spark application.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { call } | Sum | Int | Cumulative | true |

spark.driver.hive_external_catalog.parallel_listing_jobs

Number of parallel listing jobs initiated by the HiveExternalCatalog when listing partitions of a table.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { listing_job } | Sum | Int | Cumulative | true |

spark.driver.hive_external_catalog.partitions_fetched

Number of table partitions fetched by the HiveExternalCatalog.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { partition } | Sum | Int | Cumulative | true |

spark.driver.jvm_cpu_time

Current CPU time taken by the Spark driver.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ns | Sum | Int | Cumulative | true |

spark.driver.live_listener_bus.dropped

Number of events that have been dropped by the LiveListenerBus.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { event } | Sum | Int | Cumulative | true |

spark.driver.live_listener_bus.posted

Number of events that have been posted on the LiveListenerBus.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { event } | Sum | Int | Cumulative | true |

spark.driver.live_listener_bus.processing_time.average

Average time taken for the LiveListenerBus to process an event posted to it.

| Unit | Metric Type | Value Type |
| --- | --- | --- |
| ms | Gauge | Double |

spark.driver.live_listener_bus.queue_size

Number of events currently waiting to be processed by the LiveListenerBus.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { event } | Sum | Int | Cumulative | false |

spark.executor.disk.usage

Disk space used by this executor for RDD storage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

spark.executor.gc_time

Elapsed time the JVM spent in garbage collection in this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

spark.executor.input_size

Amount of data input for this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.executor.memory.usage

Storage memory used by this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

spark.executor.shuffle.io.size

Amount of data written and read during shuffle operations for this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| direction | Whether the metric refers to input or output operations. | Str: in, out |

spark.executor.storage_memory.usage

The executor's storage memory usage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| location | The location of the memory for which the metric was recorded. | Str: on_heap, off_heap |
| state | The state of the memory for which the metric was recorded. | Str: used, free |

spark.executor.task.active

Number of tasks currently running in this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | false |

spark.executor.task.limit

Maximum number of tasks that can run concurrently in this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | false |

spark.executor.task.result

Number of tasks with a specific result in this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| result | The result of the executor tasks for which the metric was recorded. | Str: completed, failed |

spark.executor.time

Elapsed time the JVM spent executing tasks in this executor.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

spark.job.stage.active

Number of active stages in this job.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { stage } | Sum | Int | Cumulative | false |

spark.job.stage.result

Number of stages with a specific result in this job.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { stage } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| result | The result of the job stages or tasks for which the metric was recorded. | Str: completed, failed, skipped |

spark.job.task.active

Number of active tasks in this job.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | false |

spark.job.task.result

Number of tasks with a specific result in this job.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| result | The result of the job stages or tasks for which the metric was recorded. | Str: completed, failed, skipped |

spark.stage.disk.spilled

The amount of disk space used for storing portions of overly large data chunks that couldn't fit in memory in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.stage.executor.cpu_time

CPU time spent by the executor in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ns | Sum | Int | Cumulative | true |

spark.stage.executor.run_time

Amount of time spent by the executor in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

spark.stage.io.records

Number of records written and read in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { record } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| direction | Whether the metric refers to input or output operations. | Str: in, out |

spark.stage.io.size

Amount of data written and read in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| direction | Whether the metric refers to input or output operations. | Str: in, out |

spark.stage.jvm_gc_time

The amount of time the JVM spent on garbage collection in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

spark.stage.memory.peak

Peak memory used by internal data structures created during shuffles, aggregations and joins in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.stage.memory.spilled

The amount of memory moved to disk due to size constraints (spilled) in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.stage.shuffle.blocks_fetched

Number of blocks fetched in shuffle operations in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { block } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| source | The source from which data was fetched for the metric. | Str: local, remote |

spark.stage.shuffle.fetch_wait_time

Time spent in this stage waiting for remote shuffle blocks.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ms | Sum | Int | Cumulative | true |

spark.stage.shuffle.io.disk

Amount of data read to disk in shuffle operations (sometimes required for large blocks, as opposed to the default behavior of reading into memory).

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.stage.shuffle.io.read.size

Amount of data read in shuffle operations in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| source | The source from which data was fetched for the metric. | Str: local, remote |

spark.stage.shuffle.io.records

Number of records written or read in shuffle operations in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { record } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| direction | Whether the metric refers to input or output operations. | Str: in, out |

spark.stage.shuffle.io.write.size

Amount of data written in shuffle operations in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

spark.stage.shuffle.write_time

Time spent blocking on writes to disk or buffer cache in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| ns | Sum | Int | Cumulative | true |

spark.stage.status

A one-hot encoding representing the status of this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { status } | Sum | Int | Cumulative | false |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| active | Whether the stage for which the metric was recorded is active. | Any Bool |
| complete | Whether the stage for which the metric was recorded is complete. | Any Bool |
| pending | Whether the stage for which the metric was recorded is pending. | Any Bool |
| failed | Whether the stage for which the metric was recorded has failed. | Any Bool |

spark.stage.task.active

Number of active tasks in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | false |

spark.stage.task.result

Number of tasks with a specific result in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| { task } | Sum | Int | Cumulative | true |

Attributes

| Name | Description | Values |
| --- | --- | --- |
| result | The result of the stage tasks for which the metric was recorded. | Str: completed, failed, killed |

spark.stage.task.result_size

The amount of data transmitted back to the driver by all the tasks in this stage.

| Unit | Metric Type | Value Type | Aggregation Temporality | Monotonic |
| --- | --- | --- | --- | --- |
| bytes | Sum | Int | Cumulative | true |

Resource Attributes

| Name | Description | Values | Enabled |
| --- | --- | --- | --- |
| spark.application.id | The ID of the application for which the metric was recorded. | Any Str | true |
| spark.application.name | The name of the application for which the metric was recorded. | Any Str | true |
| spark.executor.id | The ID of the executor for which the metric was recorded. | Any Str | true |
| spark.job.id | The ID of the job for which the metric was recorded. | Any Int | true |
| spark.stage.attempt.id | The ID of the stage attempt for which the metric was recorded. | Any Int | false |
| spark.stage.id | The ID of the application stage for which the metric was recorded. | Any Int | true |
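
The Enabled value for each resource attribute suggests that resource attributes can be toggled in a similar way to metrics. The fragment below is a minimal sketch of enabling `spark.stage.attempt.id`, which is off by default. It rests on two assumptions that this document does not confirm: that the receiver is registered under the key `apachespark`, and that a `resource_attributes` section mirrors the `metrics` section shown at the top.

```yaml
receivers:
  apachespark:  # assumed receiver key; adjust to match your collector configuration
    resource_attributes:  # assumed section name, mirroring the `metrics` section
      # Enable a resource attribute that is disabled by default.
      spark.stage.attempt.id:
        enabled: true
```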