Standard Field Mapping

Standard fields

In addition to supporting infinite custom fields, SparkLogs also defines the following standard fields:

  • timestamp: The timestamp of the log message itself.
  • ingested_timestamp: The timestamp when the log data was actually ingested into the system.
  • event_index: The index of the ingested event within the batch of events submitted in one ingestion request. This makes it possible to reconstruct the exact order of log events even if their timestamps are identical.
  • source: The name of the source of the event (e.g., Kubernetes podname or hostname of the device that generated the log event).
  • severity: The severity level of the log message.
  • facility: The facility level of the log message, if any (usually only set for syslog data).
  • app: A string field often set by log forwarding agents to indicate the application that generated the log event.
  • message: The original and unmodified string value of the log event.
  • category: One or more category labels (separated by .), as extracted by category extraction.
  • pattern and pattern_hash: The pattern (and corresponding hash) assigned by AutoClassify.
  • trace_id: A globally unique ID that tracks a single request across distributed systems.
  • span_id: An ID unique within a given trace that tracks a single operation within that trace.

The following reserved fields are also automatically populated based on the agent and organization that ingested the data (see the illustrative sketch after this list):

  • organization_id: The ID of the organization that owns the data.
  • agent_id: The ID of the agent that ingested the data.
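
To make the lists above concrete, here is a minimal, purely illustrative sketch of a parsed event carrying these standard and reserved fields. The field names follow the lists above; the object shape, optionality, and sample values are assumptions for illustration, not the exact ingestion wire format.

    // Illustrative only: an already-parsed event with SparkLogs standard fields.
    interface StandardLogEvent {
      timestamp: string;           // when the event itself occurred
      ingested_timestamp: string;  // when the event was ingested into the system
      event_index: number;         // position within the ingestion batch
      source: string;              // e.g. Kubernetes pod name or device hostname
      severity: string;            // e.g. "info", "warn", "error"
      facility?: string;           // usually only present for syslog data
      app?: string;                // application name set by the forwarding agent
      message: string;             // original, unmodified log line
      category?: string;           // dot-separated category labels
      pattern?: string;            // pattern assigned by AutoClassify
      pattern_hash?: string;       // hash of the AutoClassify pattern
      trace_id?: string;           // globally unique request ID
      span_id?: string;            // operation ID within the trace
      organization_id: string;     // reserved: owning organization
      agent_id: string;            // reserved: ingesting agent
      [custom: string]: unknown;   // plus any number of custom fields
    }

    const example: StandardLogEvent = {
      timestamp: "2024-05-01T12:00:00.123Z",
      ingested_timestamp: "2024-05-01T12:00:01.456Z",
      event_index: 0,
      source: "checkout-api-7d9f4",
      severity: "error",
      app: "checkout-api",
      message: "payment declined for order 1234",
      organization_id: "org_example",
      agent_id: "agent_example",
    };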

Automatic detection of standard fields

Various logging systems and log forwarding agents use widely different names for these standard fields, and for some sources the mapping may differ by event type (e.g., for Google Cloud Platform events, the source field may be populated from pod_name for Kubernetes and from instance_id for VMs).
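
As a rough illustration of such per-event-type mapping, the sketch below shows how a mapper could choose the standard source field for a Google Cloud Platform log entry. The resource.type and resource.labels paths reflect the GCP LogEntry format, but the function itself is a hypothetical illustration, not SparkLogs' actual implementation.

    // Hypothetical sketch: choosing the standard "source" field for a GCP log entry.
    // SparkLogs performs this kind of mapping automatically; this only illustrates the idea.
    interface GcpLogEntry {
      resource: {
        type: string;                   // e.g. "k8s_container" or "gce_instance"
        labels: Record<string, string>; // e.g. { pod_name: "..." } or { instance_id: "..." }
      };
    }

    function detectSource(entry: GcpLogEntry): string | undefined {
      switch (entry.resource.type) {
        case "k8s_container":
          return entry.resource.labels["pod_name"];    // Kubernetes: use the pod name
        case "gce_instance":
          return entry.resource.labels["instance_id"]; // VMs: use the instance ID
        default:
          return undefined;                            // other resource types: no mapping here
      }
    }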

SparkLogs will automatically detect and map fields from many common log data schemas.

Customizing severity mappings

For the field detected as the severity field, SparkLogs will interpret its value from any supported text (case insensitive) or numeric value. Numeric severity values must be in the range 1-24 and are interpreted as defined by the OpenTelemetry standard for severity numbers. Standard textual severities include trace, debug, info, notice or display (info3), warn, error or fail, critical (error4), fatal, alert (fatal2), panic (fatal3), and emergency (fatal4).
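
As a quick reference for the numeric range, the sketch below groups severity numbers 1-24 into textual levels. The band boundaries (1-4 trace, 5-8 debug, 9-12 info, 13-16 warn, 17-20 error, 21-24 fatal) follow the OpenTelemetry convention; the exact sub-level labels (e.g. info3) are shown here only for illustration.

    // Sketch of the OpenTelemetry severity-number bands (1-24).
    function severityNumberToText(n: number): string {
      if (n < 1 || n > 24) throw new Error("severity number must be in the range 1-24");
      const bands = ["trace", "debug", "info", "warn", "error", "fatal"];
      const band = bands[Math.floor((n - 1) / 4)]; // 1-4 trace, 5-8 debug, ..., 21-24 fatal
      const sub = ((n - 1) % 4) + 1;               // position within the band (1-4)
      return sub === 1 ? band : `${band}${sub}`;   // e.g. 9 -> "info", 11 -> "info3"
    }

    // Examples: 1 -> "trace", 11 -> "info3" (notice/display), 20 -> "error4" (critical), 24 -> "fatal4" (emergency)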

If you have non-standard severity values, you can either transform these values before shipping the logs (e.g., using Vector VRL transformations) or, more conveniently, use the custom severity mapping feature.

To specify a custom mapping, use the X-Severity-Map HTTP header. This is a comma-delimited list of key=value pairs that specifies additional mappings from a custom severity level to a standard severity level.

For example, the Node.js bunyan logging library uses numeric log levels from 10 (trace) to 60 (fatal). You could remap these with an X-Severity-Map HTTP header value of 10=TRACE,20=DEBUG,30=INFO,40=WARN,50=ERROR,60=FATAL.
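
For instance, a log shipper that posts events over HTTP could attach this mapping to each ingestion request. Below is a minimal sketch assuming a hypothetical ingestion endpoint, API key, and request body shape; only the X-Severity-Map header and its value come from the documentation above, the rest is placeholder.

    // Hypothetical sketch: attaching a custom severity mapping to an ingestion request.
    // The endpoint URL, auth header, and body shape are placeholders for illustration.
    async function shipLogs(): Promise<void> {
      const events = [
        { time: new Date().toISOString(), level: 50, msg: "payment declined" }, // bunyan-style numeric level
      ];

      const response = await fetch("https://ingest.example.com/v1/logs", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": "Bearer YOUR_API_KEY",
          // Remap bunyan's numeric levels onto standard severities:
          "X-Severity-Map": "10=TRACE,20=DEBUG,30=INFO,40=WARN,50=ERROR,60=FATAL",
        },
        body: JSON.stringify(events),
      });

      if (!response.ok) {
        throw new Error(`ingestion failed with status ${response.status}`);
      }
    }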

Timestamp constraints

Tip: When log data is ingested, if the event's timestamp is older than your configured retention period (or, if you are using the SparkLogs cloud, older than 50 days), then the timestamp will be set to ingested_timestamp and the original timestamp will be stored in an original_timestamp custom field.

Also, if there is no timestamp in the structured log data provided by the logging agent, AutoExtract will attempt to determine the event timestamp from the log message. If nothing relevant is detected, it will use the current date/time for the ingested log event. If you prefer that AutoExtract use the timestamp found in the message first and only fall back to the timestamp sent by your log forwarding agent, configure your log forwarding agent to send its timestamp in the observedtimestamp field.
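
For example, a forwarder payload that lets an in-message timestamp win might look roughly like the sketch below. The field layout is illustrative; only the observedtimestamp field name comes from the behavior described above.

    // Illustrative only: the agent sends its own clock as "observedtimestamp" so that
    // AutoExtract prefers a timestamp found inside the message text and falls back
    // to this value only if the message contains no usable timestamp.
    const forwardedEvent = {
      observedtimestamp: new Date().toISOString(), // agent receive time (fallback)
      message: "2024-05-01T12:00:00.123Z ERROR payment declined for order 1234",
      source: "checkout-api-7d9f4",
    };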