Standard Field Mapping
Standard fields
In addition to supporting an unlimited number of custom fields, SparkLogs defines the following standard fields:
timestamp
: The timestamp of the log message itself.

ingested_timestamp
: The timestamp when the log data was actually ingested into the system.

event_index
: The index of the ingested event within the batch of events submitted in one ingestion request. This makes it possible to reconstruct the exact order of log events even if their timestamps are identical.

source
: The name of the source of the event (e.g., the Kubernetes pod name or the hostname of the device that generated the log event).

severity
: The severity level of the log message.

facility
: The facility level of the log message, if any (usually only set for syslog data).

app
: A string field often set by log forwarding agents to indicate the application that generated the log event.

message
: The original and unmodified string value of the log event.

category
: One or more category labels (separated by `.`), as extracted by category extraction.

pattern and pattern_hash
: The pattern (and corresponding hash) assigned by AutoClassify.

trace_id
: A globally unique ID that tracks a single request across distributed systems.

span_id
: An ID unique within a given trace that tracks a single operation within that trace.
The following reserved fields are also populated automatically based on the organization and agent that ingested the data:
organization_id
: The ID of the organization that owns the data.

agent_id
: The ID of the agent that ingested the data.
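For illustration, a single ingested event might carry these fields as shown in the sketch below. This is a hypothetical example; the exact stored representation and all values here are assumptions, not actual SparkLogs output.

```python
# Hypothetical example of an ingested event with standard fields populated.
# Field names match the standard fields above; all values are made up.
event = {
    "timestamp": "2024-05-01T12:34:56.789Z",           # when the event occurred
    "ingested_timestamp": "2024-05-01T12:34:57.012Z",  # when it was ingested
    "event_index": 3,             # position within the ingestion batch
    "source": "checkout-7d9f4",   # e.g., Kubernetes pod name
    "severity": "error",
    "app": "checkout",
    "message": "payment gateway timeout after 30s",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "span_id": "00f067aa0ba902b7",
    # Reserved fields populated automatically at ingestion time:
    "organization_id": "org-1234",
    "agent_id": "agent-5678",
}
```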
Automatic detection of standard fields
Various logging systems and log forwarding agents use widely different names for these standard fields, and for some sources the mapping may differ by event type (e.g., for Google Cloud Platform events, the `source` field may be set from `pod_name` for K8s and from `instance_id` for VMs).
SparkLogs will automatically detect and map fields from many common log data schemas, including:
- syslog
- OpenTelemetry
- Elastic Common Schema
- vector.dev
- Fluent Bit
- HEC (Splunk)
- Windows Event Log
- AWS CloudTrail
- Google Cloud Logging
- zap
- log4j
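To give a flavor of what this detection involves, the sketch below normalizes a few common schema spellings onto the standard field names. This is a simplified illustration only; the alias lists here are assumptions, and the actual detection SparkLogs performs is schema-aware and far more extensive (e.g., per-event-type logic for Google Cloud Logging, as noted above).

```python
# Simplified illustration of mapping common schema field names onto
# SparkLogs standard fields. These alias lists are examples only,
# not the actual mapping tables used by SparkLogs.
FIELD_ALIASES = {
    "timestamp": ["timestamp", "@timestamp", "time", "ts", "eventTime"],
    "severity": ["severity", "level", "log.level", "severity_text", "pri"],
    "message": ["message", "msg", "log", "body", "event.original"],
    "source": ["source", "host", "hostname", "pod_name", "instance_id"],
}

def normalize(record: dict) -> dict:
    """Copy recognized alias fields onto the standard field names."""
    out = dict(record)
    for standard, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in record and standard not in out:
                out[standard] = record[alias]
                break
    return out

# Example: an Elastic Common Schema-style record.
print(normalize({"@timestamp": "2024-05-01T12:00:00Z",
                 "log.level": "warn", "message": "disk 90% full"}))
```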
Customizing severity mappings
For the field detected as the `severity` field, SparkLogs interprets its value from any supported textual (case-insensitive) or numeric value. Numeric severity values must be in the range 1-24 and are interpreted as defined by the OpenTelemetry standard for severity values.
Standard textual severities include `trace`, `debug`, `info`, `notice` or `display` (`info3`), `warn`, `error` or `fail`, `critical` (`error4`), `fatal`, `alert` (`fatal2`), `panic` (`fatal3`), and `emergency` (`fatal4`).
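Concretely, the OpenTelemetry convention divides the 1-24 range into six bands of four numbers each. A minimal sketch of that interpretation:

```python
# Interpret an OpenTelemetry severity number (1-24) as a level name.
# Bands of four: 1-4 TRACE, 5-8 DEBUG, 9-12 INFO, 13-16 WARN,
# 17-20 ERROR, 21-24 FATAL; e.g., 11 -> INFO3, 20 -> ERROR4.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def otel_severity_name(n: int) -> str:
    if not 1 <= n <= 24:
        raise ValueError("severity number must be in the range 1-24")
    band, offset = divmod(n - 1, 4)
    name = LEVELS[band]
    return name if offset == 0 else f"{name}{offset + 1}"

assert otel_severity_name(11) == "INFO3"
assert otel_severity_name(21) == "FATAL"
```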
If you have non-standard severity values, you can either transform these values before shipping the logs (e.g., using vector VRL transformations) or, more conveniently, use the custom severity mapping feature.
To specify a custom mapping, use the `X-Severity-Map` HTTP header. This is a comma-delimited list of `key=value` pairs that specifies additional mappings from a custom severity level to a standard severity level.
For example, the Node.js bunyan logging library uses numeric log levels from `10` (`trace`) to `60` (`fatal`). You could remap these with an `X-Severity-Map` HTTP header value of `10=TRACE,20=DEBUG,30=INFO,40=WARN,50=ERROR,60=FATAL`.
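As a sketch, a raw ingestion request carrying this header might look like the following. The endpoint URL, authorization scheme, and JSON body shape are assumptions for illustration; only the `X-Severity-Map` header format comes from this section. Consult your ingestion endpoint's documentation for the actual values.

```python
import requests  # third-party: pip install requests

# Hypothetical ingestion request remapping bunyan's numeric levels.
# The URL, auth header, and body shape are placeholders, not the
# actual SparkLogs API; only the X-Severity-Map value is from above.
resp = requests.post(
    "https://ingest.example.com/v1/logs",  # placeholder endpoint
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder auth
        "X-Severity-Map": "10=TRACE,20=DEBUG,30=INFO,40=WARN,50=ERROR,60=FATAL",
    },
    json=[{"time": "2024-05-01T12:00:00Z", "level": 50, "msg": "upstream timed out"}],
)
resp.raise_for_status()
```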
Timestamp constraints
When log data is ingested, if the `timestamp` is older than your configured retention period (or if you are using the SparkLogs cloud and the event is older than 50 days), then the timestamp will be set to `ingested_timestamp` and the original timestamp will be stored in an `original_timestamp` custom field.
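A sketch of that fallback rule, assuming ISO-8601 timestamps with a UTC offset (this illustrates the behavior described above, not SparkLogs internals):

```python
from datetime import datetime, timedelta, timezone

def apply_timestamp_constraint(event: dict, retention: timedelta) -> dict:
    """If the event timestamp is older than the retention window, replace it
    with the ingestion time and preserve it in original_timestamp."""
    now = datetime.now(timezone.utc)  # stands in for ingested_timestamp
    ts = datetime.fromisoformat(event["timestamp"])
    if ts < now - retention:
        event["original_timestamp"] = event["timestamp"]
        event["timestamp"] = now.isoformat()
    return event
```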
Also, if there is no `timestamp` in the structured log data provided by the logging agent, then AutoExtract will attempt to determine the event timestamp from the log message. If nothing relevant is detected, it will use the current date/time for the ingested log event. If you prefer that AutoExtract use the timestamp in the message first and then fall back to the timestamp sent by your log forwarding agent, configure your log forwarding agent to use `observedtimestamp` as the name of the timestamp field it sends.
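For example, an agent payload shaped like the one below would let AutoExtract prefer any timestamp found in the message text, falling back to the agent's own clock. The payload shape is illustrative; only the `observedtimestamp` field name comes from this section.

```python
# Illustrative agent payload: naming the agent's clock field
# "observedtimestamp" (instead of "timestamp") tells AutoExtract to
# prefer a timestamp parsed from the message text, and to fall back
# to this value only if none is found. Payload shape is assumed.
event = {
    "observedtimestamp": "2024-05-01T12:00:01+00:00",  # agent's own clock
    "message": "2024-05-01T11:59:58Z ERROR upstream timed out",
    "source": "web-1",
}
```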