Example and Key Conventions
The capabilities of AutoExtract can be understood quickly through an illustrative example:
AutoExtract will take this raw log line
BusinessLogic: Orders: debug [production] 29.20.23.224 - id=1315 user: 'happy bear' modified 03/Apr/2023:18:45:51 -0600
Req(`HEAD /orders/list HTTP/2.0`) perf={"time": 0.5, "stats": {"cpu": 0.1}} code(200 OK) {"bytes": 2581}
{"tags": ["http", "HEAD", "orders"]}
and automatically add these fields to the ingested log event:
YAML:

severity: "debug"
category: "BusinessLogic.Orders"
pattern_hash: "BOdm_858lc"
pattern: "BusinessLogic: Orders: debug <ipaddr> - modified <timestamp>"
x:
  b:
    - "production"
  ts:
    - 2023-04-03T18:45:51-0600
  ips:
    - "29.20.23.224"
  var:
    - "29.20.23.224"
  id: 1315
  user: "happy bear"
  Req: "HEAD /orders/list HTTP/2.0"
  perf:
    time: 0.5
    stats:
      cpu: 0.1
  code: "200 OK"
  bytes: 2581
  tags:
    - http
    - HEAD
    - orders
JSON:

{
  "severity": "debug",
  "category": "BusinessLogic.Orders",
  "pattern": "BusinessLogic: Orders: debug <ipaddr> - modified <timestamp>",
  "pattern_hash": "BOdm_858lc",
  "x": {
    "b": [
      "production"
    ],
    "ts": [
      1680569151000000
    ],
    "ips": [
      "29.20.23.224"
    ],
    "var": [
      "29.20.23.224"
    ],
    "id": 1315,
    "user": "happy bear",
    "Req": "HEAD /orders/list HTTP/2.0",
    "perf": {
      "time": 0.5,
      "stats": {
        "cpu": 0.1
      }
    },
    "code": "200 OK",
    "bytes": 2581,
    "tags": [
      "http",
      "HEAD",
      "orders"
    ]
  }
}
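
Note that the two views encode the detected timestamp differently: the YAML view keeps the ISO-8601 form, while the JSON view shows what appears to be the same instant as epoch microseconds. The following short Python sketch (not part of AutoExtract, just a check of the arithmetic) confirms the two values refer to the same moment:

from datetime import datetime, timezone, timedelta

# ISO-8601 value from the YAML view, with its -06:00 offset.
iso_value = datetime(2023, 4, 3, 18, 45, 51, tzinfo=timezone(timedelta(hours=-6)))
# Value from the JSON view, read here as epoch microseconds (an assumption).
epoch_micros = 1680569151000000

assert int(iso_value.timestamp()) * 1_000_000 == epoch_micros
print(iso_value.astimezone(timezone.utc))  # 2023-04-04 00:45:51+00:00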
Key ideas from this example:
- The log severity was extracted automatically from the text, rather than a field set by the log forwarding agent.
- Automatically extracted values are placed as subfields of the `x` field, which is reserved for AutoExtract.
- Various syntaxes for key/value pairs are supported (`key=value`, `key: value`, `key(value)`).
- Various syntaxes for quoting values that contain spaces are supported (`"`, `'`, `()`, `[]`, backtick).
- Date/time values do not have to be quoted to be recognized and extracted, even if they contain spaces.
- JSON data can be the value in a key/value pair (e.g., the `perf` value), or can appear anywhere in the line and still be picked up (e.g., the JSON that contained the `bytes` value).
- JSON data can be arbitrarily deep, can contain an unlimited number of custom fields, and can contain complex values such as arrays.
- The special fields `x.b[]`, `x.ips[]`, and `x.ts[]` store any detected bracketed values, IP addresses, and timestamps not associated with key/value pairs.
- The automatically extracted category is stored in the `category` field.
- The automatic pattern classification is stored in the `pattern` field.
- A shorthand, (likely) unique hash of the pattern classification is stored in the `pattern_hash` field.
Automatically extracting structured data unlocks better querying. Adapt your log messages as needed to follow AutoExtract conventions, and enjoy zero-configuration unlimited custom fields.
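
As an illustration of what following these conventions can look like in application code, here is a minimal, hypothetical Python sketch. The logger name, function, field names, and values are illustrative only and are not part of AutoExtract itself:

import json
import logging

logging.basicConfig(format="%(message)s", level=logging.DEBUG)
log = logging.getLogger("BusinessLogic.Orders")

def log_order_listing(user, order_id, duration_s, status):
    # Hypothetical helper: compose a single first line that uses the
    # key=value / key: value / key(value) conventions described above,
    # and embed richer structure as inline JSON.
    perf = json.dumps({"time": duration_s, "stats": {"cpu": 0.1}})
    log.debug(
        f"BusinessLogic: Orders: debug id={order_id} user: '{user}' "
        f"Req(`HEAD /orders/list HTTP/2.0`) perf={perf} code({status})"
    )

log_order_listing("happy bear", 1315, 0.5, "200 OK")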
Continuing the orders example, here is how you would search for any "OK" responses that took between 2 and 30 seconds and carried either a "GET" or a "POST" tag:
LQL:
x.code: OK and x.perf.time between 2.0 and 30 and x.tags in (get, post)
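
For intuition about what this query matches, the Python sketch below evaluates an equivalent predicate against the extracted JSON event shown earlier. The exact LQL semantics (for example, whether `between` is inclusive and whether tag matching is case-insensitive) are assumptions here, not documented behavior:

def matches(event):
    x = event.get("x", {})
    tags = {t.lower() for t in x.get("tags", [])}
    return (
        "OK" in str(x.get("code", ""))                        # x.code: OK
        and 2.0 <= x.get("perf", {}).get("time", 0) <= 30     # between 2.0 and 30 (assumed inclusive)
        and bool(tags & {"get", "post"})                      # x.tags in (get, post), assumed case-insensitive
    )

event = {"x": {"code": "200 OK", "perf": {"time": 0.5}, "tags": ["http", "HEAD", "orders"]}}
print(matches(event))  # False: time is 0.5 s and there is no GET/POST tag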
Many of the AutoExtract conventions apply only to the first line of text in each event message. If you are sending multi-line log messages, the full AutoExtract and AutoClassify process is applied to the first line of each log message, while any JSON detected on the additional lines is still extracted. This ensures that, for example, exception backtraces are not unexpectedly interpreted as dynamic field data by AutoExtract.
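
To illustrate that behavior, consider the hypothetical multi-line message below (a sketch, not a verbatim product example). The first line would receive the full key/value and classification treatment, the traceback lines would be left alone, and the JSON object on the final line would still be extracted:

multi_line_message = (
    "BusinessLogic: Orders: error id=1315 user: 'happy bear' lookup failed\n"
    "Traceback (most recent call last):\n"
    '  File "orders.py", line 42, in list_orders\n'
    "ValueError: order lookup failed\n"
    '{"tags": ["http", "orders"]}'
)
# Per the paragraph above: the first line gets full AutoExtract/AutoClassify
# processing (id, user, severity, pattern), the traceback lines are not
# treated as dynamic fields, and the JSON on the last line is still extracted.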