Logstash Agent

Logstash is an open-source log processing and forwarding agent that is popular in the Elastic and OpenSearch communities.

SparkLogs can receive data via the Elasticsearch and OpenSearch bulk indexing REST API, and can therefore ingest data from Logstash. Metrics data are not yet supported.

Since SparkLogs is schemaless, no configuration or management of index templates is required. You can simply configure any Logstash agent to output data to the SparkLogs cloud using the elasticsearch bulk indexing API, and you're all set.
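
To illustrate what this looks like at the wire level, here is a minimal Python sketch of a raw bulk-indexing request to SparkLogs. It assumes the standard Elasticsearch _bulk endpoint path and reuses the region and agent-credential placeholders from the configuration template below; in practice, Logstash handles all of this for you.

import requests

# Placeholders: substitute your region and agent credentials (see step 1 below).
URL = "https://es8.ingest-<REGION>.engine.sparklogs.app:443/_bulk"
AGENT_ID = "<AGENT-ID>"
AGENT_ACCESS_TOKEN = "<AGENT-ACCESS-TOKEN>"

# The bulk API body is NDJSON: an action line followed by the document itself.
# Because SparkLogs is schemaless, no index template needs to exist beforehand.
body = (
    '{"index":{}}\n'
    '{"message":"hello from the bulk API","app":"demo"}\n'
)

resp = requests.post(
    URL,
    data=body,
    auth=(AGENT_ID, AGENT_ACCESS_TOKEN),  # HTTP basic auth with agent credentials
    headers={"Content-Type": "application/x-ndjson"},
)
print(resp.status_code, resp.text)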

How to Use

Follow these steps for each logical agent that will receive data from a Logstash agent:

1. Create agent and get config template

In the app, click the Configure sidebar button, and then click the Agents tab.

As appropriate, create a new agent, or highlight an existing agent and click View API Key. In the dialog that shows the agent configuration template, click the Logstash tab and copy the configuration template.

2. Customize configuration

Customize the pipeline configuration template based on your needs. At a minimum, add any additional inputs you need (e.g., for files, kernel logs, etc.), as shown in the sketch after the template below.

Place the pipeline configuration file (e.g., logstash.conf) into the appropriate directory, typically /etc/logstash/conf.d.

Example Logstash pipeline configuration template
input {
  # Any inputs you need: files, etc.
  ...
}

filter {
  # Filter or mutate input logs as needed, e.g., to add tags or additional fields from env vars.
  #mutate {
  #  add_tag => [ "myapp" ]
  #  add_field => {
  #    "podname" => "${POD_HOSTNAME}"
  #  }
  #}
}

output {
  elasticsearch {
    # The :443 port suffix is required. Use the es6, es7, or es8 subdomain as
    # appropriate for compatibility with your version of Logstash.
    hosts => ["https://es8.ingest-<REGION>.engine.sparklogs.app:443/"]
    ssl => true

    user => "<AGENT-ID>"
    password => "<AGENT-ACCESS-TOKEN>"
    # You could also get these values from environment variables:
    # user => "${SPARKLOGS_AGENT_ID}"
    # password => "${SPARKLOGS_AGENT_ACCESS_TOKEN}"

    http_compression => true
    resurrect_delay => 20
    retry_max_interval => 128
    timeout => 120
  }

  # Optional: other outputs, such as printing logs to stdout for debugging.
  #stdout {
  #  codec => rubydebug
  #}
}
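
For example, a minimal customization that tails application log files and tags them might look like the following (the path and tag values are hypothetical placeholders):

input {
  # Tail all log files for the application; read existing content on first run.
  file {
    path => ["/var/log/myapp/*.log"]
    start_position => "beginning"
  }
}

filter {
  # Tag events so they are easy to filter in SparkLogs.
  mutate {
    add_tag => [ "myapp" ]
  }
}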

3. Optimize Logstash configuration

In addition to any pipelines that you configure, you will also want to tune Logstash for high-throughput output to SparkLogs. Edit the /etc/logstash/logstash.yml file:

# Increase based on the total MB/sec you will need for your required throughput
# (each worker handles one concurrent request, and thus contributes about 1 MB/sec).
pipeline.workers: 8

# Set to the number of events that will mean each batch is roughly 1 MB in size.
# For example, if an average log event is 1 KB, set this to 1024.
pipeline.batch.size: 1024

# To encourage full batches, wait up to 2 seconds before sending an undersized batch.
pipeline.batch.delay: 2000
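
As a worked example, assuming a hypothetical average event size of about 512 bytes and a target throughput of roughly 4 MB/sec:

# 4 workers, each handling about 1 MB/sec, for roughly 4 MB/sec total
pipeline.workers: 4

# ~1 MB per batch / ~512 bytes per event = 2048 events per batch
pipeline.batch.size: 2048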

4. Deploy Logstash agents

On each system that will ship data to SparkLogs for this logical agent, install the Logstash agent software with the appropriate configuration, and make sure it starts on system boot.
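
On systemd-based Linux distributions, this typically means enabling the logstash service (for example, with systemctl enable logstash) so that it starts automatically at boot.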