Robert Fratto authored
Processing Log Lines
A detailed look at how to set up promtail to process your log lines, including extracting metrics and labels.
Pipeline
Pipeline stages implement the following interface:
type Stage interface {
	Process(labels model.LabelSet, extracted map[string]interface{}, time *time.Time, entry *string)
}
Any Stage is capable of modifying the labels, extracted data, time, and/or entry, though generally a Stage should only modify one of those things to reduce complexity.
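As a rough illustration of the interface, the sketch below implements a single stage that only touches the labels. LabelSet is a local stand-in for model.LabelSet so the example compiles without dependencies, and labelFromExtracted and runPipeline are hypothetical names, not part of promtail.

```go
package main

import "time"

// LabelSet is a local stand-in for model.LabelSet
// (github.com/prometheus/common/model), used only to keep this sketch
// dependency-free.
type LabelSet map[string]string

// Stage mirrors the interface shown above.
type Stage interface {
	Process(labels LabelSet, extracted map[string]interface{}, t *time.Time, entry *string)
}

// labelFromExtracted is a hypothetical stage that promotes one extracted
// key to a label. It modifies only the labels, nothing else.
type labelFromExtracted struct {
	key string
}

func (s labelFromExtracted) Process(labels LabelSet, extracted map[string]interface{}, t *time.Time, entry *string) {
	// Only string values are promoted; missing or non-string values are skipped.
	if v, ok := extracted[s.key].(string); ok {
		labels[s.key] = v
	}
}

// runPipeline is a hypothetical driver that applies each stage in order
// to the same labels, extracted map, timestamp, and entry.
func runPipeline(stages []Stage, labels LabelSet, extracted map[string]interface{}, t *time.Time, entry *string) {
	for _, st := range stages {
		st.Process(labels, extracted, t, entry)
	}
}
```

Because every stage receives the same maps and pointers, a change made by one stage is visible to the stages that run after it.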
Typical pipelines start with a regex or json stage to extract data from the log line, followed by any combination of other stages that use the data in the extracted map. It is also common to use a match stage at the start of a pipeline to selectively apply stages based on labels.
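To make the extraction step concrete, here is a minimal sketch of how named capture groups could populate an extracted map, using Go's standard regexp package. The helper name extractNamed is hypothetical and this is not promtail's actual implementation.

```go
package main

import "regexp"

// extractNamed is a hypothetical helper: it compiles expr, matches it
// against line, and returns every named capture group as an entry in a
// new extracted map. A line that does not match yields an empty map.
func extractNamed(expr, line string) (map[string]interface{}, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	extracted := map[string]interface{}{}
	m := re.FindStringSubmatch(line)
	if m == nil {
		return extracted, nil
	}
	// SubexpNames is indexed like the submatch slice; unnamed groups
	// (and the whole match at index 0) have an empty name.
	for i, name := range re.SubexpNames() {
		if name != "" {
			extracted[name] = m[i]
		}
	}
	return extracted, nil
}
```

Later stages can then look up keys such as level or component in the returned map instead of parsing the line again.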
The example below gives a good glimpse of what you can achieve with a pipeline:
scrape_configs:
- job_name: kubernetes-pods-name
  kubernetes_sd_configs: ....
  pipeline_stages:
  - match:
      selector: '{name="promtail"}'
      stages:
      - regex:
          expression: '.*level=(?P<level>[a-zA-Z]+).*ts=(?P<timestamp>[T\d-:.Z]*).*component=(?P<component>[a-zA-Z]+)'
      - labels:
          level:
          component:
      - timestamp:
          format: RFC3339Nano
          source: timestamp
  - match:
      selector: '{name="nginx"}'
      stages:
      - regex:
          expression: \w{1,3}.\w{1,3}.\w{1,3}.\w{1,3}(?P<output>.*)
      - output:
          source: output
  - match:
      selector: '{name="jaeger-agent"}'
      stages:
      - json:
          expressions:
            level: level
      - labels:
          level:
- job_name: kubernetes-pods-app
  kubernetes_sd_configs: ....
  pipeline_stages:
  - match:
      selector: '{app=~"grafana|prometheus"}'
      stages:
      - regex:
          expression: ".*(lvl|level)=(?P<level>[a-zA-Z]+).*(logger|component)=(?P<component>[a-zA-Z]+)"
      - labels:
          level:
          component:
  - match:
      selector: '{app="some-app"}'
      stages:
      - regex:
          expression: ".*(?P<panic>panic: .*)"
      - metrics:
        - panic_total:
            type: Counter
            description: "total count of panic"
            source: panic
            config:
              action: inc
In the first job:
The first match stage runs only if the label name equals promtail. It applies a regex to parse the line, then sets two labels (level and component) and the timestamp from the extracted data.
The second match stage runs only if the label name equals nginx. It parses the log line with a regex and extracts output, which then becomes the log line sent to Loki.
The third match stage runs only if the label name equals jaeger-agent. It parses the log line as JSON and extracts level, which is then set as a label.
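The timestamp step of the first match stage can be pictured with the sketch below. It assumes the RFC3339Nano format name corresponds to Go's time.RFC3339Nano layout and that the source value was stored in the extracted map as a string. parseExtractedTime is a hypothetical helper, not promtail's code.

```go
package main

import (
	"fmt"
	"time"
)

// parseExtractedTime is a hypothetical helper that reads the value stored
// under source in the extracted map and parses it with the RFC3339Nano
// layout, which is what a timestamp stage would then use as the entry time.
func parseExtractedTime(extracted map[string]interface{}, source string) (time.Time, error) {
	raw, ok := extracted[source].(string)
	if !ok {
		return time.Time{}, fmt.Errorf("extracted key %q is missing or not a string", source)
	}
	return time.Parse(time.RFC3339Nano, raw)
}
```

Returning an error rather than silently falling back is a choice made for this sketch only. How a real stage reacts to an unparsable timestamp is not covered here.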
In the second job:
The first match stage runs only if the label app equals grafana or prometheus. It parses the log line with a regex and sets two new labels, level and component, from the extracted data.
The second match stage runs only if the label app equals some-app. It parses the log line with a regex and creates an extracted key named panic if it finds panic: in the log line. A metrics stage then increments a counter if the key panic is present in the extracted map.
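The regex-then-metrics flow of the some-app stage can be sketched as follows. The panicCounter type is a hypothetical stand-in for the panic_total Counter, with a plain integer in place of a real Prometheus metric, so this only illustrates the control flow.

```go
package main

import "regexp"

// panicRE is the expression from the some-app match stage.
var panicRE = regexp.MustCompile(`.*(?P<panic>panic: .*)`)

// panicCounter is a hypothetical stand-in for the panic_total Counter.
type panicCounter struct {
	total int
}

// observe runs the regex step and then the metrics step for one log line.
func (c *panicCounter) observe(line string) {
	extracted := map[string]interface{}{}

	// Regex step: the key "panic" is created only when the line matches.
	// Group 1 is the named group "panic".
	if m := panicRE.FindStringSubmatch(line); m != nil {
		extracted["panic"] = m[1]
	}

	// Metrics step (action: inc): increment only if the source key exists.
	if _, ok := extracted["panic"]; ok {
		c.total++
	}
}
```

Lines without panic: leave the extracted map empty, so the counter is untouched for them.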
More info on each field in the interface:
labels
A set of Prometheus-style labels that are sent with the log line and indexed by Loki.