From b74db24a007511d437c459aa36c693dc7dae8409 Mon Sep 17 00:00:00 2001 From: Robert Fratto <robert.fratto@grafana.com> Date: Fri, 6 Sep 2019 13:40:59 -0400 Subject: [PATCH] docs: re-add docs from #654 --- docs/loki/api.md | 335 ++++++++++++++++++++++++++++++++++++----------- docs/querying.md | 60 ++++++++- 2 files changed, 315 insertions(+), 80 deletions(-) diff --git a/docs/loki/api.md b/docs/loki/api.md index db116017..480d7b57 100644 --- a/docs/loki/api.md +++ b/docs/loki/api.md @@ -1,114 +1,291 @@ -# API +# Loki API The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project): -### `POST /api/prom/push` +- `POST /api/prom/push` -For sending log entries, expects a snappy compressed proto in the HTTP Body: + For sending log entries, expects a snappy compressed proto in the HTTP Body: -- [ProtoBuffer definition](/pkg/logproto/logproto.proto) -- [Golang client library](/pkg/promtail/client/client.go) + - [ProtoBuffer definition](/pkg/logproto/logproto.proto) + - [Golang client library](/pkg/promtail/client/client.go) -Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: + Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: -```json -{ - "streams": [ - { - "labels": "{foo=\"bar\"}", - "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] - } - ] -} -``` + ```json + { + "streams": [ + { + "labels": "{foo=\"bar\"}", + "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] + } + ] + } + + ``` + +- `GET /api/v1/query` + + For doing instant queries at a single point in time, accepts the following parameters in the query-string: + + - `query`: a logQL query + - `limit`: max number of entries to return (not used for metric queries) + - `time`: the evaluation time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. 
+ - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. + + Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, + so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional + load to the index server and make the query slower. + + Responses looks like this: + + ```json + { + "resultType": "vector" | "streams", + "result": <value> + } + ``` + + Examples: + + ```bash + $ curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' | jq + { + "resultType": "vector", + "result": [ + { + "metric": {}, + "value": [ + 1559848867745737, + "1267.1266666666666" + ] + }, + { + "metric": { + "level": "warn" + }, + "value": [ + 1559848867745737, + "37.77166666666667" + ] + }, + { + "metric": { + "level": "info" + }, + "value": [ + 1559848867745737, + "37.69" + ] + } + ] + } + ``` -### `GET /api/prom/query` + ```bash + curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query={job="varlogs"}' | jq + { + "resultType": "streams", + "result": [ + { + "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", + "entries": [ + { + "ts": "2019-06-06T19:25:41.972739Z", + "line": "foo" + }, + { + "ts": "2019-06-06T19:25:41.972722Z", + "line": "bar" + } + ] + } + ] + ``` -For doing queries, accepts the following parameters in the query-string: +- `GET /api/v1/query_range` -- `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) -- `limit`: max number of entries to return -- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago. 
-- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. -- `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. -- `regexp`: a regex to filter the returned results + For doing queries over a range of time, accepts the following parameters in the query-string: -Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, -so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional -load to the index server and make the query slower. + - `query`: a logQL query + - `limit`: max number of entries to return (not used for metric queries) + - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always one hour ago. + - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. + - `step`: query resolution step width in seconds. Default 1 second. + - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. -Responses looks like this: + Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, + so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional + load to the index server and make the query slower. 
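A note on the timestamp format: the `start`, `end`, and `time` parameters above are plain nanosecond Unix epochs. A minimal Python sketch for producing such values (the `ns_epoch` helper is illustrative, not part of Loki or any of its clients):

```python
import time

def ns_epoch(seconds_ago=0):
    """Return a Unix timestamp in nanoseconds, optionally shifted into the past."""
    return int((time.time() - seconds_ago) * 1e9)

end = ns_epoch()        # "now", the documented default for `end`
start = ns_epoch(3600)  # one hour ago, the documented default for `start`
print(start, end)
```

Keeping the window between `start` and `end` small avoids the extra index-server load described above.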
-```json -{ - "streams": [ + Responses looks like this: + + ```json + { + "resultType": "matrix" | "streams", + "result": <value> + } + ``` + + Examples: + + ```bash + $ curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' --data-urlencode 'step=300' | jq + { + "resultType": "matrix", + "result": [ { - "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", - "entries": [ - { - "ts": "2018-06-27T05:20:28.699492635Z", - "line": "..." + "metric": { + "level": "info" }, - ... - ] - }, - ... - ] -} -``` + "values": [ + [ + 1559848958663735, + "137.95" + ], + [ + 1559849258663735, + "467.115" + ], + [ + 1559849558663735, + "658.8516666666667" + ] + ] + }, + { + "metric": { + "level": "warn" + }, + "values": [ + [ + 1559848958663735, + "137.27833333333334" + ], + [ + 1559849258663735, + "467.69" + ], + [ + 1559849558663735, + "660.6933333333334" + ] + ] + } + ] + } + ``` + + ```bash + curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query={job="varlogs"}' | jq + { + "resultType": "streams", + "result": [ + { + "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", + "entries": [ + { + "ts": "2019-06-06T19:25:41.972739Z", + "line": "foo" + }, + { + "ts": "2019-06-06T19:25:41.972722Z", + "line": "bar" + } + ] + } + ] + ``` + +- `GET /api/prom/query` + + For doing queries, accepts the following parameters in the query-string: + + - `query`: a [logQL query](../querying.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) + - `limit`: max number of entries to return + - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago. + - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. 
+ - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. + - `regexp`: a regex to filter the returned results + + Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, + so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional + load to the index server and make the query slower. + + > This endpoint will be deprecated in the future you should use `api/v1/query_range` instead. + > You can only query for logs, it doesn't accept [queries returning metrics](./usage.md#counting-logs). + + Responses looks like this: + + ```json + { + "streams": [ + { + "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", + "entries": [ + { + "ts": "2018-06-27T05:20:28.699492635Z", + "line": "..." + }, + ... + ] + }, + ... + ] + } + ``` -### `GET /api/prom/label` +- `GET /api/prom/label` -For doing label name queries, accepts the following parameters in the query-string: + For doing label name queries, accepts the following parameters in the query-string: -- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. -- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. + - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. + - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. -Responses looks like this: + Responses looks like this: -```json -{ - "values": [ - "instance", - "job", - ... - ] -} -``` + ```json + { + "values": [ + "instance", + "job", + ... 
+ ] + } + ``` -`GET /api/prom/label/<name>/values` +- `GET /api/prom/label/<name>/values` -For doing label values queries, accepts the following parameters in the query-string: + For doing label values queries, accepts the following parameters in the query-string: -- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. -- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. + - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. + - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. -Responses looks like this: + Responses looks like this: -```json -{ - "values": [ - "default", - "cortex-ops", - ... - ] -} -``` + ```json + { + "values": [ + "default", + "cortex-ops", + ... + ] + } + ``` -### `GET /ready` +- `GET /ready` -This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. + This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. -### `GET /flush` +- `GET /flush` -This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. + This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. -### `GET /metrics` +- `GET /metrics` -This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. + This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. ## Examples of using the API in a third-party client library -1. 
Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang).
-2. Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py)
+1) Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang).
+2) Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py)
diff --git a/docs/querying.md b/docs/querying.md
index ac25c12f..cfc7b429 100644
--- a/docs/querying.md
+++ b/docs/querying.md
@@ -1,7 +1,7 @@
 # Querying
 
 To get the previously ingested logs back from Loki for analysis, you need a
-client that supports LogQL.
+client that supports LogQL. Grafana will be the first choice for most users; nevertheless, [LogCLI](logcli.md) represents a viable standalone alternative.
@@ -111,3 +111,61 @@ The query language is still under development to support more features, e.g.,:
 - Number extraction for timeseries based on number in log messages
 - JSON accessors for filtering of JSON-structured logs
 - Context (like `grep -C n`)
+
+## Counting logs
+
+Loki's LogQL supports sample expressions, allowing you to count entries per stream after the regex filtering stage.
+
+### Range Vector aggregation
+
+The language shares the same [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept from Prometheus, except that the selected range of samples contains a value of one for each log entry. You can then apply an aggregation over the selected range to transform it into an instant vector.
+
+`rate` calculates the number of entries per second and `count_over_time` counts the entries for each log stream within the range.
+
+In this example, we count all the log lines we have recorded within the last 5min for the mysql job.
+
+> `count_over_time({job="mysql"}[5m])`
+
+A range vector aggregation can also be applied to a [Filter Expression](#filter-expression), allowing you to select only matching log entries.
+
+> `rate(({job="mysql"} |= "error" != "timeout")[10s])`
+
+The query above will compute the per-second rate of all errors except those containing `timeout` within the last 10 seconds.
+
+You can then use aggregation operators over the range vector aggregation.
+
+### Aggregation operators
+
+Like [PromQL](https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators), Loki's LogQL supports a subset of built-in aggregation operators that can be used to aggregate the elements of a single vector, resulting in a new vector of fewer elements with aggregated values:
+
+- `sum` (calculate sum over dimensions)
+- `min` (select minimum over dimensions)
+- `max` (select maximum over dimensions)
+- `avg` (calculate the average over dimensions)
+- `stddev` (calculate population standard deviation over dimensions)
+- `stdvar` (calculate population standard variance over dimensions)
+- `count` (count number of elements in the vector)
+- `bottomk` (smallest k elements by sample value)
+- `topk` (largest k elements by sample value)
+
+These operators can either be used to aggregate over all label dimensions or preserve distinct dimensions by including a `without` or `by` clause.
+
+> `<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]`
+
+`parameter` is only required for `topk` and `bottomk`. `without` removes the listed labels from the result vector, while all other labels are preserved in the output. `by` does the opposite and drops labels that are not listed in the `by` clause, even if their label values are identical between all elements of the vector.
+
+`topk` and `bottomk` are different from other aggregators in that a subset of the input samples, including the original labels, are returned in the result vector.
`by` and `without` are only used to bucket the input vector.
+
+#### Examples
+
+Get top 10 applications by highest log throughput:
+
+> `topk(10,sum(rate({region="us-east1"}[5m])) by (name))`
+
+Get the count of logs during the last 5 minutes by level:
+
+> `sum(count_over_time({job="mysql"}[5m])) by (level)`
+
+Get the rate of HTTP GET requests from nginx logs:
+
+> `avg(rate(({job="nginx"} |= "GET")[10s])) by (region)`
--
GitLab
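The metric queries above can be exercised through the `GET /api/v1/query_range` endpoint documented earlier. As a rough sketch of how its query-string parameters fit together, the following Python snippet assembles such a request URL, reusing the local address from the curl examples (`query_range_url` is an illustrative helper, not part of any Loki client library):

```python
from urllib.parse import urlencode

# Local Loki address used in the curl examples above; adjust for your deployment.
LOKI_BASE = "http://localhost:3100"

def query_range_url(query, start_ns, end_ns, step=1, limit=100, direction="backward"):
    """Build a GET URL for /api/v1/query_range from the documented query-string parameters."""
    params = {
        "query": query,           # a LogQL query
        "start": start_ns,        # nanosecond Unix epoch
        "end": end_ns,            # nanosecond Unix epoch
        "step": step,             # resolution step width in seconds
        "limit": limit,           # max entries to return (not used for metric queries)
        "direction": direction,   # "forward" or "backward"
    }
    return f"{LOKI_BASE}/api/v1/query_range?{urlencode(params)}"

url = query_range_url('sum(count_over_time({job="mysql"}[5m])) by (level)',
                      start_ns=1559848958663735000,
                      end_ns=1559849558663735000,
                      step=300)
print(url)
```

Fetching the resulting URL (e.g. with `curl`) should yield a `matrix` result, since this is a metric query rather than a log query.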