From dedb585bb5dcf64d5a700f126860d54a924d0722 Mon Sep 17 00:00:00 2001 From: sh0rez <me@shorez.de> Date: Mon, 5 Aug 2019 19:23:28 +0200 Subject: [PATCH] docs: general documentation rework - restructures the docs to make them easier to explore - rewrites promtail docs - unifies, shortens and extends docs --- docs/api.md | 291 ------------------ docs/{design => design-documents}/labels.md | 0 docs/index.md | 34 ++ docs/logcli.md | 35 ++- docs/loki/api.md | 114 +++++++ docs/{ => loki}/operations.md | 152 --------- docs/loki/storage.md | 158 ++++++++++ docs/promtail/api.md | 6 +- docs/promtail/configuration.md | 185 +++++++++++ docs/promtail/deployment.md | 150 +++++++++ docs/promtail/examples.md | 92 ++++++ docs/promtail/overview.md | 41 +++ .../parsing.md} | 2 +- docs/querying.md | 111 +++++++ docs/usage.md | 136 -------- mkdocs.yml | 10 + 16 files changed, 918 insertions(+), 599 deletions(-) delete mode 100644 docs/api.md rename docs/{design => design-documents}/labels.md (100%) create mode 100644 docs/index.md create mode 100644 docs/loki/api.md rename docs/{ => loki}/operations.md (53%) create mode 100644 docs/loki/storage.md create mode 100644 docs/promtail/configuration.md create mode 100644 docs/promtail/deployment.md create mode 100644 docs/promtail/examples.md create mode 100644 docs/promtail/overview.md rename docs/{logentry/processing-log-lines.md => promtail/parsing.md} (99%) create mode 100644 docs/querying.md delete mode 100644 docs/usage.md create mode 100644 mkdocs.yml diff --git a/docs/api.md b/docs/api.md deleted file mode 100644 index 4b000044..00000000 --- a/docs/api.md +++ /dev/null @@ -1,291 +0,0 @@ -# Loki API - -The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project): - -- `POST /api/prom/push` - - For sending log entries, expects a snappy compressed proto in the HTTP Body: - - - [ProtoBuffer definition](/pkg/logproto/logproto.proto) - - [Golang client library](/pkg/promtail/client/client.go) - - Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: - - ```json - { - "streams": [ - { - "labels": "{foo=\"bar\"}", - "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] - } - ] - } - - ``` - -- `GET /api/v1/query` - - For doing instant queries at a single point in time, accepts the following parameters in the query-string: - - - `query`: a logQL query - - `limit`: max number of entries to return (not used for metric queries) - - `time`: the evaluation time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. 
- - Responses looks like this: - - ```json - { - "resultType": "vector" | "streams", - "result": <value> - } - ``` - - Examples: - - ```bash - $ curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' | jq - { - "resultType": "vector", - "result": [ - { - "metric": {}, - "value": [ - 1559848867745737, - "1267.1266666666666" - ] - }, - { - "metric": { - "level": "warn" - }, - "value": [ - 1559848867745737, - "37.77166666666667" - ] - }, - { - "metric": { - "level": "info" - }, - "value": [ - 1559848867745737, - "37.69" - ] - } - ] - } - ``` - - ```bash - curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query={job="varlogs"}' | jq - { - "resultType": "streams", - "result": [ - { - "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", - "entries": [ - { - "ts": "2019-06-06T19:25:41.972739Z", - "line": "foo" - }, - { - "ts": "2019-06-06T19:25:41.972722Z", - "line": "bar" - } - ] - } - ] - ``` - -- `GET /api/v1/query_range` - - For doing queries over a range of time, accepts the following parameters in the query-string: - - - `query`: a logQL query - - `limit`: max number of entries to return (not used for metric queries) - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always one hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. - - `step`: query resolution step width in seconds. Default 1 second. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. - - Responses looks like this: - - ```json - { - "resultType": "matrix" | "streams", - "result": <value> - } - ``` - - Examples: - - ```bash - $ curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' --data-urlencode 'step=300' | jq - { - "resultType": "matrix", - "result": [ - { - "metric": { - "level": "info" - }, - "values": [ - [ - 1559848958663735, - "137.95" - ], - [ - 1559849258663735, - "467.115" - ], - [ - 1559849558663735, - "658.8516666666667" - ] - ] - }, - { - "metric": { - "level": "warn" - }, - "values": [ - [ - 1559848958663735, - "137.27833333333334" - ], - [ - 1559849258663735, - "467.69" - ], - [ - 1559849558663735, - "660.6933333333334" - ] - ] - } - ] - } - ``` - - ```bash - curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query={job="varlogs"}' | jq - { - "resultType": "streams", - "result": [ - { - "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", - "entries": [ - { - "ts": "2019-06-06T19:25:41.972739Z", - "line": "foo" - }, - { - "ts": "2019-06-06T19:25:41.972722Z", - "line": "bar" - } - ] - } - ] - ``` - -- `GET /api/prom/query` - - For doing queries, accepts the following parameters in the query-string: - - - `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) - - `limit`: max number of entries to return - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). 
Default is always one hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - `regexp`: a regex to filter the returned results - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. - - > This endpoint will be deprecated in the future you should use `api/v1/query_range` instead. - > You can only query for logs, it doesn't accept [queries returning metrics](./usage.md#counting-logs). - - Responses looks like this: - - ```json - { - "streams": [ - { - "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", - "entries": [ - { - "ts": "2018-06-27T05:20:28.699492635Z", - "line": "..." - }, - ... - ] - }, - ... - ] - } - ``` - -- `GET /api/prom/label` - - For doing label name queries, accepts the following parameters in the query-string: - - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. - - Responses looks like this: - - ```json - { - "values": [ - "instance", - "job", - ... - ] - } - ``` - -- `GET /api/prom/label/<name>/values` - - For doing label values queries, accepts the following parameters in the query-string: - - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. - - Responses looks like this: - - ```json - { - "values": [ - "default", - "cortex-ops", - ... - ] - } - ``` - -- `GET /ready` - - This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. - -- `GET /flush` - - This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. - -- `GET /metrics` - - This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. - - -## Examples of using the API in a third-party client library - -1) Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang). -2) Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py) diff --git a/docs/design/labels.md b/docs/design-documents/labels.md similarity index 100% rename from docs/design/labels.md rename to docs/design-documents/labels.md diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 00000000..2ee9a915 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,34 @@ +<p align="center"> + <img src="logo_and_name.png" alt="Loki Logo"> <br> + <small>Like Prometheus, but for logs!</small> +</p> + +Grafana Loki is a set of components, that can be composed into a fully featured logging stack. + +It builds around the idea of treating a single log line as-is. 
This means that +instead of full-text indexing them, related logs are grouped using the same labels +as in Prometheus. This is much more efficient and scales better. + +## Components +- **[Loki](loki/overview.md)**: The main server component is called Loki. It is responsible for + permanently storing the logs it is being shipped and it executes the LogQL + queries from clients. + Loki shares its high-level architecture with Cortex, a highly scalable + Prometheus backend. +- **[Promtail](promtail/overview.md)**: To ship logs to a central place, an agent is required. Promtail + is deployed to every node that should be monitored and sends the logs to Loki. + It also does important task of pre-processing the log lines, including + attaching labels to them for easier querying. +- *Grafana*: The *Explore* feature of Grafana 6.0+ is the primary place of + contact between a human and Loki. It is used for discovering and analyzing logs. + +Alongside these main components, there are some other ones as well: + +- **[LogCLI](logcli.md)**: A command line interface to query logs and labels from Loki +- **Canary**: An audit utility to analyze the log-capturing performance of Loki. + Ingests data into Loki and immediately reads it back to check for latency and loss. +- **Docker Driver**: A Docker [log driver](https://docs.docker.com/config/containers/logging/configure/) to ship logs captured by Docker + directly to Loki, without the need of an agent. +- **Fluentd Plugin**: An Fluentd [output + plugin](https://docs.fluentd.org/output), to use Fluentd for shipping logs + into Loki diff --git a/docs/logcli.md b/docs/logcli.md index 4408d894..4a137365 100644 --- a/docs/logcli.md +++ b/docs/logcli.md @@ -1,23 +1,25 @@ -# Log CLI usage Instructions +# LogCLI -Loki's main query interface is Grafana; however, a basic CLI is provided as a proof of concept. - -Once you have Loki running in a cluster, you can query logs from that cluster. +LogCLI is a handy tool to query logs from Loki without having to run a full Grafana instance. ## Installation -### Get latest version +### Binary (Recommended) +Head over to the [Releases](https://github.com/grafana/loki/releases) and download the `logcli` binary for your OS: +```bash +# download a binary (adapt app, os and arch as needed) +# installs v0.2.0. For up to date URLs refer to the release's description +$ curl -fSL -o "/usr/local/bin/logcli.gz" "https://github.com/grafana/logcli/releases/download/v0.2.0/logcli-linux-amd64.gz" +$ gunzip "/usr/local/bin/logcli.gz" -``` -$ go get github.com/grafana/loki/cmd/logcli +# make sure it is executable +$ chmod a+x "/usr/local/bin/logcli" ``` -### Build from source +### From source ``` -$ go get github.com/grafana/loki -$ cd $GOPATH/src/github.com/grafana/loki -$ go build ./cmd/logcli +$ go get github.com/grafana/loki/cmd/logcli ``` Now `logcli` is in your current directory. @@ -36,14 +38,15 @@ Otherwise, when running e.g. [locally](https://github.com/grafana/loki/tree/mast ``` $ export GRAFANA_ADDR=http://localhost:3100 ``` -> Note: If you are running loki behind a proxy server and have an authentication setup. You will have to pass URL, username and password accordingly. Please refer to the [docs](https://github.com/adityacs/loki/blob/master/docs/operations.md) for more info. +> Note: If you are running loki behind a proxy server and have an authentication setup, you will have to pass URL, username and password accordingly. Please refer to [Authentication](loki/operations.md#authentication) for more info. 
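+
+For example, if Loki sits behind basic auth, the credentials can be supplied next to the
+address. The exact flag and variable names depend on your `logcli` version (check
+`logcli help`); a sketch assuming the `GRAFANA_USERNAME` / `GRAFANA_PASSWORD`
+environment variables:
+
+```bash
+# hypothetical endpoint and credentials - adjust to your proxy setup
+$ export GRAFANA_ADDR=https://loki.example.com
+$ export GRAFANA_USERNAME=loki
+$ export GRAFANA_PASSWORD=my-secret-password
+```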
-``` +```bash $ logcli labels job https://logs-dev-ops-tools1.grafana.net/api/prom/label/job/values cortex-ops/consul cortex-ops/cortex-gw ... + $ logcli query '{job="cortex-ops/consul"}' https://logs-dev-ops-tools1.grafana.net/api/prom/query?query=%7Bjob%3D%22cortex-ops%2Fconsul%22%7D&limit=30&start=1529928228&end=1529931828&direction=backward®exp= Common labels: {job="cortex-ops/consul", namespace="cortex-ops"} @@ -55,14 +58,14 @@ Common labels: {job="cortex-ops/consul", namespace="cortex-ops"} Configuration values are considered in the following order (lowest to highest): -- environment value -- command line +- Environment variables +- Command line flags The URLs of the requests are printed to help with integration work. ### Details -```console +```bash $ logcli help usage: logcli [<flags>] <command> [<args> ...] diff --git a/docs/loki/api.md b/docs/loki/api.md new file mode 100644 index 00000000..db116017 --- /dev/null +++ b/docs/loki/api.md @@ -0,0 +1,114 @@ +# API + +The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project): + +### `POST /api/prom/push` + +For sending log entries, expects a snappy compressed proto in the HTTP Body: + +- [ProtoBuffer definition](/pkg/logproto/logproto.proto) +- [Golang client library](/pkg/promtail/client/client.go) + +Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: + +```json +{ + "streams": [ + { + "labels": "{foo=\"bar\"}", + "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] + } + ] +} +``` + +### `GET /api/prom/query` + +For doing queries, accepts the following parameters in the query-string: + +- `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) +- `limit`: max number of entries to return +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. +- `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. +- `regexp`: a regex to filter the returned results + +Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, +so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional +load to the index server and make the query slower. + +Responses looks like this: + +```json +{ + "streams": [ + { + "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", + "entries": [ + { + "ts": "2018-06-27T05:20:28.699492635Z", + "line": "..." + }, + ... + ] + }, + ... + ] +} +``` + +### `GET /api/prom/label` + +For doing label name queries, accepts the following parameters in the query-string: + +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. + +Responses looks like this: + +```json +{ + "values": [ + "instance", + "job", + ... 
+ ] +} +``` + +`GET /api/prom/label/<name>/values` + +For doing label values queries, accepts the following parameters in the query-string: + +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. + +Responses looks like this: + +```json +{ + "values": [ + "default", + "cortex-ops", + ... + ] +} +``` + +### `GET /ready` + +This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. + +### `GET /flush` + +This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. + +### `GET /metrics` + +This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. + + +## Examples of using the API in a third-party client library + +1. Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang). +2. Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py) diff --git a/docs/operations.md b/docs/loki/operations.md similarity index 53% rename from docs/operations.md rename to docs/loki/operations.md index d6807745..11938e8a 100644 --- a/docs/operations.md +++ b/docs/loki/operations.md @@ -86,155 +86,3 @@ When scaling Loki, consider running several Loki processes with their respective Take a look at their respective `.libsonnet` files in [our production setup](../production/ksonnet/loki) to get an idea about resource usage. We're happy to get feedback about your resource usage. - -## Storage - -Loki needs two stores: an index store and a chunk store. -Loki receives logs in separate streams. -Each stream is identified by a set of labels. -As the log entries from a stream arrive, they are gzipped as chunks and saved in the chunks store. -The index then stores the stream's label set, and links them to the chunks. -The chunk format refer to [doc](../pkg/chunkenc/README.md) - -### Local storage - -By default, Loki stores everything on disk. -The index is stored in a BoltDB under `/tmp/loki/index`. -The chunks are stored under `/tmp/loki/chunks`. - -### Google Cloud Storage - -Loki has support for Google Cloud storage. -Take a look at our [production setup](https://github.com/grafana/loki/blob/a422f394bb4660c98f7d692e16c3cc28747b7abd/production/ksonnet/loki/config.libsonnet#L55) for the relevant configuration fields. - -### Cassandra - -Loki can use Cassandra for the index storage. Please pull the **latest** Loki docker image or build from **latest** source code. 
Example config for using Cassandra: - -```yaml -schema_config: - configs: - - from: 2018-04-15 - store: cassandra - object_store: filesystem - schema: v9 - index: - prefix: cassandra_table - period: 168h - -storage_config: - cassandra: - username: cassandra - password: cassandra - addresses: 127.0.0.1 - auth: true - keyspace: lokiindex - - filesystem: - directory: /tmp/loki/chunks -``` - -### AWS S3 & DynamoDB - -Example config for using S3 & DynamoDB: - -```yaml -schema_config: - configs: - - from: 2018-04-15 - store: aws - object_store: s3 - schema: v9 - index: - prefix: dynamodb_table_name - period: 0 -storage_config: - aws: - s3: s3://access_key:secret_access_key@region/bucket_name - dynamodbconfig: - dynamodb: dynamodb://access_key:secret_access_key@region -``` - -You can also use an EC2 instance role instead of hard coding credentials like in the above example. -If you wish to do this the storage_config example looks like this: - -```yaml -storage_config: - aws: - s3: s3://region/bucket_name - dynamodbconfig: - dynamodb: dynamodb://region -``` - -#### S3 - -Loki is using S3 as object storage. It stores log within directories based on -[`OrgID`](./operations.md#Multi-tenancy). For example, Logs from org `faker` -will stored in `s3://BUCKET_NAME/faker/`. - -The S3 configuration is setup with url format: `s3://access_key:secret_access_key@region/bucket_name`. - -For custom S3 endpoint (like Ceph Object Storage with S3 Compatible API), if it's using path-style url rather than -virtual hosted bucket addressing, please set config like below: - -```yaml -storage_config: - aws: - s3: s3://access_key:secret_access_key@custom_endpoint/bucket_name - s3forcepathstyle: true -``` - -To write to S3, Loki will require the following permissions on the bucket: - -* s3:ListBucket -* s3:PutObject -* s3:GetObject - -#### DynamoDB - -Loki uses DynamoDB for the index storage. It is used for querying logs, make -sure you adjust your throughput to your usage. - -DynamoDB access is very similar to S3, however you do not need to specify a -table name in the storage section, as Loki will calculate that for you. -You will need to set the table name prefix inside schema config section, -and ensure the `index.prefix` table exists. - -You can setup DynamoDB by yourself, or have `table-manager` setup for you. -You can find out more info about table manager at -[Cortex project](https://github.com/cortexproject/cortex). -There is an example table manager deployment inside the ksonnet deployment method. You can find it [here](../production/ksonnet/loki/table-manager.libsonnet) -The table-manager allows deleting old indices by rotating a number of different dynamodb tables and deleting the oldest one. If you choose to -create the table manually you cannot easily erase old data and your index just grows indefinitely. - -If you set your DynamoDB table manually, ensure you set the primary index key to `h` -(string) and use `r` (binary) as the sort key. Also set the "period" attribute in the yaml to zero. -Make sure adjust your throughput base on your usage. - -DynamoDB's table manager client defaults provisioning capacity units read to 300 and writes to 3000. 
-If you wish to override these defaults the config section should include: - -```yaml -table_manager: - index_tables_provisioning: - provisioned_write_throughput: 10 - provisioned_read_throughput: 10 - chunk_tables_provisioning: - provisioned_write_throughput: 10 - provisioned_read_throughput: 10 -``` - -For DynamoDB, Loki will require the following permissions on the table: - -* dynamodb:BatchGetItem -* dynamodb:BatchWriteItem -* dynamodb:DeleteItem -* dynamodb:DescribeTable -* dynamodb:GetItem -* dynamodb:ListTagsOfResource -* dynamodb:PutItem -* dynamodb:Query -* dynamodb:TagResource -* dynamodb:UntagResource -* dynamodb:UpdateItem -* dynamodb:UpdateTable diff --git a/docs/loki/storage.md b/docs/loki/storage.md new file mode 100644 index 00000000..8bfcc520 --- /dev/null +++ b/docs/loki/storage.md @@ -0,0 +1,158 @@ +# Storage + +Loki needs to store two different types of data: **Chunks** and **Indexes**. + +Loki receives logs in separate streams. Each stream is identified by a set of labels. +As the log entries from a stream arrive, they are gzipped as chunks and saved in +the chunks store. The chunk format is documented in [`pkg/chunkenc`](../pkg/chunkenc/README.md). + +On the other hand, the index stores the stream's label set and links them to the +individual chunks. + +### Local storage + +By default, Loki stores everything on disk. The index is stored in a BoltDB under +`/tmp/loki/index` and the chunks are stored under `/tmp/loki/chunks`. + +### Google Cloud Storage + +Loki supports Google Cloud Storage. Refer to Grafana Labs' +[production setup](https://github.com/grafana/loki/blob/a422f394bb4660c98f7d692e16c3cc28747b7abd/production/ksonnet/loki/config.libsonnet#L55) +for the relevant configuration fields. + +### Cassandra + +Loki can use Cassandra for the index storage. Example config using Cassandra: + +```yaml +schema_config: + configs: + - from: 2018-04-15 + store: cassandra + object_store: filesystem + schema: v9 + index: + prefix: cassandra_table + period: 168h + +storage_config: + cassandra: + username: cassandra + password: cassandra + addresses: 127.0.0.1 + auth: true + keyspace: lokiindex + + filesystem: + directory: /tmp/loki/chunks +``` + +### AWS S3 & DynamoDB + +Example config for using S3 & DynamoDB: + +```yaml +schema_config: + configs: + - from: 0 + store: dynamo + object_store: s3 + schema: v9 + index: + prefix: dynamodb_table_name + period: 0 +storage_config: + aws: + s3: s3://access_key:secret_access_key@region/bucket_name + dynamodbconfig: + dynamodb: dynamodb://access_key:secret_access_key@region +``` + +If you don't wish to hard-code S3 credentials, you can also configure an +EC2 instance role by changing the `storage_config` section: + +```yaml +storage_config: + aws: + s3: s3://region/bucket_name + dynamodbconfig: + dynamodb: dynamodb://region +``` + +#### S3 + +Loki can use S3 as object storage, storing logs within directories based on +the [OrgID](./operations.md#Multi-tenancy). For example, logs from the `faker` +org will be stored in `s3://BUCKET_NAME/faker/`. + +The S3 configuration is set up using the URL format: +`s3://access_key:secret_access_key@region/bucket_name`. + +S3-compatible APIs (e.g., Ceph Object Storage with an S3-compatible API) can +be used. 
If the API supports path-style URL rather than virtual hosted bucket +addressing, configure the URL in `storage_config` with the custom endpoint: + +```yaml +storage_config: + aws: + s3: s3://access_key:secret_access_key@custom_endpoint/bucket_name + s3forcepathstyle: true +``` + +Loki needs the following permissions to write to an S3 bucket: + +* s3:ListBucket +* s3:PutObject +* s3:GetObject + +#### DynamoDB + +Loki can use DynamoDB for storing the index. The index is used for querying +logs. Throughput to the index should be adjusted to your usage. + +Access to DynamoDB is very similar to S3; however, a table name does not +need to be specified in the storage section, as Loki calculates that for +you. The table name prefix will need to be configured inside `schema_config` +for Loki to be able to create new tables. + +DynamoDB can be set up manually or automatically through `table-manager`. +The `table-manager` allows deleting old indices by rotating a number of +different DynamoDB tables and deleting the oldest one. An example deployment +of the `table-manager` using ksonnet can be found +[here](../production/ksonnet/loki/table-manager.libsonnet) and more information +about it can be find at the +[Cortex project](https://github.com/cortexproject/cortex). + +DynamoDB's `table-manager` client defaults provisioning capacity units +read to 300 and writes to 3000. The defaults can be overwritten in the +config: + +```yaml +table_manager: + index_tables_provisioning: + provisioned_write_throughput: 10 + provisioned_read_throughput: 10 + chunk_tables_provisioning: + provisioned_write_throughput: 10 + provisioned_read_throughput: 10 +``` + +If DynamoDB is set up manually, old data cannot be easily erased and the index +will grow indefinitely. Manual configurations should ensure that the primary +index key is set to `h` (string) and the sort key is set to `r` (binary). The +"period" attribute in the yaml should be set to zero. + +Loki needs the following permissions to write to DynamoDB: + +* dynamodb:BatchGetItem +* dynamodb:BatchWriteItem +* dynamodb:DeleteItem +* dynamodb:DescribeTable +* dynamodb:GetItem +* dynamodb:ListTagsOfResource +* dynamodb:PutItem +* dynamodb:Query +* dynamodb:TagResource +* dynamodb:UntagResource +* dynamodb:UpdateItem +* dynamodb:UpdateTable diff --git a/docs/promtail/api.md b/docs/promtail/api.md index 2d1cf892..a60d3b66 100644 --- a/docs/promtail/api.md +++ b/docs/promtail/api.md @@ -1,12 +1,12 @@ -# Promtail API +# API Promtail features an embedded web server exposing a web console at `/` and the following API endpoints: -- `GET /ready` +### `GET /ready` This endpoint returns 200 when Promtail is up and running, and there's at least one working target. -- `GET /metrics` +### `GET /metrics` This endpoint returns Promtail metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. diff --git a/docs/promtail/configuration.md b/docs/promtail/configuration.md new file mode 100644 index 00000000..e86a9a8c --- /dev/null +++ b/docs/promtail/configuration.md @@ -0,0 +1,185 @@ +# Configuration + +## `scrape_configs` (Target Discovery) +The way how Promtail finds out the log locations and extracts the set of labels +is by using the `scrape_configs` section in the `promtail.yaml` configuration +file. The syntax is equal to what [Prometheus +uses](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config). 
+
+The `scrape_configs` section contains one or more *entries*, which are all executed for
+each discovered target (i.e. each container in each new pod running on the instance):
+```yaml
+scrape_configs:
+  - job_name: local
+    static_configs:
+      - ...
+
+  - job_name: kubernetes
+    kubernetes_sd_configs:
+      - ...
+```
+
+If more than one entry matches your logs, you will get duplicates, as the logs are
+sent in more than one stream, likely with slightly different labels.
+
+There are different types of labels present in Promtail:
+
+* Labels starting with `__` (two underscores) are internal labels. They usually
+  come from dynamic sources like service discovery. Once relabeling is done,
+  they are removed from the label set. To persist them, rename them to
+  something not starting with `__`.
+* Labels starting with `__meta_kubernetes_pod_label_*` are "meta labels" which
+  are generated based on your Kubernetes pod labels.
+  Example: If your Kubernetes pod has a label `name` set to `foobar`, then the
+  `scrape_configs` section will have a label `__meta_kubernetes_pod_label_name`
+  with the value set to `foobar`.
+* There are other `__meta_kubernetes_*` labels based on the Kubernetes
+  metadata, such as the namespace the pod is running in
+  (`__meta_kubernetes_namespace`) or the name of the container inside the pod
+  (`__meta_kubernetes_pod_container_name`). Refer to [the Prometheus
+  docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config)
+  for the full list.
+* The label `__path__` is a special label which Promtail uses afterwards to
+  figure out where the file to be read is located. Wildcards are allowed.
+* The label `filename` is added for every file found in `__path__` to ensure
+  the uniqueness of the streams. It contains the absolute path of the file the
+  line was read from.
+
+## `relabel_configs` (Relabeling)
+The most important part of each entry is the `relabel_configs` stanza, which is a list
+of operations to create, rename, or modify labels.
+
+A single `scrape_config` can also reject logs by doing an `action: drop` if a label value
+matches a specified regex, which means that this particular `scrape_config` will
+not forward logs from a particular log source.
+This does not mean that other `scrape_config`s will not forward them, though.
+
+Many of the `scrape_configs` read labels from `__meta_kubernetes_*` meta-labels,
+assign them to intermediate labels such as `__service__` based on
+different logic, possibly drop the processing if the `__service__` was empty
+and finally set visible labels (such as `job`) based on the `__service__`
+label.
+
+In general, all of the default Promtail `scrape_configs` do the following:
+
+ * They read pod logs from under `/var/log/pods/$1/*.log`.
+ * They set the `namespace` label directly from `__meta_kubernetes_namespace`.
+ * They expect to see your pod name in the `name` label + * They set a `job` label which is roughly `namespace/job` + +#### Examples + +* Drop the processing if a label is empty: +```yaml + - action: drop + regex: ^$ + source_labels: + - __service__ +``` +* Drop the processing if any of these labels contains a value: +```yaml + - action: drop + regex: .+ + separator: '' + source_labels: + - __meta_kubernetes_pod_label_name + - __meta_kubernetes_pod_label_app +``` +* Rename a metadata label into another so that it will be visible in the final log stream: +```yaml + - action: replace + source_labels: + - __meta_kubernetes_namespace + target_label: namespace +``` +* Convert all of the Kubernetes pod labels into visible labels: +```yaml + - action: labelmap + regex: __meta_kubernetes_pod_label_(.+) +``` + + +Additional reading: + + * [Julien Pivotto's slides from PromConf Munich, 2017](https://www.slideshare.net/roidelapluie/taking-advantage-of-prometheus-relabeling-109483749) + +## `client_option` (HTTP Client) +Promtail uses the Prometheus HTTP client implementation for all calls to Loki. +Therefore, you can configure it using the `client` stanza: +```yaml +client: [ <client_option> ] +``` + +Reference for `client_option`: +```yaml +# Sets the `url` of loki api push endpoint +url: http[s]://<host>:<port>/api/prom/push + +# Sets the `Authorization` header on every promtail request with the +# configured username and password. +# password and password_file are mutually exclusive. +basic_auth: + username: <string> + password: <secret> + password_file: <string> + +# Sets the `Authorization` header on every promtail request with +# the configured bearer token. It is mutually exclusive with `bearer_token_file`. +bearer_token: <secret> + +# Sets the `Authorization` header on every promtail request with the bearer token +# read from the configured file. It is mutually exclusive with `bearer_token`. +bearer_token_file: /path/to/bearer/token/file + +# Configures the promtail request's TLS settings. +tls_config: + # CA certificate to validate API server certificate with. + # If not provided Trusted CA from sytem will be used. + ca_file: <filename> + + # Certificate and key files for client cert authentication to the server. + cert_file: <filename> + key_file: <filename> + + # ServerName extension to indicate the name of the server. + # https://tools.ietf.org/html/rfc4366#section-3.1 + server_name: <string> + + # Disable validation of the server certificate. + insecure_skip_verify: <boolean> + +# Optional proxy URL. +proxy_url: <string> + +# Maximum wait period before sending batch +batchwait: 1s + +# Maximum batch size to accrue before sending, unit is byte +batchsize: 102400 + +# Maximum time to wait for server to respond to a request +timeout: 10s + +backoff_config: + # Initial backoff time between retries + minbackoff: 100ms + # Maximum backoff time between retries + maxbackoff: 5s + # Maximum number of retires when sending batches, 0 means infinite retries + maxretries: 5 + +# The labels to add to any time series or alerts when communicating with loki +external_labels: {} +``` + +#### Ship to multiple Loki Servers +Promtail is able to push logs to as many different Loki servers as you like. 
Use
+`clients` instead of `client` if needed:
+```yaml
+# Single Loki
+client: [ <client_option> ]
+
+# Multiple Loki instances
+clients:
+  - [ <client_option> ]
+```
diff --git a/docs/promtail/deployment.md b/docs/promtail/deployment.md
new file mode 100644
index 00000000..4a95805e
--- /dev/null
+++ b/docs/promtail/deployment.md
@@ -0,0 +1,150 @@
+# Installation
+Promtail is distributed in binary and in container form.
+
+Once it is installed, there are basically two options for operating it:
+either as a daemon sitting on every node, or as a sidecar for the application.
+
+Which mode to choose usually only depends on the configuration.
+## Binary
+Every release includes binaries:
+
+```bash
+# download a binary (adapt app, os and arch as needed)
+# installs v0.2.0. Go to the releases page for up to date URLs
+$ curl -fSL -o "/usr/local/bin/promtail.gz" "https://github.com/grafana/loki/releases/download/v0.2.0/promtail-linux-amd64.gz"
+$ gunzip "/usr/local/bin/promtail.gz"
+
+# make sure it is executable
+$ chmod a+x "/usr/local/bin/promtail"
+```
+
+Binaries for macOS and Windows are also provided at the [releases page](https://github.com/grafana/loki/releases).
+
+## Docker
+```bash
+# adapt tag to most recent version
+$ docker pull grafana/promtail:v0.2.0
+```
+
+## Kubernetes
+On Kubernetes, you will use the Docker container above. However, you have to
+choose beforehand whether you want to run it in daemon mode (`DaemonSet`) or in
+sidecar mode (as an additional container in the `Pod`).
+### DaemonSet method (Recommended)
+
+A `DaemonSet` will deploy `promtail` on every node within the Kubernetes cluster.
+
+This deployment is great for collecting the logs of all containers within the
+cluster. It is the best solution for a single tenant.
+
+```yaml
+---Daemonset.yaml
+apiVersion: extensions/v1beta1
+kind: DaemonSet
+metadata:
+  name: promtail-daemonset
+  ...
+spec:
+  ...
+  template:
+    spec:
+      serviceAccount: SERVICE_ACCOUNT
+      serviceAccountName: SERVICE_ACCOUNT
+      volumes:
+      - name: logs
+        hostPath: HOST_PATH
+      - name: promtail-config
+        configMap:
+          name: promtail-configmap
+      containers:
+      - name: promtail-container
+        args:
+        - -config.file=/etc/promtail/promtail.yaml
+        volumeMounts:
+        - name: logs
+          mountPath: MOUNT_PATH
+        - name: promtail-config
+          mountPath: /etc/promtail
+  ...
+
+---configmap.yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: promtail-config
+  ...
+data:
+  promtail.yaml: YOUR CONFIG
+
+---Clusterrole.yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: promtail-clusterrole
+rules:
+  - apiGroups: [""]
+    resources:
+    - nodes
+    - services
+    - pods
+    verbs:
+    - get
+    - watch
+    - list
+
+---ServiceAccount.yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: promtail-serviceaccount
+
+---Rolebinding.yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: promtail-clusterrolebinding
+subjects:
+  - kind: ServiceAccount
+    name: promtail-serviceaccount
+roleRef:
+  kind: ClusterRole
+  name: promtail-clusterrole
+  apiGroup: rbac.authorization.k8s.io
+```
+
+### Sidecar Method
+This method will deploy `promtail` as a sidecar container within a pod.
+In a multi-tenant environment, this enables teams to aggregate logs
+for specific pods and deployments, for example for all pods in a namespace.
+
+```yaml
+---Deployment.yaml
+apiVersion: extensions/v1beta1
+kind: Deployment
+metadata:
+  name: my-test-app
+  ...
+spec:
+  ...
+  template:
+    spec:
+      serviceAccount: SERVICE_ACCOUNT
+      serviceAccountName: SERVICE_ACCOUNT
+      volumes:
+      - name: logs
+        hostPath: HOST_PATH
+      - name: promtail-config
+        configMap:
+          name: promtail-configmap
+      containers:
+      - name: promtail-container
+        args:
+        - -config.file=/etc/promtail/promtail.yaml
+        volumeMounts:
+        - name: logs
+          mountPath: MOUNT_PATH
+        - name: promtail-config
+          mountPath: /etc/promtail
+  ...
+  ...
+
+```
diff --git a/docs/promtail/examples.md b/docs/promtail/examples.md
new file mode 100644
index 00000000..110515cb
--- /dev/null
+++ b/docs/promtail/examples.md
@@ -0,0 +1,92 @@
+# Examples
+
+This document shows some example use-cases for promtail and their configuration.
+
+## Local Config
+Using this configuration, all files in `/var/log` and `/srv/log/someone_service` are ingested into Loki.
+The labels `job` and `host` are set using `static_configs`.
+
+When using this configuration with Docker, do not forget to mount the configuration, `/var/log` and `/srv/log/someone_service` using [volumes](https://docs.docker.com/storage/volumes/).
+
+```yaml
+server:
+  http_listen_port: 9080
+  grpc_listen_port: 0
+
+positions:
+  filename: /tmp/positions.yaml # progress of the individual files
+
+client:
+  url: http://ip_or_hostname_where_loki_runs:3100/api/prom/push
+
+scrape_configs:
+ - job_name: system
+   pipeline_stages:
+   - docker: # Docker wraps logs in json. Undo this.
+   static_configs: # running locally here, no need for service discovery
+   - targets:
+      - localhost
+     labels:
+      job: varlogs
+      host: yourhost
+      __path__: /var/log/*.log # tail all files under /var/log
+
+ - job_name: someone_service
+   pipeline_stages:
+   - docker: # Docker wraps logs in json. Undo this.
+   static_configs: # running locally here, no need for service discovery
+   - targets:
+      - localhost
+     labels:
+      job: someone_service
+      host: yourhost
+      __path__: /srv/log/someone_service/*.log # tail all files under /srv/log/someone_service
+
+```
+
+## Systemd Journal
+This example shows how to ship the `systemd` journal to Loki.
+
+Just like the previous example, the `scrape_configs` section holds various
+jobs for parsing logs. A job with a `journal` key configures it for systemd
+journal reading.
+
+`path` is an optional string specifying the path to read journal entries
+from. If unspecified, defaults to the system default (`/var/log/journal`).
+
+`labels` is a map of string values specifying labels that should always
+be associated with each log entry being read from the systemd journal.
+In our example, each log will have a label of `job=systemd-journal`.
+
+Every field written to the systemd journal is available for processing
+in the `relabel_configs` section. Label names are converted to lowercase
+and prefixed with `__journal_`. After `relabel_configs` processes all
+labels for a job entry, any label starting with `__` is deleted.
+
+Our example renames the `_SYSTEMD_UNIT` label (available as
+`__journal__systemd_unit` in promtail) to `unit` so it will be available
+in Loki. All other labels from the journal entry are dropped.
+
+When running using Docker, **remember to bind the journal into the container**.
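+
+For instance, a sketch of such an invocation (the image tag, config path and
+journal location are assumptions; adjust them to your system, and some setups
+also need `/etc/machine-id` mounted so promtail can resolve journal entries):
+
+```bash
+# bind the host journal and the promtail config into the container
+$ docker run \
+    -v /var/log/journal:/var/log/journal \
+    -v /etc/machine-id:/etc/machine-id:ro \
+    -v $(pwd)/promtail.yaml:/etc/promtail/promtail.yaml \
+    grafana/promtail:v0.2.0 \
+    -config.file=/etc/promtail/promtail.yaml
+```
+
+The promtail configuration used for this example: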
+
+```yaml
+server:
+  http_listen_port: 9080
+  grpc_listen_port: 0
+
+positions:
+  filename: /tmp/positions.yaml
+
+clients:
+  - url: http://ip_or_hostname_where_loki_runs:3100/api/prom/push
+
+scrape_configs:
+  - job_name: journal
+    journal:
+      path: /var/log/journal
+      labels:
+        job: systemd-journal
+    relabel_configs:
+      - source_labels: ['__journal__systemd_unit']
+        target_label: 'unit'
+```
diff --git a/docs/promtail/overview.md b/docs/promtail/overview.md
new file mode 100644
index 00000000..e6b55305
--- /dev/null
+++ b/docs/promtail/overview.md
@@ -0,0 +1,41 @@
+# Overview
+Promtail is an agent which ships the contents of local log files to Loki. It is
+usually deployed to every machine that runs applications which need to be monitored.
+
+It primarily **discovers** targets, attaches **labels** to log streams and
+**pushes** them to the Loki instance.
+
+### Discovery
+Before Promtail is able to ship anything to Loki, it needs to find out about its
+environment. This specifically means discovering applications emitting log lines
+that need to be monitored.
+
+Promtail borrows the [service discovery mechanism from
+Prometheus](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config),
+although it currently only supports `static` and `kubernetes` service discovery.
+This is due to the fact that `promtail` is deployed as a daemon to every local
+machine and does not need to discover labels from other systems. `kubernetes`
+service discovery fetches required labels from the API server, while `static`
+usually covers the other use cases.
+
+Just like Prometheus, `promtail` is configured using a `scrape_configs` stanza.
+`relabel_configs` allows fine-grained control of what to ingest, what to drop
+and the final metadata attached to the log line. Refer to the
+[configuration](configuration.md) for more details.
+
+### Labeling and Parsing
+During service discovery, metadata is determined (pod name, filename, etc.) that
+may be attached to the log line as a label for easier identification afterwards.
+Using `relabel_configs`, those discovered labels can be mutated into the form
+they should have for querying.
+
+To allow more sophisticated filtering afterwards, Promtail allows setting labels
+not only from service discovery, but also based on the contents of the log
+lines. The so-called `pipeline_stages` can be used to add or update labels,
+correct the timestamp or rewrite the log line entirely. Refer to the [log
+parsing documentation](parsing.md) for more details.
+
+### Shipping
+Once Promtail is certain about what to ingest and all labels are set correctly,
+it starts *tailing* (continuously reading) the log files from the applications.
+Once enough data is read into memory, it is flushed as a batch to Loki.
diff --git a/docs/logentry/processing-log-lines.md b/docs/promtail/parsing.md
similarity index 99%
rename from docs/logentry/processing-log-lines.md
rename to docs/promtail/parsing.md
index 8b5c0ca3..1b876847 100644
--- a/docs/logentry/processing-log-lines.md
+++ b/docs/promtail/parsing.md
@@ -1,4 +1,4 @@
-# Processing Log Lines
+# Log Parsing
 
 A detailed look at how to setup promtail to process your log lines, including extracting metrics and labels.
diff --git a/docs/querying.md b/docs/querying.md
new file mode 100644
index 00000000..562f8f90
--- /dev/null
+++ b/docs/querying.md
@@ -0,0 +1,111 @@
+# Querying
+
+To get the previously ingested logs back from Loki for analysis, you need a
+client that supports LogQL.
+Grafana will be the first choice for most users, +nevertheless [LogCLI](logcli.md) represents a viable standalone alternative. + +## Clients +### Grafana + +Grafana ships with built-in support for Loki for versions greater than +[6.0](https://grafana.com/grafana/download). + +1. Log into your Grafana, e.g, `http://localhost:3000` (default username: + `admin`, default password: `admin`) +2. Go to `Configuration` > `Data Sources` via the cog icon on the left side bar. +3. Click the big <kbd>+ Add data source</kbd> button. +4. Choose Loki from the list. +5. The http URL field should be the address of your Loki server e.g. + `http://localhost:3100` when running locally or with docker, + `http://loki:3100` when running with docker-compose or kubernetes. +6. To see the logs, click <kbd>Explore</kbd> on the sidebar, select the Loki + datasource, and then choose a log stream using the <kbd>Log labels</kbd> + button. + +Read more about the Explore feature in the [Grafana +docs](http://docs.grafana.org/features/explore) and on how to search and filter +logs with Loki. + +> To configure the datasource via provisioning see [Configuring Grafana via +> Provisioning](http://docs.grafana.org/features/datasources/loki/#configure-the-datasource-with-provisioning) +> and make sure to adjust the URL similarly as shown above. + +### LogCLI +If you do not want (or can) use a full Grafana instance, [LogCLI](logcli.md) is +a small command line application to run LogQL queries against a Loki server. +Refer to its [documentation](logcli.md) for reference. + +## LogQL +Loki has it's very own language for querying logs from the Loki server called *LogQL*. Think of +it as distributed `grep` with labels for selection. + +A log query consists of two parts: **log stream selector**, and a **filter +expression**. For performance reasons you need to start by choosing a set of log +streams using a Prometheus-style log stream selector. + +The log stream selector will reduce the number of log streams to a manageable +volume and then the regex search expression is used to do a distributed grep +over those log streams. + +### Log Stream Selector + +For the label part of the query expression, wrap it in curly braces `{}` and +then use the key value syntax for selecting labels. Multiple label expressions +are separated by a comma: + +`{app="mysql",name="mysql-backup"}` + +The following label matching operators are currently supported: + +- `=` exactly equal. +- `!=` not equal. +- `=~` regex-match. +- `!~` do not regex-match. + +Examples: + +- `{name=~"mysql.+"}` +- `{name!~"mysql.+"}` + +The same rules that apply for [Prometheus Label +Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) +apply for Loki Log Stream Selectors. + +### Filter Expression + +After writing the Log Stream Selector, you can filter the results further by +writing a search expression. The search expression can be just text or a regex +expression. + +Example queries: + +- `{job="mysql"} |= "error"` +- `{name="kafka"} |~ "tsdb-ops.*io:2003"` +- `{instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager` + +Filter operators can be chained and will sequentially filter down the +expression - resulting log lines will satisfy _every_ filter. Eg: + +`{job="mysql"} |= "error" != "timeout"` + +The following filter types have been implemented: + +- `|=` line contains string. +- `!=` line does not contain string. +- `|~` line matches regular expression. +- `!~` line does not match regular expression. 
+ +The regex expression accepts [RE2 +syntax](https://github.com/google/re2/wiki/Syntax). The matching is +case-sensitive by default and can be switched to case-insensitive prefixing the +regex with `(?i)`. + +### Query Language Extensions + +The query language is still under development to support more features, e.g.,: + +- `AND` / `NOT` operators +- Number extraction for timeseries based on number in log messages +- JSON accessors for filtering of JSON-structured logs +- Context (like `grep -C n`) diff --git a/docs/usage.md b/docs/usage.md deleted file mode 100644 index 70547a7f..00000000 --- a/docs/usage.md +++ /dev/null @@ -1,136 +0,0 @@ -# Using Grafana to Query your logs - -To query and display your logs you need to configure your Loki to be a datasource in your Grafana. - -> _Note_: Querying your logs without Grafana is possible by using [logcli](./logcli.md). - -## Configuring the Loki Datasource in Grafana - -Grafana ships with built-in support for Loki as part of its [latest release (6.0)](https://grafana.com/grafana/download). - -1. Log into your Grafana, e.g, http://localhost:3000 (default username: `admin`, default password: `admin`) -1. Go to `Configuration` > `Data Sources` via the cog icon on the left side bar. -1. Click the big `+ Add data source` button. -1. Choose Loki from the list. -1. The http URL field should be the address of your Loki server e.g. `http://localhost:3100` when running locally or with docker, `http://loki:3100` when running with docker-compose or kubernetes. -1. To see the logs, click "Explore" on the sidebar, select the Loki datasource, and then choose a log stream using the "Log labels" button. - -Read more about the Explore feature in the [Grafana docs](http://docs.grafana.org/features/explore) and on how to search and filter logs with Loki. - -> To configure the datasource via provisioning see [Configuring Grafana via Provisioning](http://docs.grafana.org/features/datasources/loki/#configure-the-datasource-with-provisioning) and make sure to adjust the URL similarly as shown above. - -## Searching with Labels and Distributed Grep - -A log filter query consists of two parts: **log stream selector**, and a **filter expression**. For performance reasons you need to start by choosing a set of log streams using a Prometheus-style log stream selector. - -The log stream selector will reduce the number of log streams to a manageable volume and then the regex search expression is used to do a distributed grep over those log streams. - -### Log Stream Selector - -For the label part of the query expression, wrap it in curly braces `{}` and then use the key value syntax for selecting labels. Multiple label expressions are separated by a comma: - -`{app="mysql",name="mysql-backup"}` - -The following label matching operators are currently supported: - -- `=` exactly equal. -- `!=` not equal. -- `=~` regex-match. -- `!~` do not regex-match. - -Examples: - -- `{name=~"mysql.+"}` -- `{name!~"mysql.+"}` - -The [same rules that apply for Prometheus Label Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) apply for Loki Log Stream Selectors. - -### Filter Expression - -After writing the Log Stream Selector, you can filter the results further by writing a search expression. The search expression can be just text or a regex expression. 
- -Example queries: - -- `{job="mysql"} |= "error"` -- `{name="kafka"} |~ "tsdb-ops.*io:2003"` -- `{instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager` - -Filter operators can be chained and will sequentially filter down the expression - resulting log lines will satisfy _every_ filter. Eg: - -`{job="mysql"} |= "error" != "timeout"` - -The following filter types have been implemented: - -- `|=` line contains string. -- `!=` line does not contain string. -- `|~` line matches regular expression. -- `!~` line does not match regular expression. - -The regex expression accepts [RE2 syntax](https://github.com/google/re2/wiki/Syntax). The matching is case-sensitive by default and can be switched to case-insensitive prefixing the regex with `(?i)`. - -### Query Language Extensions - -The query language is still under development to support more features, e.g.,: - -- `AND` / `NOT` operators -- Number extraction for timeseries based on number in log messages -- JSON accessors for filtering of JSON-structured logs -- Context (like `grep -C n`) - -## Counting logs - -Loki's LogQL support sample expression allowing to count entries per stream after the regex filtering stage. - -### Range Vector aggregation - -The language shares the same [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept from Prometheus, except that the selected range of samples contains a value of one for each log entry. You can then apply an aggregation over the selected range to transform it into an instant vector. - -`rate` calculates the number of entries per second and `count_over_time` count of entries for the each log stream within the range. - -In this example, we count all the log lines we have recorded within the last 5min for the mysql job. - -> `count_over_time({job="mysql"}[5m])` - -A range vector aggregation can also be applied to a [Filter Expression](#filter-expression), allowing you to select only matching log entries. - -> `rate( ( {job="mysql"} |= "error" != "timeout)[10s] ) )` - -The query above will compute the per second rate of all errors except those containing `timeout` within the last 10 seconds. - -You can then use aggregation operators over the range vector aggregation. - -### Aggregation operators - -Like [PromQL](https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators), Loki's LogQL support a subset of built-in aggregation operators that can be used to aggregate the element of a single vector, resulting in a new vector of fewer elements with aggregated values: - -- `sum` (calculate sum over dimensions) -- `min` (select minimum over dimensions) -- `max` (select maximum over dimensions) -- `avg` (calculate the average over dimensions) -- `stddev` (calculate population standard deviation over dimensions) -- `stdvar` (calculate population standard variance over dimensions) -- `count` (count number of elements in the vector) -- `bottomk` (smallest k elements by sample value) -- `topk` (largest k elements by sample value) - -These operators can either be used to aggregate over all label dimensions or preserve distinct dimensions by including a without or by clause. - -> `<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]` - -parameter is only required for `topk` and `bottomk`. without removes the listed labels from the result vector, while all other labels are preserved the output. 
by does the opposite and drops labels that are not listed in the by clause, even if their label values are identical between all elements of the vector. - -topk and bottomk are different from other aggregators in that a subset of the input samples, including the original labels, are returned in the result vector. by and without are only used to bucket the input vector. - -#### Examples - -Get top 10 applications by highest log throughput: - -> `topk(10,sum(rate({region="us-east1"}[5m]) by (name))` - -Get the count of logs during the last 5 minutes by level: - -> `sum(count_over_time({job="mysql"}[5m])) by (level)` - -Get the rate of HTTP GET requests from nginx logs: - -> `avg(rate(({job="nginx"} |= "GET")[10s])) by (region)` diff --git a/mkdocs.yml b/mkdocs.yml new file mode 100644 index 00000000..401f8f9c --- /dev/null +++ b/mkdocs.yml @@ -0,0 +1,10 @@ +site_name: Loki +theme: + name: "material" + palette: + primary: "grey" + accent: "amber" + logo: logo.png + favicon: logo.png +markdown_extensions: + - codehilite -- GitLab