@daniellee
Last active December 6, 2018 23:09
Loki Overview

Overview

What is Loki?

Loki is a horizontally-scalable, highly-available, multi-tenant, log aggregation system inspired by Prometheus. It is designed to be very cost effective, as it does not index the contents of the logs, but rather a set of labels for each log stream. Plain text logging with labels.

Why Loki?

Almost all existing log aggregation solutions involve using full-text search systems to index logs (usually structured logs); at first glance this has some obvious advantages, with a rich and powerful feature set allowing for complex queries.

However, these existing solutions also have some disadvantages:

  • They are complicated to scale, resource intensive and difficult to operate.
  • The challenges and operational overheads of maintaining these systems have led to a migration to SaaS log aggregation. These SaaS systems have turned out to be excessively expensive. This leads to engineers logging less to cut costs and being forced to leave some of their systems with no log coverage.
  • An increasingly common pattern is the use of time series monitoring in combination with log aggregation. For incident responses, the initial alerting and querying is done with time series metrics and the majority of log queries just focus on a time range and some simple parameters (host, service etc). Therefore most of the advanced search capabilities that cost so much are rarely used.

Loki makes a different design tradeoff: instead of indexing the contents of the log lines in each stream, Loki indexes only the metadata for that log stream (for example server name and service name). This metadata is formatted as Prometheus-style multi-dimensional labels.
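The tradeoff described above can be illustrated with a toy in-memory store. This is only a sketch of the idea, not Loki's implementation: the class name `ToyLogStore` and its methods are hypothetical. The index holds label pairs only; log lines are kept as plain text and are scanned, not indexed, once the matching streams and time range are known.

```python
from collections import defaultdict


class ToyLogStore:
    """Toy illustration only: indexes label sets, never log contents."""

    def __init__(self):
        # stream key (sorted label pairs) -> list of (timestamp, line)
        self.streams = defaultdict(list)
        # inverted index: (label name, label value) -> set of stream keys
        self.index = defaultdict(set)

    def push(self, labels, timestamp, line):
        key = tuple(sorted(labels.items()))
        if key not in self.streams:
            # The index grows only when a new label set (stream) appears,
            # not with every log line.
            for pair in key:
                self.index[pair].add(key)
        self.streams[key].append((timestamp, line))

    def query(self, matchers, start, end, contains=""):
        # Step 1: resolve candidate streams through the label index alone.
        candidates = [self.index.get(pair, set()) for pair in matchers.items()]
        keys = set.intersection(*candidates) if candidates else set(self.streams)
        # Step 2: brute-force scan of the plain-text lines in those streams.
        results = []
        for key in keys:
            for ts, line in self.streams[key]:
                if start <= ts <= end and contains in line:
                    results.append((ts, dict(key), line))
        return sorted(results, key=lambda r: (r[0], r[2]))


store = ToyLogStore()
store.push({"host": "a", "service": "api"}, 1, "GET /ok")
store.push({"host": "a", "service": "api"}, 2, "error: timeout")
store.push({"host": "b", "service": "api"}, 2, "error: disk")
errors_on_a = store.query({"host": "a"}, start=0, end=10, contains="error")
```

In this sketch the index has three entries however many lines are pushed, which is the source of the cost savings the text describes; the price is that the text filter in step 2 is a linear scan.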

This has some major advantages:

  • Loki ingestion uses the same service discovery and label relabelling libraries as Prometheus, meaning time series/metrics in Prometheus and Loki log streams can have the same labels. Loki minimises the cost of context switching between logs and metrics, which helps reduce incident response times and improves the user experience.
  • By storing plain text logs and only indexing the metadata, storage is hugely simplified. This makes Loki simpler to operate and provides some significant cost savings. This means engineers can store all of their logs.
  • Prometheus labelling goes hand-in-hand with Kubernetes label selection. This makes Loki a particularly good fit for storing Kubernetes logs.
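To make the shared-label point concrete, here is a minimal sketch that renders a label set as a Prometheus-style selector string. The helper name `to_selector` and the example labels are hypothetical, and it assumes the log query side accepts the same selector form as the metrics side. With identical labels on both, the same selector can identify a metric series and the matching log stream.

```python
def to_selector(labels):
    """Render a label dict as a Prometheus-style selector, e.g. {app="api"}."""

    def escape(value):
        # Escape backslashes, double quotes and newlines inside label values.
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    inner = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return "{" + inner + "}"


# Hypothetical Kubernetes-style labels attached to both metrics and logs.
pod_labels = {"namespace": "prod", "app": "api"}
selector = to_selector(pod_labels)
```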

Components

Loki consists of 3 components:

  • loki is the main server, responsible for storing logs and processing queries.
  • promtail is the agent, responsible for gathering logs and sending them to loki.
  • Grafana for the UI.

The promtail Agent

Promtail is a daemon that discovers targets, produces metadata labels for them and tails their log files to produce streams of logs. It uses the same service discovery and label relabelling libraries as Prometheus.
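The tail-and-label loop can be sketched as follows. This is an illustration under simplifying assumptions, not promtail's actual code: the function names are hypothetical, targets are given as a fixed mapping from file path to labels rather than discovered, and a real agent would also persist offsets and send the entries to the server.

```python
def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Keep only complete lines; a partial trailing line waits for the next poll.
    end = data.rfind(b"\n") + 1
    text = data[:end].decode("utf-8", errors="replace")
    lines = text.split("\n")[:-1]
    return lines, offset + end


def tail_once(targets, offsets):
    """Poll every target once.

    targets: {path: labels}; offsets: {path: byte offset}, updated in place.
    Returns a list of (labels, line) entries ready to be shipped.
    """
    entries = []
    for path, labels in targets.items():
        lines, offsets[path] = read_new_lines(path, offsets.get(path, 0))
        entries.extend((labels, line) for line in lines)
    return entries
```

Calling `tail_once` repeatedly with the same `offsets` dict yields each complete line exactly once, tagged with the labels of the file it came from.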

Grafana and Loki

Grafana 6.0 will have native support for Loki (it is already included in the latest master). The new Explore UI has custom support for querying log streams from Loki. For more on the Explore UI and its Prometheus and Loki support, see the Grafana documentation.
