In traditional systems, logs are lines of text intended for offline human consumption. With the advent of cloud and big data, there has been a paradigm shift in what can be logged. Systems can now log any piece of structured or unstructured data: application logs, transactions, audit logs, alarms, statistics or even tweets. Add to this the sheer volume of logs, and the earlier approach of human analysis no longer works. Some automated mechanism is needed to analyze logs and extract useful information from them.
The trio of Logstash, Kibana and Elasticsearch is one of the most popular open source solutions for log management. Together, the three products are known as the ELK stack and provide an elegant way to collect, search and visualize logs.
Elasticsearch is a distributed, flexible and powerful RESTful search and analytics engine built on Apache Lucene. It lets you move beyond simple full-text search. It organizes data into indices, each of which can be divided into shards (comparable to partitions in an RDBMS), and each shard can have zero or more replicas. This helps provide near real-time search. Elasticsearch provides a robust set of APIs and a query DSL, in addition to clients for most popular programming languages.
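As a rough illustration of the shard, replica and query DSL concepts above, the following Python sketch builds the JSON bodies that would be sent to the REST API. The index name `app-logs`, the field name `message` and the helper function names are assumptions made for this example. The sketch only prints the requests and does not contact a cluster.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Body for a create-index request (PUT /<index>) setting shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }


def match_query(field: str, text: str, size: int = 10) -> dict:
    """Body for a search request (GET /<index>/_search) using a full-text match query."""
    return {"query": {"match": {field: text}}, "size": size}


if __name__ == "__main__":
    # "app-logs" and "message" are hypothetical names used only for illustration.
    print("PUT /app-logs")
    print(json.dumps(index_settings(shards=3, replicas=1), indent=2))
    print("GET /app-logs/_search")
    print(json.dumps(match_query("message", "timeout"), indent=2))
```

Any HTTP client, or one of the official language clients, could send these bodies. Appropriate shard and replica counts depend on data volume and cluster size.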
Elasticsearch was built from the ground up to handle any kind of data. It can slice and aggregate data on the fly, based on any field in the logs, which turns raw logs into valuable insights.
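To make "aggregate on any field" more concrete, here is a minimal sketch. It assumes log documents with a `level` field that is indexed as an exact value rather than analyzed text. The first function builds a terms aggregation request body. The second computes the same kind of per-value count locally, only to show what the result represents. The function names, the aggregation name `top_values` and the sample documents are made up for this example.

```python
import json
from collections import Counter


def terms_aggregation(field: str, top_n: int = 5) -> dict:
    """Search body that returns no hits, only document counts per value of `field`."""
    return {
        "size": 0,  # skip the matching documents, return aggregation results only
        "aggs": {
            "top_values": {"terms": {"field": field, "size": top_n}}
        },
    }


def count_by_field(docs: list, field: str) -> dict:
    """Local stand-in for a terms aggregation: count documents per field value."""
    values = (doc.get(field) for doc in docs)
    return dict(Counter(v for v in values if v is not None))


if __name__ == "__main__":
    # Hypothetical documents standing in for indexed log events.
    sample_docs = [
        {"level": "ERROR", "service": "checkout"},
        {"level": "INFO", "service": "checkout"},
        {"level": "ERROR", "service": "search"},
    ]
    print(json.dumps(terms_aggregation("level"), indent=2))
    print(count_by_field(sample_docs, "level"))
```

Swapping `"level"` for `"service"` slices the same documents along a different field without any change to how they were stored.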
Kibana is a data visualization engine used along with Elasticsearch. It lets you interact natively with all data in Elasticsearch through custom dashboards, which can be dynamic, shareable and exportable. Kibana’s elegant user interface makes on-the-fly data analysis a breeze, using predesigned or custom dashboards in real time. Kibana is easy to set up and integrates seamlessly with log aggregators such as Logstash and Apache Flume. See below for a sample Kibana dashboard:
Logstash is one of the most popular open source shippers and processors for logs and events. It takes logs and other time-based events from any system as input, processes them, and stores the data in a single place for additional processing. It scrubs logs and parses all data sources into an easy-to-read JSON format, so your logging data can be analyzed in real time. You can then use Kibana to explore and monitor the analytics. The Logstash, Elasticsearch and Kibana pipeline is illustrated below:
The ELK stack is a very powerful tool for monitoring and analyzing cloud-scale logs.
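As a closing illustration of the parsing step described for Logstash above, the following Python sketch turns a raw log line into a JSON document with named fields. It is not Logstash's own implementation or configuration syntax. The log format, the field names and the `parse_line` helper are assumptions chosen for this example.

```python
import json
import re
from typing import Optional

# Hypothetical line format: "<timestamp> <LEVEL> <service> <free-text message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Split one raw log line into named fields, or return None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    sample = "2015-03-01T12:00:00Z ERROR payment-service Connection timed out"
    event = parse_line(sample)
    # Once the line is structured, each field can be indexed and queried separately.
    print(json.dumps(event, indent=2))
```

Once every event has the same named fields, a search engine can filter and aggregate on them, for example counting events per `level` or per `service`.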