A Story About The Battle of Logging
Networking and computing systems generate a deluge of logs within seconds. Analyzing these voluminous logs without human intervention plays a crucial role in autonomous management. Like other computing systems, CDNs typically generate millions of logs per second. Dynamic real-time decision systems and error control mechanisms depend heavily on successful log collection to improve end-user experience and content delivery quality.
In this blog, we will compare some leading log management tools (a.k.a. log shippers), not only for CDN systems but for all other computing systems as well.
Widely-Utilized Log Management Tools:
- Logstash
- Elastic Beats
- Fluentd
- Telegraf
- Flume
- Rsyslog
When someone asks for a log shipper, Logstash is usually the first tool on the list. This is because it has many plugins for inputs, outputs, codecs, and filters, which add flexibility and compatibility.
- Event routing is expressed as an “if-else” structure. This means that if your routing is not complex, Logstash keeps the setup simple.
- The number of plugins is enough for most computing systems, and the log parser plugins are particularly strong.
- Logstash has a centralized repository for all plugins, making it easy to access and find a proper plugin.
- Logstash is the native tool of the ELK stack.
- Logstash is written in JRuby, which is a Java implementation of the Ruby programming language. Therefore, it needs a Java runtime, which adds overhead.
- Performance might be an issue compared with lightweight log shippers, although the flexibility of its many features partly compensates for the high memory usage.
- The persistent queue in Logstash can be helpful because it requires little effort, but it can also constrain developers when there are many logs to handle.
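To make the “if-else” routing style concrete, here is a minimal Python sketch. It is not Logstash configuration syntax; the event fields (`type`, `status`) and the output names are hypothetical and only illustrate how conditional routing reads when the rules are few.

```python
def route_event(event: dict) -> str:
    """Pick an output name for an event using chained conditionals.

    Field names and output names are made up for this illustration.
    """
    if event.get("type") == "nginx_access":
        # Nested condition: server errors go to a separate output.
        if int(event.get("status", 0)) >= 500:
            return "alerts"
        return "access_index"
    elif event.get("type") == "app":
        return "app_index"
    else:
        return "catch_all"


events = [
    {"type": "nginx_access", "status": 502},
    {"type": "nginx_access", "status": 200},
    {"type": "app", "message": "started"},
    {"type": "unknown"},
]
routes = [route_event(e) for e in events]
print(routes)
```

With a handful of rules this stays readable; each additional event type or nested condition adds another branch, which is where conditional routing tends to become harder to maintain.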
To solve some performance issues in Logstash, the Elastic team introduced Elastic Beats, a set of shippers that run on end nodes. The main issue with Logstash was high memory usage and low computational performance in heavy use cases. Thanks to Elastic Beats, hierarchical topologies can be used jointly with Logstash.
- Since each Beat is designed for a specific task on end nodes, they are quite lightweight.
- If you have servers with small capacity and relatively few resources, installing a minimal, purpose-built Beat instead of a full Logstash instance is straightforward.
- If you have a heavy-duty streaming process that leads to back-pressure and requires recovery, Beats can cope with it and are compatible with the ELK stack.
- The ELK Stack still needs Logstash as an aggregator behind Elastic Beats. Therefore, if you plan to use only Beats, it might not be suitable for your system.
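The back-pressure mentioned above can be pictured with a small, single-threaded Python sketch. It does not reproduce how Beats implement it; the function name, buffer capacity, and drain batch size are assumptions chosen for illustration. The idea shown is only that a sender with a bounded buffer stalls when the buffer is full, waits for the downstream to drain, and then retries instead of dropping lines.

```python
import queue


def ship_with_backpressure(lines, capacity=3, drain_batch=2):
    """Send lines through a bounded buffer without losing any.

    When the buffer is full, the sender stalls, lets the (simulated)
    downstream consume up to `drain_batch` items, and retries.
    """
    buffer = queue.Queue(maxsize=capacity)
    delivered, stalls = [], 0
    for line in lines:
        while True:
            try:
                buffer.put_nowait(line)
                break
            except queue.Full:
                # Back-pressure: pause sending and let downstream catch up.
                stalls += 1
                for _ in range(drain_batch):
                    if buffer.empty():
                        break
                    delivered.append(buffer.get_nowait())
    # Flush whatever is still buffered at the end of the stream.
    while not buffer.empty():
        delivered.append(buffer.get_nowait())
    return delivered, stalls


lines = [f"log line {i}" for i in range(7)]
delivered, stalls = ship_with_backpressure(lines)
print(len(delivered), stalls)
```

In a real deployment the consumer runs elsewhere and the stall shows up as the shipper slowing its reads; the sketch collapses both sides into one loop so the behavior is deterministic.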
Fluentd, an open-source project of the Cloud Native Computing Foundation (CNCF), is the project most similar to Logstash. According to a presentation at OpenStack Summit 2015, the overall performance of Fluentd and Logstash is quite comparable. Moreover, Fluentd has a complementary tool for end devices, named Fluent Bit, just as Logstash has Elastic Beats.
- In event routing, “tags” are used instead of the “if-else” structure, which makes it easier in complex event routing scenarios.
- Fluentd has over 500 plugins.
- Since Fluentd is a CNCF project, it fits well in projects where Kubernetes, OpenTracing, or Prometheus are used.
- Fluentd offers both in-memory and on-disk buffering, without the fixed-size in-memory queue found in Logstash.
- Docker has a built-in logging driver for Fluentd, so no extra plugin is needed.
- Since Fluentd is based on CRuby, it does not need a Java runtime.
- Plugins are not kept in a centralized repository, so finding the right match might require extra effort.
- Some plugins do not support multi-threading.
- Although Fluentd has many log parsing options, Logstash is more flexible in filtering, parsing, aggregation, etc.
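Tag-based routing can be sketched in Python as an ordered table of tag patterns, where the first matching pattern decides the output. This is a simplification and not Fluentd's matcher: it uses Python's `fnmatch` wildcards, in which `*` also matches across dots, and the tags and output names are hypothetical.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, output) pairs; the first match wins.
# Tags and output names are made up for this illustration.
RULES = [
    ("cdn.edge.error", "alerts"),
    ("cdn.edge.*", "edge_store"),
    ("cdn.*", "cdn_store"),
    ("*", "catch_all"),
]


def route_by_tag(tag, rules=RULES):
    """Return the output of the first rule whose pattern matches the tag."""
    for pattern, output in rules:
        if fnmatchcase(tag, pattern):
            return output
    return None


for tag in ["cdn.edge.error", "cdn.edge.access", "cdn.origin.access", "app.web"]:
    print(tag, "->", route_by_tag(tag))
```

Adding a new route means adding one row to the table rather than another nested branch, which is why tags tend to scale better in complex routing scenarios.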
Telegraf is part of the TICK (Telegraf, InfluxDB, Chronograf, and Kapacitor) stack, but it is also suitable for the ELK stack. It is mostly compared with Metricbeat from the ELK stack because of their functional similarities.
- It is pretty lightweight and, like Beats, written in Go, which makes it easier to set up.
- It has more than 100 plugins for input streaming, including popular tools like Kafka, Redis, RabbitMQ, MySQL, MongoDB, PostgreSQL, Prometheus, Apache, Nginx, etc.
- It can easily be integrated with more than 30 outputs to monitor.
- It is not fully adapted to the ELK stack; thus, unlike Metricbeat, it is harder to set up there.
- Sending data from Telegraf through Logstash and Redis is not easy to set up.
Unlike the other log shippers, Flume was originally designed for collecting, aggregating, and forwarding massive amounts of log data. It is part of the Hadoop ecosystem and stores output data in the Hadoop Distributed File System (HDFS).
- It provides considerably better CPU utilization than Fluentd and Logstash.
- It handles dense data streams well, with low-latency processing.
- Compatibility with HDFS is a plus.
- Lost and duplicate data problems can be solved by pairing it with Kafka.
- Unfortunately, the JVM has a large memory footprint.
- There are not as many plugins as in Logstash and Fluentd, which decreases flexibility.
- There are three components to configure: source, channel, and sink, which might be difficult in some use cases.
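The source, channel, and sink split can be illustrated with a toy Python pipeline. This is not Flume's API or configuration format; the class names, the channel capacity, and the `run_agent` helper are hypothetical and only show how the three pieces relate: the source produces events, the channel buffers them, and the sink consumes them.

```python
from collections import deque


class ListSource:
    """Source: produces events (here, from an in-memory list)."""
    def __init__(self, lines):
        self._lines = list(lines)

    def read(self):
        yield from self._lines


class MemoryChannel:
    """Channel: bounded buffer sitting between source and sink."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        if len(self._events) >= self.capacity:
            return False  # full: caller must drain first
        self._events.append(event)
        return True

    def take(self):
        return self._events.popleft() if self._events else None


class ListSink:
    """Sink: final destination (here, just a list)."""
    def __init__(self):
        self.written = []

    def write(self, event):
        self.written.append(event)


def drain(channel, sink):
    """Move every buffered event from the channel to the sink."""
    while (event := channel.take()) is not None:
        sink.write(event)


def run_agent(source, channel, sink):
    for event in source.read():
        if not channel.put(event):
            drain(channel, sink)  # make room, then retry once
            channel.put(event)
    drain(channel, sink)


lines = [f"event {i}" for i in range(5)]
sink = ListSink()
run_agent(ListSource(lines), MemoryChannel(capacity=2), sink)
print(sink.written)
```

Each of the three parts has its own settings in a real agent, which is the configuration effort the last bullet refers to.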
Rsyslog is an open-source tool for logging and forwarding data, and the default choice on many UNIX-like systems. It is an extended version of Syslog and was first released in 2004.
- It is quite lightweight, simple, and fast.
- It is written in C, so setup and configuration in UNIX systems are straightforward.
- It is not suitable where back-pressure exists. In such cases, Beats or Fluent Bit are a better fit.
- It does not offer enough functionality when there are many logging requirements.
At Medianova, we are always looking for optimized and autonomous ways to shape our systems intelligently, because we care about data and know its value. Get in touch with us to learn more about how Medianova can build and manage an optimized and dedicated CDN for you.