Wednesday, September 12, 2012

Setting up a central log analysis architecture (with syslog and splunk)

Introduction

The larger the system, the more of a headache troubleshooting becomes. One of the things that really helps, and which is relatively easy to achieve, is making sure that all logs are accessible from a central place where they can be analysed and queried in real time.

Setting this up is easier than you might think, and in this post I will talk you through the process using rsyslog (the native syslog daemon on many Linux distros these days) and Splunk.

Overview

The diagram below gives a high-level overview of the architecture.


Typically, there will be many (logging) clients in the solution: web and application servers, database servers, management servers and so on. These clients run one or more applications which can either log their messages directly to the (local) syslog daemon or write them to one or more log files. The syslog daemon is configured by default to write incoming log messages to a number of local log files, but it can easily be configured to forward these messages to a remote syslog server as well.

The log server takes these inbound messages and stores them in a convenient folder structure. These local log files can then be indexed by the Splunk server, which allows for very powerful analysis of the data through a web interface.

If an application on a client cannot write to the syslog daemon but writes to local log files instead, the rsyslog daemon can be configured to monitor these log files and submit their contents as syslog messages to the log server. Not all syslog daemons can do this, though, and even rsyslog has limited capabilities in this regard; for example, the name of the input log file must be static. Another (optional) way of forwarding messages to the Splunk server is the Splunk Forwarder. Personally I prefer syslog, as it is a leaner and more proven method and all messages on the log server are handled in the same way, but it is always good to have an alternative, right?

Configuration

Let's start with setting up the central log server. We assume an Ubuntu 12.04 instance, which ships with rsyslog by default, but the setup on other flavours should be identical or very similar.

Accept inbound messages from remote servers

To accept inbound messages from remote servers, ensure that the following directives are present in /etc/rsyslog.conf:
### Load TCP and UDP modules
$ModLoad imtcp
$ModLoad imudp


Rsyslog knows the concepts of templates and rulesets, which allow you to specify how particular messages are dealt with. In this case we define a template that determines where messages from remote clients end up, and separate rulesets to distinguish remote messages from local ones.

### Templates
# log every host in its own directory
$template RemoteHost,"/mnt/syslog/hosts/%HOSTNAME%/%$YEAR%/%$MONTH%/%$DAY%/%syslogfacility-text%.log"

### Rulesets
# Local Logging
$RuleSet local
# Follow your own preferences here (see the example below)

# use the local RuleSet as default if not specified otherwise
$DefaultRuleset local

# Remote Logging
$RuleSet remote
*.* ?RemoteHost  
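
As an illustration of what the local ruleset might contain, here are a few rules borrowed from Ubuntu's default configuration (adjust to your own preferences; these lines belong under $RuleSet local):

# Example local rules (taken from Ubuntu's stock rsyslog configuration)
auth,authpriv.*                 /var/log/auth.log
*.*;auth,authpriv.none          -/var/log/syslog
kern.*                          -/var/log/kern.log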


Then bind these rule sets to a particular listener:

### Listeners
# bind ruleset to tcp listener and activate it
$InputTCPServerBindRuleset remote
$InputTCPServerRun 5140
$InputUDPServerBindRuleset remote
$UDPServerRun 514
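
As an aside, on recent rsyslog versions the same modules and listeners can also be expressed in the newer RainerScript syntax. A rough equivalent (a sketch, not tested here):

# RainerScript equivalent of the legacy directives above
module(load="imtcp")
module(load="imudp")
input(type="imtcp" port="5140" ruleset="remote")
input(type="imudp" port="514" ruleset="remote")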


This is it: all messages coming in on the TCP or UDP listeners will now be stored in their own directory structure, conveniently grouped by host and date.
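
With a couple of clients logging, the tree on the log server will look something like this (the hostnames are hypothetical):

/mnt/syslog/hosts/
  appsrv1/2012/09/12/daemon.log
  appsrv1/2012/09/12/local0.log
  dbsrv1/2012/09/12/auth.log
  dbsrv1/2012/09/12/cron.log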

The syslog daemons on the clients in turn need to be configured to send their messages to this syslog server.

In this case we follow the convention of adding configuration snippets to the /etc/rsyslog.d directory rather than modifying the central /etc/rsyslog.conf file. By default, all *.conf files in this directory are included.
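
This works because the stock /etc/rsyslog.conf contains an include directive along these lines:

# from /etc/rsyslog.conf
$IncludeConfig /etc/rsyslog.d/*.conf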

In order to read a number of log files and process them as syslog messages, we add the following config file:

File: /etc/rsyslog.d/51-read-files.conf
#  Read a few files and send them to the central server.
#

# Load module
$ModLoad imfile # needs to be done just once
# Nginx Access log
$InputFileName /var/log/nginx/access.log
$InputFileTag nginx-access:
$InputFileStateFile stat-nginx-access
$InputFileSeverity info
$InputFileFacility local0
$InputRunFileMonitor
# Nginx Error log
$InputFileName /var/log/nginx/error.log
$InputFileTag nginx-error:
$InputFileStateFile stat-nginx-error
$InputFileSeverity error
$InputFileFacility local0
$InputRunFileMonitor


This will pick up the nginx access and error log files and process them as syslog messages.
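The same input block can be repeated for any other log file; each monitored file does need its own unique $InputFileStateFile, which rsyslog uses to remember how far it has read. For example, a hypothetical application log (name, path and facility are made up, adjust to your situation):

# Hypothetical application log
$InputFileName /var/log/myapp/app.log
$InputFileTag myapp:
$InputFileStateFile stat-myapp
$InputFileSeverity info
$InputFileFacility local1
$InputRunFileMonitor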

To forward these messages to the syslog server, we add the following file:

File: /etc/rsyslog.d/99-forward.conf
#  Forward all messages to central syslog server.
#
$ActionQueueType LinkedList   # use asynchronous processing
$ActionQueueFileName srvrfwd   # set file name, also enables disk mode
$ActionResumeRetryCount -1     # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down
*.* @@log.mydomain.com:5140 # Do the actual forward using TCP (@@) 


This will forward all messages to the log server using TCP. UDP (using a single @) or the more reliable RELP protocol (using :omrelp:) can be used as well.
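
For completeness, the alternative forward actions would look like this (the RELP variant assumes the omrelp module is available, e.g. via the rsyslog-relp package; the port number is just an example):

# UDP: fire-and-forget, lower overhead but messages can get lost
*.* @log.mydomain.com:514
# RELP: reliable delivery, requires the omrelp output module
$ModLoad omrelp
*.* :omrelp:log.mydomain.com:2514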

After restarting the syslog daemons (sudo service rsyslog restart on Ubuntu), this basically takes care of collecting all logs in one central location. This in itself is already very useful.

But having Splunk running on top of this is even more powerful. Splunk can be downloaded for evaluation purposes and can be used as a free version (with a few feature limitations) with an indexing volume of up to 500 MB per day. Note that this does not limit the total index size, only the daily indexing volume.

Once installed, it allows you to log in, change your password and add data to the Splunk indexer. After all, this is what you want to do.

After clicking Add Data, you'll be greeted with the following screen:



On this page, select 'From files and directories'. This takes you to the Preview data dialogue, which lets you inspect the data before you add it to a Splunk index. Select Skip preview and click Continue.


This takes you to the Home > Add data > Files & directories > Add new view. Select the default source ('Continuously index data from a file or directory this Splunk instance can access') and fill in the path to your data.

Normally you would index everything in a particular subdirectory (e.g. /mnt/syslog/hosts/appsrv1/2012/09/12/*) or set of directories (e.g. /mnt/syslog/hosts/.../*). It might however be useful to add individual files one by one in order to control how each of them is dealt with by Splunk.

Now select More Settings.

This enables you to override Splunk's default settings for Host, Source type, and Index. To derive the host automatically from the path in which a log file is stored, select 'segment in path' with value 4: in /mnt/syslog/hosts/%HOSTNAME%/..., segment 1 is mnt, 2 is syslog, 3 is hosts, and 4 is the hostname.



Note that this value has to match the position of %HOSTNAME% in the rsyslog template definition:
$template RemoteHost,"/mnt/syslog/hosts/%HOSTNAME%/%$YEAR%/%$MONTH%/%$DAY%/%syslogfacility-text%.log"

What about the Source type and Index settings? The source type of an event tells you what kind of data it is, usually based on how it is formatted; examples of source types are access_combined and cisco_syslog. This classification lets you search for the same type of data across multiple sources and hosts. The index setting tells Splunk where to store the data. By default everything goes into main, but if you have many types of data you might want to consider partitioning it into different indexes.
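
If you prefer to configure this outside the web interface, the same monitor can also be defined in a Splunk inputs.conf file. A sketch (the sourcetype and index values are just examples):

File: $SPLUNK_HOME/etc/system/local/inputs.conf
[monitor:///mnt/syslog/hosts]
host_segment = 4
sourcetype = syslog
index = main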

Click Save and you are ready to open the search app. If you want to get your feet wet with the search app, have a look at this tutorial. Happy splunking!
