Using Splunk for automated log management

SkiStar’s vision is to create memorable mountain experiences as the leading operator of European alpine destinations. SkiStar manages the majority of its IT in-house. Much of the company’s data centre infrastructure was supplied, and is supported, by Proact. A large portion of business operations runs on systems which SkiStar has developed over the past 25 years. These applications cover a wide range of services, from hotel receptions and ski rentals to the online reservation website.
Development is organised using agile methodology and integrated with application maintenance and operations, in keeping with DevOps tenets. One consequence is continuous application updates: SkiStar deploys new code releases practically on a daily basis.

Challenge

Under a DevOps regime, following up on and reviewing each new software release is essential. The review procedure gives developers relevant feedback, for example on response times before and after the release, and on whether bugs targeted for correction in the new release are in fact gone.
Application log files are the most important source for these insights, and they contain large volumes of often impenetrable data. To extract the desired feedback, SkiStar’s developers had to scan the log files manually. But as the data centre operating environment grows increasingly complex, this task becomes more difficult and time-consuming. For instance, SkiStar runs a cluster of load-balanced web servers so that traffic can be redirected between them. This presents developers not with a single device but with perhaps ten, each requiring individual troubleshooting.
The sheer difficulty of getting an overview and identifying patterns in the vast volumes of log data could make it practically impossible to obtain the desired answers. SkiStar therefore went looking for tools to help automate log management and make troubleshooting and software release reviews more efficient.
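To make the manual task concrete, the sketch below shows the kind of before/after release review described above, reduced to its essentials: response-time entries collected from several load-balanced servers, split around a release time. The log format, server names and timestamps are hypothetical, purely for illustration.

```python
# Illustration only: a hand-rolled version of the release review that
# SkiStar's developers previously performed manually. Log format, server
# names and the release time are all hypothetical.
from datetime import datetime
from statistics import mean

# One log per load-balanced web server; each hypothetical line is
# "timestamp status response_ms".
server_logs = {
    "web-01": ["2024-01-10T09:00 200 120", "2024-01-10T14:00 200 95"],
    "web-02": ["2024-01-10T09:30 500 340", "2024-01-10T15:00 200 88"],
}

release_time = datetime.fromisoformat("2024-01-10T12:00")

def parse(line):
    ts, status, ms = line.split()
    return datetime.fromisoformat(ts), int(status), int(ms)

entries = [parse(line) for lines in server_logs.values() for line in lines]
before = [ms for ts, _, ms in entries if ts < release_time]
after = [ms for ts, _, ms in entries if ts >= release_time]
errors_after = sum(1 for ts, status, _ in entries
                   if ts >= release_time and status >= 500)

print(f"mean response before release: {mean(before):.0f} ms")
print(f"mean response after release:  {mean(after):.0f} ms")
print(f"server errors after release:  {errors_after}")
```

Even this toy version must gather and parse every server's file by hand; with ten real servers and large, heterogeneous logs, the approach quickly stops scaling, which is precisely the problem described above.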

Solution

SkiStar decided to use Splunk, which has emerged as one of the premier tools for analysing log data – the operational data generated by applications and systems. Splunk enables real-time decision support and automated analysis of sprawling, often arcane log files.
“We have designed custom dashboards with graphs displaying our key performance metrics in real time, allowing us to track instances of errors as well as application performance,” says Peter Larsson.
SkiStar’s struggles to compile and analyse log files from multiple systems are now history, as all relevant operational data associated with each application is compiled into a single interface.
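The core of that "single interface" idea is interleaving per-system log streams into one chronological view. The sketch below shows the principle with Python's standard library; Splunk of course does this at scale with indexing and search on top. The log lines and system names are hypothetical.

```python
# Sketch of compiling logs from multiple systems into one timeline.
# The streams and their contents are hypothetical examples.
import heapq

web_log = ["2024-01-10T09:00 web-01 GET /booking 200",
           "2024-01-10T09:05 web-01 GET /rental 200"]
api_log = ["2024-01-10T09:02 api-01 POST /reserve 201",
           "2024-01-10T09:06 api-01 POST /reserve 500"]

# Each stream is already time-sorted and every line starts with an ISO 8601
# timestamp, so a lexicographic merge yields a global chronological view.
timeline = list(heapq.merge(web_log, api_log))
for line in timeline:
    print(line)
```

Because ISO 8601 timestamps sort lexicographically in time order, the merged list interleaves the web and API events correctly without parsing dates at all; a real log platform adds field extraction, search and visualisation on top of this basic ordering.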
The log analysis deployment was carried out together with Proact and its team of Splunk experts. SkiStar has a long-standing working relationship with Proact, covering delivery of servers, storage and technical support as well as advice on data centre matters.

“I can hardly see any limitations with Splunk as a platform. We can use any kind of data and make any type of queries on selected data, which is tremendously powerful.”