Rails JSON event logging to Elasticsearch with Filebeat

Applications that generate a stream of events need a way to store them centrally for search and analysis. Elasticsearch is a decent choice for aggregating event logs, but how do you get the data in there?

Although an application can write events directly to Elasticsearch, this can become a bottleneck as your application scales up.
Elastic offers Filebeat as a way to load logs into Elasticsearch efficiently.

Filebeat is a lightweight, Go-based system agent that uploads logs using bulk indexing requests, which reduces server load. It also has a flush timeout for new logs, so there is a cap on the delay before an event reaches Elasticsearch during quiet periods.

Combining Filebeat with JSON log files gives you the benefits of a universal, human-readable file format with structured logging, plus the simplicity of buffering logs on disk on each individual server.

Event data log flow

1. Writing the event logs

Here's a sample Rails snippet for writing events to a JSON-lines log file.

class Event
  # Shared logger writing to e.g. log/events.production.log
  cattr_accessor :logger
  self.logger ||= Logger.new(Rails.root.join("log", "events.#{Rails.env}.log"))

  include ActiveModel::Serialization

  attr_accessor :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  def log_to_file
    # `<<` bypasses the Logger formatter, so the JSON line is written verbatim
    self.class.logger << "#{attributes.to_json}\n"
  end
end

...

Event.new(id: UUID.generate, user_id: 1, remote_ip: '51.25.1.66', event: 'login').log_to_file
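For reference, each call appends one self-contained JSON document per line. A minimal sketch of that serialization, using SecureRandom here in place of a UUID library:

```ruby
require 'json'
require 'securerandom'

# The same attribute hash the Event model receives
attributes = { id: SecureRandom.uuid, user_id: 1, remote_ip: '51.25.1.66', event: 'login' }

# to_json produces a single JSON-lines record, ready for Filebeat to pick up
line = "#{attributes.to_json}\n"
```

Every line parses on its own, which is exactly what Filebeat's JSON decoding expects.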

2. Creating the Elasticsearch template

This step assumes you've already nailed down your event structure. Adding new fields later is easy, so don't worry too much about making it perfect. The following examples are in Ruby.

mappings = {
  :@timestamp => { :type => "date" },
  :id =>         { :type => "keyword" },
  :user_id =>    { :type => "long" },
  :remote_ip =>  { :type => "ip" },
  :event =>      { :type => "keyword" },
}
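For illustration, here's a hash that would satisfy this mapping (the values are made up, and @timestamp is added by Filebeat rather than by your application):

```ruby
event = {
  '@timestamp' => '2017-08-01T12:00:00.000Z',             # date    (set by Filebeat)
  'id'         => '1b4e28ba-2fa1-11d2-883f-b9a761bde3fb', # keyword
  'user_id'    => 1,                                      # long
  'remote_ip'  => '51.25.1.66',                           # ip
  'event'      => 'login'                                 # keyword
}
```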

The @timestamp field is required by Filebeat and records when the agent read the log line. You can add your own separate timestamp field to the JSON structure if you like. Now let's upload the template to Elasticsearch. In a Rails app we normally do this from a rake task, to make it easy to update later.

client.indices.put_template(
  name: 'eventlogs', body: {
    template: 'eventlogs-*',
    settings: { number_of_shards: 3, number_of_replicas: 1 },
    aliases:  { 'eventlogs-all' => {} },
    mappings: { doc: { properties: mappings } }
  }
)

The asterisk in the template pattern (eventlogs-*) is how Elasticsearch decides when to apply this template: any new index whose name matches the pattern, such as one with the asterisk replaced by a year and month, gets the template's settings and mappings. The alias (eventlogs-all) is automatically attached to every index created from this template. That name is also how we'll refer to the whole dataset when searching across all months.
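To make the naming concrete, here's a quick sketch: the Filebeat config names indexes by year and month, and Elasticsearch applies the template to any new index that matches the eventlogs-* pattern (glob-style matching):

```ruby
# Filebeat expands %{+YYYYMM} using the event's timestamp, so an event
# from August 2017 lands in the index "eventlogs-201708".
index_name = "eventlogs-#{Time.utc(2017, 8, 1).strftime('%Y%m')}"

# Elasticsearch applies the template because the new index name
# matches the template's glob pattern:
matches = File.fnmatch('eventlogs-*', index_name)
```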

3. Install & configure the Filebeat agent

This part is quite easy thanks to Filebeat's simple design. Here's a sample YAML config file:

filebeat.shutdown_timeout: 10s
filebeat.prospectors:
- input_type: log
  json.keys_under_root: true # lift the JSON fields to the top level of the event
  paths:
    - /path/to/application/logfiles/events.*.log

processors:
- drop_fields: # discard extra fields that Filebeat injects by default
    fields: ["beat", "source", "offset", "input_type"]

output.elasticsearch:
  bulk_max_size: 100            # bulk-insert up to 100 events at once
  flush_interval: 60s           # flush a smaller batch after 60 seconds
  index: "eventlogs-%{+YYYYMM}" # index name based on Year and Month
  template.enabled: false
  hosts:
    - server1:9200
    - server2:9200
    - server3:9200

And let's upload that config and install the agent on our Ubuntu servers with this sample Ansible snippet:

- get_url: url=https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.1-amd64.deb dest=/tmp/filebeat-5.5.1-amd64.deb
- apt: deb=/tmp/filebeat-5.5.1-amd64.deb
- template: dest="/etc/filebeat/filebeat.yml" src="filebeat.yml"
- service: name=filebeat state=started enabled=yes

After that, Filebeat will already have started reading your log files and will be trying to upload them. Check /var/log/filebeat/ for debugging information.

If the system shuts down, Filebeat saves its state to a local file (/var/lib/filebeat/registry by default) and picks up where it left off on restart. If a log file gets rotated, Filebeat handles this properly by following the new file when it appears.