Rails JSON event logging to Elasticsearch with Filebeat
Applications that generate a stream of events need a way to store them centrally for search and analysis. Elasticsearch is a decent choice for aggregating event logs, but how do you get the data in there?
Although an application can write events directly to Elasticsearch, this can become a bottleneck as your application scales up.
Elastic offers Filebeat as a way to efficiently load logs into Elasticsearch.
Filebeat is a lightweight Go-based system agent that uses bulk indexing to upload logs in batches, which reduces server load. It also offers a flush timeout for new logs, so that during quiet periods there is a cap on the delay before an event reaches Elasticsearch.
Combining Filebeat with JSON log files gives you a universal file format that is both structured and human-readable, plus the simplicity of buffering log files on local disk on each server.
1. Writing the event logs
Here's a sample Rails snippet for writing events to a JSON-lines log file.
class Event
  include ActiveModel::Serialization

  # Shared logger that appends to e.g. log/events.production.log
  cattr_accessor :logger
  self.logger ||= Logger.new(Rails.root.join("log", "events.#{Rails.env}.log"))

  attr_accessor :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Append the event as a single JSON line; << bypasses Logger's
  # severity/timestamp formatting, so the line stays pure JSON
  def log_to_file
    self.class.logger << "#{attributes.to_json}\n"
  end
end
...
Event.new(id: UUID.generate, user_id: 1, remote_ip: '51.25.1.66', event: 'login').log_to_file
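Each call appends a single line of JSON to the log file, ready for Filebeat to pick up. The resulting line looks something like this (the UUID shown is illustrative):
{"id":"f3c9a1d2-5b7e-4c3a-9e2d-8a1b2c3d4e5f","user_id":1,"remote_ip":"51.25.1.66","event":"login"}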
2. Creating the Elasticsearch template
This step assumes you've already nailed down your event structure. Adding new fields later is easy, so don't worry too much about making it perfect. The following examples are in Ruby.
mapping = {
  :@timestamp => { :type => "date" },
  :id         => { :type => "keyword" },
  :user_id    => { :type => "long" },
  :remote_ip  => { :type => "ip" },
  :event      => { :type => "keyword" },
}
The @timestamp field is required by Filebeat and records when the agent read the log line. You can add your own separate timestamp field to your JSON structure if you like. Now let's upload the template to Elasticsearch. In a Rails app we normally do this from a rake task, to make it easy to update later (see the sketch after the next code block).
client.indices.put_template(
  name: 'eventlogs',
  body: {
    template: 'eventlogs-*',
    settings: { number_of_shards: 3, number_of_replicas: 1 },
    aliases: { 'eventlogs-all' => {} },
    mappings: { doc: { properties: mapping } }
  }
)
The asterisk in the template pattern (eventlogs-*) is special: Elasticsearch uses it to decide when to apply this template later, matching any new index whose name fits the pattern once the * is replaced by the year and month. The alias (eventlogs-all) will be automatically attached to any index created from this template. That name is also how we will refer to the entire database when searching across all months.
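Here's what the rake task mentioned above might look like. A minimal sketch, assuming the elasticsearch-ruby gem; the task name, file location, and ELASTICSEARCH_URL environment variable are illustrative, not prescribed:
# lib/tasks/eventlogs.rake (hypothetical location)
require 'elasticsearch'

namespace :eventlogs do
  desc 'Create or update the eventlogs index template'
  task put_template: :environment do
    # Hypothetical client setup; point this at one of your ES hosts
    client = Elasticsearch::Client.new(url: ENV.fetch('ELASTICSEARCH_URL', 'http://localhost:9200'))

    mapping = {
      :@timestamp => { :type => 'date' },
      :id         => { :type => 'keyword' },
      :user_id    => { :type => 'long' },
      :remote_ip  => { :type => 'ip' },
      :event      => { :type => 'keyword' },
    }

    client.indices.put_template(
      name: 'eventlogs',
      body: {
        template: 'eventlogs-*',
        settings: { number_of_shards: 3, number_of_replicas: 1 },
        aliases: { 'eventlogs-all' => {} },
        mappings: { doc: { properties: mapping } }
      }
    )
  end
end
Re-running the task after you add fields to the mapping updates the template in place; new fields take effect from the next monthly index onwards.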
3. Install & configure the Filebeat agent
This part is quite easy thanks to the simple design of Filebeat. Here's a sample YAML config file:
filebeat.shutdown_timeout: 10s

filebeat.prospectors:
  - input_type: log
    json.keys_under_root: true
    json.message_key:
    paths:
      - /path/to/application/logfiles/events.*.log

processors:
  - drop_fields: # discard unrequired fields normally injected by filebeat
      fields: ["beat", "source", "offset", "input_type"]

output.elasticsearch:
  bulk_max_size: 100   # bulk insert up to 100 rows at once
  flush_interval: 60s  # flush even if fewer than 100 rows are queued after 1m
  index: "eventlogs-%{+YYYYMM}"  # index name based on year and month
  template.enabled: false
  hosts:
    - server1:9200
    - server2:9200
    - server3:9200
And let's get that config uploaded and the agent installed on our Ubuntu servers with this sample Ansible snippet:
- get_url:
    url: https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.1-amd64.deb
    dest: /tmp/filebeat-5.5.1-amd64.deb
- apt:
    deb: /tmp/filebeat-5.5.1-amd64.deb
- template:
    src: filebeat.yml
    dest: /etc/filebeat/filebeat.yml
- service:
    name: filebeat
    state: started
    enabled: yes
After that, Filebeat will already have started reading your log files and trying to upload them. Check /var/log/filebeat/ for debugging information.
If the system shuts down, Filebeat saves its state in a local registry file (/var/lib/filebeat/registry by default) and picks up where it left off on restart. If a log file gets rotated, Filebeat handles this properly by following the new file when it appears.
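Once everything is running, you can confirm that events are arriving by querying the alias from a Rails console. A quick sanity check, reusing an elasticsearch-ruby client like the one in the rake task sketch above:
# Total events indexed across all monthly indexes, via the alias
client.count(index: 'eventlogs-all')['count']

# The ten most recent login events, newest first
client.search(
  index: 'eventlogs-all',
  body: {
    query: { term: { event: 'login' } },
    sort: [{ :@timestamp => { :order => 'desc' } }],
    size: 10
  }
)
Remember that Filebeat batches uploads, so with the sample config an event can take up to a minute to appear during quiet periods.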