Of Book Clubs and Discourse “There is a road from the eye to the heart that does not go through the intellect.” ― G.K. Chesterton Every organization lives and dies by its culture. For many years at Chargify we have put a lot of effort into making our engineering organization productive and healthy. This is
Opening EC2 firewall ports via Google login You've set up a new server in AWS EC2, and you want your team to be able to access it by SSH. This means authorising access via an EC2 Security Group. Best practice dictates you don't leave this port open to the entire world of hackers and miscreants, but how
Upgrading Elasticsearch v2 to v5 (without downtime) At Chargify, we lean heavily on Elasticsearch for data storage as well as search, but upgrading Elasticsearch major versions requires a full cluster restart. We came up with a plan for upgrading with no noticeable downtime, although the cluster was placed into "read only" mode for some time. Pre-upgrade codebase
Live Reindexing with Elasticsearch In order to upgrade Elasticsearch in-place from v2 to v5, one of the roadblocks we faced was that the built-in upgrade feature did not support indexes created with v1 of Elasticsearch. We had several of those left over from a previous upgrade from v1 to v2. So prior to
Keeping Datadog In (Version) Control I recently migrated our system monitoring and analytics from the TICK stack [https://www.influxdata.com/time-series-platform/] to Datadog [https://www.datadoghq.com/]. One of the things I really wanted to get better about in this transition was keeping more of our config version controlled somehow. This post will just
Rails JSON event logging to Elasticsearch with Filebeats Applications that generate a stream of events need a way to store them centrally for search and analysis. Elasticsearch is a decent choice for aggregating event logs, but how do you get the data in there? Although an application can write events directly to Elasticsearch, this can become a bottleneck
Automating EC2 instances for database backups 😱 A nightmare scenario for any business: needing a backup that isn't there! Procedures for performing and verifying backups have always been necessary, and if they run regularly and automatically, with monitoring for any failures, they give peace of mind and a high degree of confidence in the disaster recovery plans.
Making Dangerous Hard: Admin Functions If you read my last post about internal tools [https://inside.chargify.io/2017/07/10/make-your-tools-easier-to-use-than-the-alternative/] you will have noticed that I used an internal tool I built, which we call "Admin Functions", as the example throughout the post. This post will go a little more in-depth into what those
Rails background job queue tips part 3 In this final part, the locking techniques in Chargify's background job system will be explained. Commonly you need a job to run only one at a time, no matter how many there are waiting to run. We refer to this as job locking, where a job running excludes another one
Rails background job queue tips part 2 Here are a few more interesting aspects of our background job system. Make sure to check out part 1 first if you haven't already. Callbacks We've created several callbacks for job classes to use: * on_busy when a job lock could not be acquired within the timeout. This is used to
Make Your Internal Tools Easier to Use Than The Alternative One of the largest parts of my role as an Operations Engineer is creating "internal tools". This means coding features in our app (or building totally separate apps) that a customer never sees, but that are instead used by team members to perform administrative duties, or support customers. I strive to
Rails background job queue tips part 1 Many Rails apps have some kind of job queue. While the "out of the box" experience with Rails and Sidekiq keeps getting better over time, here's a collection of ways we've improved on the basics at Chargify, to help keep things running smoothly and efficiently in our background job queues.
How we use Redis with High Availability Redis is an in-memory database which has become the default choice for background job queues in Rails applications, because it helps with scaling to large numbers of background job workers. The Chargify application was no exception, having started out years ago with the SQL-backed delayed_job, moving shortly thereafter to
Don't Assume You Know What The Problem Is TL;DR: Even if you're 100% sure you know what the problem you're trying to solve is, don't act like you're sure. Also, this story is mostly pictures, just read it. -------------------------------------------------------------------------------- Two small points that will help you understand this story. 1. "Incident Response" is a little Slack bot
Getting Started Welcome to the Chargify Engineering Blog We'd love to write and tell you about all the things our engineering team does behind the scenes that make our lives (and our product) awesome. Coming really soon. Promise!