At Simon we use Jenkins as our universal hammer. Its versatility is unparalleled, and we've gotten more mileage out of it than we ever imagined. Here's how we use it and what makes it great.
You probably use a continuous integration application. Chances are you’re using Jenkins. We used it at Etsy and at previous startups, and we even saw it in use at Apple.
At Simon, we use Jenkins for continuous integration, but we also use it for much more than that. It functions as an asynchronous job queue, a workflow manager for our Elastic MapReduce job flows, a context for executing one-off, long-running scripts, a vehicle for continuous deployment, and more.
Before diving into how we use Jenkins to support these use cases, let’s take a step back and look at Jenkins as a piece of engineering infrastructure: how is it so versatile? Jenkins is a deployed, multi-environment, production-grade application that can execute any scripts or commands that reside in our primary GitHub repository. Specifically, Jenkins is:
Audited: A fully monitored execution context that alerts on failure and handles logging of stdout / stderr plus standard application logs.
Productionized: Jobs can execute in either a sandboxed test mode or in full production, with access to databases, caches, S3, and our data pipelines.
Secure: Jenkins runs inside our VPC and sits within the same VPN as the rest of our system.
For anything that doesn’t make sense to run on our web application servers or in our Hadoop data pipe, we ask, “Can we use Jenkins for this?”
And most of the time, the answer is “yes”.
Jenkins use cases
We use Jenkins to run tasks that otherwise would have required components such as Luigi, Oozie, Deployinator, Gearman, AWS Lambda, and even cron. It allows us to operationalize one system instead of seven.
Asynchronous job queue. Jenkins jobs can be deployed to run in full production context with access to our OLTP production database. We have standard patterns we employ to use simple database queueing to coordinate jobs via Jenkins.
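One such pattern can be sketched with an in-memory SQLite stand-in for the production database; the `job_queue` table, its columns, and the payload value are all hypothetical, not our real schema:

```python
import sqlite3

# Stand-in for the production OLTP database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_queue ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT,"
    " status TEXT DEFAULT 'pending')"
)
conn.execute("INSERT INTO job_queue (payload) VALUES ('export-report')")
conn.commit()

def claim_next_job(conn):
    """Claim the oldest pending job and mark it running.

    Against MySQL or Postgres you'd make the claim race-safe with
    SELECT ... FOR UPDATE; a plain transaction is enough for this sketch.
    """
    with conn:  # wraps the read-then-update in one transaction
        row = conn.execute(
            "SELECT id, payload FROM job_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE job_queue SET status = 'running' WHERE id = ?", (row[0],)
        )
        return row
```

A scheduled Jenkins job simply calls `claim_next_job` in a loop until the queue drains, so multiple triggers of the same job stay coordinated through the database rather than through Jenkins itself.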
Data pipe dependency management. Jenkins supports post-build actions that can trigger downstream jobs on successful completion. We have distinct jobs for each phase of our ETL pipes: a successful data Extraction from an originating source triggers a data Transform in our EMR pipe, coordinated via a second job. Once that second job completes, Jenkins runs a third and final job that Loads the results into our data warehouse.
Jenkins also supports fan-out dependencies via plugins such as the Multijob package. Fan-in is trickier, so we fall back on time-based sequencing.
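The trigger cascade can be modeled in a few lines. The job names and the `run_chain` helper below are hypothetical: this mimics how post-build triggers behave rather than calling any Jenkins API, and a list-valued downstream entry is exactly the fan-out case:

```python
# Hypothetical job names standing in for our real Extract/Transform/Load jobs.
DOWNSTREAM = {
    "extract-source": ["transform-emr"],
    "transform-emr": ["load-warehouse"],
    "load-warehouse": [],
}

def run_chain(start, run_job):
    """Run `start`, then cascade to downstream jobs the way Jenkins
    post-build triggers do: only a successful build triggers its children."""
    queue = [start]
    completed = []
    while queue:
        job = queue.pop(0)
        if run_job(job):  # run_job returns True on build success
            completed.append(job)
            queue.extend(DOWNSTREAM.get(job, []))
    return completed
```

A failure mid-chain stops the cascade, which is exactly the behavior we rely on: a broken Transform never triggers a Load.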
One-off long running scripts. Jenkins runs in our production environment on our production network on the same VPC as the rest of our AWS infrastructure. Long-running scripts that are data-dependent and don’t fit into our MapReduce frameworks are great fits for Jenkins. Why run something locally that may take hours, consume network bandwidth, and require your localhost to be available for an extended period? Call up Jenkins, and you get fast pipes and a fully logged environment.
Simple admin interface. In its rawest form, Jenkins can set shell variables via its web frontend as inputs to your shell scripts. We’ve used this functionality to build simple GUIs that let the entire company (engineers and non-engineers alike) publish blog posts or re-run analyses with new inputs, among other things.
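Jenkins exposes each build parameter entered in the web form to job steps as an environment variable of the same name, so the backing script stays trivial. A minimal sketch, with `POST_SLUG` and `DRY_RUN` as assumed parameter names:

```python
def blog_publish_args(environ):
    """Turn Jenkins build parameters (surfaced as environment variables)
    into arguments for a hypothetical publish script.

    POST_SLUG and DRY_RUN are assumed parameter names, not real ones;
    in a live job you'd pass in os.environ.
    """
    return {
        "slug": environ.get("POST_SLUG", ""),
        # Default to a dry run so a blank form can't publish by accident.
        "dry_run": environ.get("DRY_RUN", "true").lower() == "true",
    }
```

Defaulting to a dry run is the design choice that makes the form safe to hand to non-engineers.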
Crons that require email. At Etsy, we had an automated listserv that blasted out source code diffs on every check-in. GitHub provides post-checkin email hooks, but their content is limited to the commit log and author. To get full diffs, we had to write a script, and of course that script runs on Jenkins. It blasts our commit logs, with full source deltas, to our engineering list.
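The core of such a script is just message assembly. In the real job the diff text would come from something like `git show`, and the message would go out via `smtplib`; the names and addresses here are placeholders:

```python
from email.message import EmailMessage

def build_commit_email(author, subject_line, diff_text, to_addr="eng@example.com"):
    """Build a commit-notification email carrying the full source delta.

    `author`, `subject_line`, and `to_addr` are placeholders; a real job
    would pull them from `git log` and send the result with smtplib.
    """
    msg = EmailMessage()
    msg["From"] = author
    msg["To"] = to_addr
    msg["Subject"] = subject_line
    # Plain-text body: the commit's full diff, not just its log line.
    msg.set_content(diff_text)
    return msg
```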
Our engineering tendencies favor intense pragmatism: use existing and/or basic tools we already have to get the job done. Better to fail fast with a known quantity than to overbuild, only to learn that requirements still aren’t met, the operational burden is sky-high, or the underlying technology isn’t fully baked.
We don’t know what the future holds for Jenkins and us; we plan to push it as hard as possible. At the same time, we expect to outgrow Jenkins across specific use cases as we expand beyond its limits.
Then, we’ll ask, “Do we really need something other than Jenkins for this?”