Tyler Montgomery, Trailhead's engineering director, and Shaun Russell, its principal engineer, kick off a conversation with Chris Castle about how Trailhead came about. One of Salesforce's developer evangelists, Josh Burke, wanted to create teaching material for the classes he taught. The idea was that students wouldn't just read some content and take a quiz; they would perform real actions, such as making a dummy user an admin, and an API call would verify that they had completed the task successfully.
Facing a tight deadline of just six weeks before Dreamforce, the Trailhead team built the app using Ruby and Rails, and hosted the site on Heroku. Although they've seen huge growth, a lot of naive technical decisions have left the team splitting the past few years between fixing performance issues and launching new features. Their largest near-outage came about when hundreds of thousands of students in India began using the site at the same time to participate in a competition. Heroku was able to scale up, but the surge exposed many problems. Once the traffic subsided and the code fixes were in place, the Trailhead team was better prepared for future scaling issues.
The conversation concludes with advice on scaling up an application on Heroku. Shaun suggests using an APM tool like New Relic to catch performance problems before they become serious. Tyler suggests setting up an entire pipeline of tooling (PagerDuty, errors logged to Slack, separate environments for staging and production) before continuing work on any feature code.