How Cortex helps ensure the reliability of BigCommerce’s global SaaS platform

BigCommerce follows a democratic process for technical decision making; everyone has the ability to make an impact. To keep teams aligned on the engineering standards they introduce, they use Cortex.

Introduction

BigCommerce (Nasdaq: BIGC) is an Open SaaS composable ecommerce platform enabling fast-growing and established B2C and B2B brands and retailers across 150 countries to build, innovate and grow their businesses online. From their founding in 2009 through IPO in 2020, BigCommerce’s customer-first core value has long been mirrored by an internal ethos of boundless growth for employees. Encouraging innovation from all parts of the business has contributed heavily to BigCommerce’s recent flurry of product, customer, and market wins.

But to preserve a system where every employee is empowered to innovate, every employee must also adhere to standards designed to reduce subjectivity in decision making and ensure interoperability between systems. This is the story of how BigCommerce’s engineering team uses Cortex to drive best practice in building the open and flexible platform experience BigCommerce customers need.

When spreadsheets won’t cut it anymore

Shaun McCormick is a Principal Engineer at BigCommerce responsible for driving alignment between the Architecture and Platform disciplines in a model where teams, rather than titles, hold the most influence. Shaun explains, “We don’t have a top-down technical decision-making structure at BigCommerce. That makes us more agile, but also requires us to be rigorous about standards of quality and measurement.”

Initially, just codifying standards wasn’t difficult. A simple spreadsheet could do the trick, given the team’s existing organization by disciplines like observability, security, and quality. But as the team surpassed 250 engineers, questions like “Is this service up to par?” became harder to answer. It was clear the organization needed a common baseline of measurement free of subjective interpretation.

“To provide enterprise grade quality, you have to get feelings out of the system,” Shaun asserts. “That means you can’t think you’re performing well—you can’t hope your service is moving fast—you have to measure, and constantly integrate tons of data to do so accurately. The only way to do that at scale is to be rigorous with standards and thoughtful with automation that eliminates manual toil.”

Why BigCommerce chose Cortex

Before BigCommerce found Cortex, they were looking to improve in four key areas of engineering excellence:

  1. Continuous standards enforcement: Ensure every service is aligned with compliance definitions both before and after it reaches production.

  2. Strong post-incident playbooks: Once incidents are resolved, take necessary measures to ensure they aren’t repeated.

  3. Always-on DORA Metrics: Connect already strong observability data with more information on cycle time, deploy frequency, and change failure rate.

  4. Workflow optimization: Uncover bottlenecks that lead to disruption in team-wide metrics like PR cycle time.
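As a rough illustration of the third item, the three metrics it names can be derived from a list of deployment records. This is a minimal sketch only: the `Deployment` record shape, the `dora_summary` function, and the sample data are hypothetical and do not reflect Cortex's data model or BigCommerce's actual pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    # Hypothetical record shape, not a Cortex or BigCommerce schema.
    first_commit_at: datetime
    deployed_at: datetime
    caused_failure: bool


def dora_summary(deployments, window_days):
    """Summarize three DORA-style metrics over a fixed window of days."""
    if not deployments:
        raise ValueError("need at least one deployment")
    # Cycle time here is measured from first commit to deployment.
    cycle_hours = [
        (d.deployed_at - d.first_commit_at).total_seconds() / 3600
        for d in deployments
    ]
    failures = sum(1 for d in deployments if d.caused_failure)
    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_cycle_time_hours": median(cycle_hours),
        "change_failure_rate": failures / len(deployments),
    }


# Made-up sample data: four deployments in a two-day window.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=4), False),
    Deployment(start, start + timedelta(hours=10), True),
    Deployment(start, start + timedelta(hours=6), False),
    Deployment(start, start + timedelta(hours=8), False),
]
summary = dora_summary(sample, window_days=2)
print(summary)
```

In practice the records would come from source control and deployment tooling rather than a hand-built list; the point is only that each metric reduces to simple arithmetic once the underlying events are collected in one place.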

Unsure which tools would address all four areas of interest, the team evaluated several cloud-based vendors with different specialties. While multiple tools could meet some requirements, BigCommerce identified two key characteristics that set Cortex apart from the rest: flexibility in data and scoring frameworks, and automation that would ensure continuous alignment.

The importance of data model flexibility

Shaun elaborates on his need for a more flexible solution: “One of the primary reasons we chose Cortex is that the platform assumes foundational data modeling is highly personal to every organization. Other vendors were either very opinionated about how to measure a SaaS engineering team, and rigid in their architecture as a result, or just required an immense amount of rebuilding. We wanted something that allowed us to codify our culture, our values, and our business requirements out of the box.”

Automation to ensure alignment while reducing noise

Once the team was confident Cortex could act as a true system of record for engineering, the next hurdle was to assess its ability to drive action against initiatives. Shaun knew that when it comes to driving expectations, success has nothing to do with where you sit in the organization. “You could be an executive making a request via Slack, but 90% of people won't be able to act on that before it scrolls away. You really need constant, targeted communication.”

He continues, “Cortex really stood out because not only can we define initiatives with data from across our stack, we can trust that data is automatically assessed for changes. It’s really helped us normalize a culture of ownership without resorting to constant emails.”

A look at BigCommerce’s customer-centric scorecards

Cortex has helped BigCommerce bring standards of quality to life for their team. But when it came to determining exactly which quality metrics the team would prioritize, BigCommerce took a very customer-centric approach. Shaun explains, “We prioritize Scorecards and rules focused on customers. If it’s a problem for our merchants, something stopping them from handling a routine transaction or communicating with their customers, it’s a problem for us.”

Domain scorecards

BigCommerce categorizes scorecards in two major groups. Those focused on mission-critical customer support (quality, security, and incident management) fall into the “Domain” grouping. These scorecards demonstrate a conscious decision to focus on certain metrics, and not others. In practice, that means that while the team tracks the number of incidents, they weight “time to resolve” more heavily. Shaun explains, “Our customers know that no SaaS service is perfect, but they choose us, and stay with us, because they know issues are resolved quickly. And our Scorecards reflect that prioritization.”

BigCommerce tracks "follow ups," which are action items created during incidents or in retrospectives that the teams have identified as priority.

These action items are prioritized after each incident so that the same failure doesn't happen again. This is reflected in their Incident scorecards.
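The prioritization described above, where time to resolve counts for more than the raw number of incidents, can be pictured as a weighted scorecard. The sketch below is illustrative only: the rule names, the weights, and the `scorecard_score` function are assumptions made for this example, not BigCommerce's actual rules or Cortex's scoring API.

```python
# Hypothetical rules and weights, chosen only to show that
# "time to resolve" can outweigh "incident count".
RULES = [
    ("incident_count_ok", 1),
    ("time_to_resolve_ok", 3),
    ("follow_ups_closed", 2),
]


def scorecard_score(results):
    """Return the weighted percentage of rules a service passes."""
    total = sum(weight for _, weight in RULES)
    # A rule missing from `results` is treated as failing.
    earned = sum(weight for name, weight in RULES if results.get(name, False))
    return round(100 * earned / total, 1)


# A service with many incidents that resolves them quickly...
fast_resolver = {
    "incident_count_ok": False,
    "time_to_resolve_ok": True,
    "follow_ups_closed": True,
}
# ...versus one with few incidents that is slow to resolve them.
few_incidents = {
    "incident_count_ok": True,
    "time_to_resolve_ok": False,
    "follow_ups_closed": True,
}

print(scorecard_score(fast_resolver))  # 83.3
print(scorecard_score(few_incidents))  # 50.0
```

With these made-up weights, the service that resolves issues quickly scores higher even though it has more incidents, which mirrors the trade-off Shaun describes.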

Service scorecards

BigCommerce also has a “second tier” of Scorecards that are more inward looking. These were built to continuously raise the bar for operational maturity, observability, and performance. Shaun adds, “The Service-level scorecards are how we drive incremental improvement. They help us answer, ‘What’s slowing us down?’ and ‘Where can we be more efficient?’” These second-tier scorecards are also huge drivers of interoperability between systems. Keeping services built to standards decreases friction in their service-oriented architectures, and lowers on-ramp costs when engineers move their development work streams between services.

Measuring the impact of Cortex

BigCommerce’s first use case for Cortex revolved around persistent visibility into service quality and ownership. This need was most apparent in the frequent questions Shaun heard regarding software ownership and dependencies.

“Just understanding who owned what, or where we were at on any given initiative, was a pretty tedious exercise. With Cortex, we save tons of time each week just having automated discovery of ownership and dependencies in services. Just knowing who’s on call and what is connected to a service has also helped us more quickly resolve incidents and lower development time.”

Importantly, the team wanted to ensure that learnings from incidents would be programmatically captured to prevent similar situations in the future. Shaun continues, “Scorecards and follow-up actions have been invaluable for helping us add more rigor to the time after an incident. Because the reality is that anyone can help fix an issue, but it’s more important to do the hard work to ensure it doesn’t happen again. Cortex has been by far our best tool for maintaining this level of accountability.”

But programmatizing this work has been equally beneficial for the developers themselves. With self-serve templates for building to new best practices, plus targeted notifications, developers can stay in the flow. “We know if an engineer gets pulled out of what they’re doing, it takes them 30 minutes to re-engage. Cortex lets us reduce noise and keep our team focused on the highest priority work,” said Shaun.

Advice to new program owners

For those looking at Internal Developer Portals, Shaun has three key tips:

Executive alignment

While BigCommerce doesn’t have a top-down framework, executive tool fatigue was still top of mind for Shaun. “I really wanted our leaders across the organization to understand this wasn’t just another tool—it would be in service of getting more value out of existing tools, which would in turn improve our output, and customer satisfaction.”

The PoV roadshow

As part of their Proof of Value, BigCommerce and Cortex teamed up to build a Terraform provider to auto-populate their catalog in minutes, effectively onboarding the team’s projects into the portal. This made it easy for Shaun to really bring the value to life. “I talked to our Engineering leaders, and honestly, it wasn’t hard to sell. To have something that could codify and auto-assess alignment to standards WE created, rather than some other company, was a slam dunk.”

Easing into dev adoption

Shaun didn’t want Cortex to feel like a sudden shift in developer workflow, especially as the team was already working hard to meet existing standards. The flexibility in the portal lets engineering teams onboard in a way that makes sense for their unique culture. “We actually kept notifications off for the first month, so we could slowly introduce the concept, and folks could organically venture in when searching for information. This helped us position it as an asset first, rather than an ask.”

For more information on how Cortex can help you drive alignment to standards of engineering excellence, visit our website, or take a tour today.
