Searching for quality and speed? Observability can help | by Global Technology | McDonald’s Technical Blog | Jan, 2023

A new observability platform is helping Global Technology optimize its CI/CD process and deliver quality code quickly and easily.

by Carol Glennon, Director, Technology & Infrastructure and Sam Fishman, Project Manager, ADLC Analytics

Consumers have elevated expectations for their digital experiences. McDonald’s customers and crew are no different, and customers may even abandon their online carts if the checkout process is too difficult.

It is critical to our customers and crew that our Engineering teams can deliver the code that makes their experiences fast, easy, and performant every time. A fully optimized CI/CD process with advanced automation, early anomaly detection, and quality tooling is essential to doing so while ensuring our systems remain resilient and highly scalable.

As a large engineering organization working on a range of codebases for digital ordering, loyalty, and point-of-sale systems, McDonald’s Engineering team needed observability to quickly pinpoint issues during CI/CD phases without having to undertake hours of manual investigation. The teams also needed single-pane-of-glass views into the quality of builds that span several types of test automation frameworks, steps in the quality-check process, and sometimes disparate CI/CD tooling. Ease of use was another key callout from teams. We knew the observability platform had to be easy to learn, and it had to be simple to add new views or double-click into an issue’s origin. These qualities were essential so that new team members could quickly get up to speed and would not encounter added friction when under pressure to troubleshoot an issue.

The teams evaluated a range of observability options that supported the single-pane-of-glass state, as well as the ability to double-click into areas of interest and conduct root-cause work when things were not going well. Alerts for various thresholds or issues in the build process, displayed in that same observability tool, were a wish-list item that teams identified. The discovery process involved interviewing members of the Engineering and DevOps teams, as well as stakeholders in other areas of the business. We asked teams to whiteboard their ideal state, so we could match the wish lists with various types of observability approaches.

The team found that Datadog’s unique CI/CD and test visibility solutions brought production-level observability into McDonald’s pre-prod environments for the first time in the organization’s history.

Over the past six months, McDonald’s DevOps and Global Engineering teams have successfully integrated back-end pipelines across various CI tools into one unified Datadog view. This is helping McDonald’s tech teams become more efficient by automatically tracking key performance metrics (i.e., job duration, failure rate, success rate, error rate, throughput, queue time) in areas that previously required manual and inefficient work to monitor and troubleshoot. Global Engineering is now correlating pipeline and test errors with their associated log and cloud infrastructure data to make developers more efficient, reduce overall defect leakage into production, and create actionable insights.
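To make the metrics listed above concrete, here is a minimal Python sketch of how several of them could be derived from raw job records. It is an illustration only and does not represent McDonald’s or Datadog’s implementation: the `JobRun` record, its field names, the `summarize` helper, and the sample data are all hypothetical. Error rate is left out because the distinction between a failed job and an errored job is not defined here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRun:
    """One CI job execution. All field names are hypothetical."""
    queued_at: datetime
    started_at: datetime
    finished_at: datetime
    succeeded: bool


def summarize(runs: list[JobRun], window: timedelta) -> dict[str, float]:
    """Aggregate pipeline health metrics for runs within a reporting window."""
    if not runs:
        raise ValueError("need at least one job run")
    total = len(runs)
    passed = sum(1 for r in runs if r.succeeded)
    # Job duration: time spent executing, excluding time spent waiting.
    durations = [(r.finished_at - r.started_at).total_seconds() for r in runs]
    # Queue time: time between the job being queued and a runner picking it up.
    queue_times = [(r.started_at - r.queued_at).total_seconds() for r in runs]
    return {
        "success_rate": passed / total,
        "failure_rate": (total - passed) / total,
        "avg_duration_s": sum(durations) / total,
        "avg_queue_time_s": sum(queue_times) / total,
        "throughput_per_hour": total / (window.total_seconds() / 3600),
    }


def _run(t0: datetime, offset_min: int, queue_s: int, duration_s: int, ok: bool) -> JobRun:
    """Build a sample run from offsets (hypothetical helper for the demo data)."""
    queued = t0 + timedelta(minutes=offset_min)
    started = queued + timedelta(seconds=queue_s)
    return JobRun(queued, started, started + timedelta(seconds=duration_s), ok)


t0 = datetime(2023, 1, 1, 9, 0, 0)
sample_runs = [
    _run(t0, 0, 30, 300, True),
    _run(t0, 10, 60, 240, True),
    _run(t0, 20, 30, 360, False),
    _run(t0, 30, 60, 300, True),
]
metrics = summarize(sample_runs, window=timedelta(hours=2))
print(metrics)
```

In a unified view, numbers like these would be computed per pipeline and per CI tool, so that a jump in queue time or failure rate stands out without manual log inspection.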

A rapid implementation path allowed engineers to start using the tools immediately and then expand on dashboards or other observability features incrementally, as needed. The various dashboards and views are suited to a range of users and skill levels. For example, business stakeholders can easily see key metrics indicating the quality of an upcoming release as it proceeds through the testing process. Release engineers can see multiple CI tool readouts from a single view, and engineers can rapidly double-click into parts of that same view to see underlying issues with a build.

Unified observability provides teams with a common platform to share information about the pre-production development process and the systems that support it. This common platform has sparked conversations about workstreams and areas for improvement that help us deliver on our mission of providing high-quality, performant digital experiences to our customers and crew.

Going forward, our teams will likely continue to use observability as a key part of our process and tooling, both to uncover new opportunities in our CI/CD flows and to empower teams to find solutions together faster and more easily than ever before.
