Ready for anything with the PagerDuty Operations Cloud
Article by PagerDuty VP product Dormain Drewitz.
In a world of digital everything, teams face increasing complexity. Ever-growing dependencies across systems and processes put customer and employee experience, not to mention revenue, at risk.
To be ready for anything amid this growing digital complexity and dependency, operations must transform from manual, rigid, and ticket queue-based work into a continuously improving system: one that keeps the focus on customer experience, delivers both operational speed and resilience, and is heavily automated and augmented by machine learning and AI. Only then can teams move toward a more proactive posture that reduces manual toil, avoids burnout, and preserves focus.
PagerDuty’s mission is to revolutionise operations so that teams spend less time on reactive, break-fix work, and more time delivering new innovation while meeting their desired resiliency objectives.
At Summit 2022, we announced several updates to the PagerDuty Operations Cloud that move teams towards this vision.
PagerDuty announced new automation capabilities for creating custom incident response workflows and focusing responders on the issues that matter, along with more ways to reduce MTTR through automated diagnostics and remediation.
Incident Workflows: First, we’re excited to share that PagerDuty is integrating the powerful Catalytic workflow engine to support customisable incident workflows. With Incident Workflows, you define the workflow logic that assembles a sequence of common incident actions, such as adding a responder, subscribing stakeholders, or starting a conference bridge, into an orchestrated response.
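To make the idea of a workflow as an ordered sequence of incident actions concrete, here is a minimal Python sketch. It is not PagerDuty's API or the Catalytic engine: the `Incident` class, the three action factories, and `run_workflow` are hypothetical names invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Incident:
    """Hypothetical in-memory incident; not a PagerDuty object."""
    title: str
    responders: List[str] = field(default_factory=list)
    subscribers: List[str] = field(default_factory=list)
    bridge_url: str = ""


# An action is any function that updates the incident in place.
Action = Callable[[Incident], None]


def add_responder(name: str) -> Action:
    def action(incident: Incident) -> None:
        incident.responders.append(name)
    return action


def subscribe_stakeholder(name: str) -> Action:
    def action(incident: Incident) -> None:
        incident.subscribers.append(name)
    return action


def start_conference_bridge(url: str) -> Action:
    def action(incident: Incident) -> None:
        incident.bridge_url = url
    return action


def run_workflow(incident: Incident, actions: List[Action]) -> Incident:
    """Apply each action in order and return the updated incident."""
    for action in actions:
        action(incident)
    return incident


# Illustrative workflow definition (all values are made up).
workflow = [
    add_responder("db-oncall"),
    subscribe_stakeholder("support-lead"),
    start_conference_bridge("https://example.com/bridge/123"),
]
incident = run_workflow(Incident("Checkout latency"), workflow)
```

Modelling each step as a small function keeps the workflow itself a plain list, so steps can be reordered or reused without touching the runner.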
Auto-Pause Incident Notifications: Responders can also use automation to avoid unnecessary disruption. With PagerDuty Event Intelligence, responders can suppress transient noise with Auto-Pause Incident Notifications. This feature applies machine learning to automatically detect and pause transient alerts that have historically resolved on their own. For customers who experience these types of alerts, this can make a big difference in helping teams stay more productive, or better yet, stay asleep when they don't need to be woken up by something that will self-heal. In just the first three months after release, Auto-Pause Incident Notifications paused more than 350,000 flapping alerts.
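The intuition behind pausing transient alerts can be sketched with a simple frequency heuristic. This is a stand-in for illustration only, not the machine learning model PagerDuty uses: the `should_pause` function, its thresholds, and the sample history are all assumptions.

```python
from typing import Dict, List, Optional

# Hypothetical history: for each alert key, how many seconds each past alert
# took to resolve on its own, or None if a human had to resolve it.
History = Dict[str, List[Optional[float]]]


def should_pause(
    alert_key: str,
    history: History,
    window_seconds: float = 300.0,  # assumed cutoff for "transient"
    min_samples: int = 5,           # assumed minimum history before pausing
    threshold: float = 0.8,         # assumed share of past alerts that self-resolved
) -> bool:
    """Return True if alerts with this key have usually self-resolved quickly."""
    past = history.get(alert_key, [])
    if len(past) < min_samples:
        return False  # not enough evidence, so notify as usual
    transient = sum(1 for d in past if d is not None and d <= window_seconds)
    return transient / len(past) >= threshold


# Made-up sample data.
history: History = {
    "disk-latency-spike": [40.0, 65.0, 30.0, 120.0, 55.0, None],
    "db-primary-down": [None, None, 900.0, None, None],
}

pause_spike = should_pause("disk-latency-spike", history)
pause_outage = should_pause("db-primary-down", history)
```

With this sample data, the latency spike qualifies for a pause because five of its six past occurrences cleared within the window, while the database outage does not.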
Automation Actions… everywhere: We’ve heard from our customers that they want to empower their first responders to diagnose and remediate common issues through automation. Last year, we introduced PagerDuty Automation Actions to securely invoke automation directly from PagerDuty. Now we’re making Automation Actions available through the PagerDuty mobile app, and within Slack. Responders can now resolve issues faster from wherever they respond. But why wait for a responder to run common diagnostics? Event orchestration users can now trigger these same automated diagnostics proactively even before your first responders acknowledge an incident, giving them the information they need to act faster.
Finally, we’ve extended this ability to provide automation to your customer service teams. Your customer service agents can now invoke automated validation tests from PagerDuty Customer Service Ops to determine if a customer’s issue is related to a system problem, and proactively receive information about known problems that could potentially be affecting their customer’s experience.
PagerDuty Runbook Automation: PagerDuty is making it easier for our customers to orchestrate automated diagnostics, remediations, and day-to-day operations through our recently launched SaaS offering PagerDuty Runbook Automation.
Connect everyone and everything
The number of different systems in the modern digital landscape has exploded. Keeping systems and teams synchronised is critical to having a complete picture of what’s happening and what needs attention. With APIs, webhooks, and over 650 integrations, the PagerDuty Operations Cloud integrates with your tech stack, breaking through silos so that teams can work better together and provide customers with a better experience.
Status Update Notification templates: PagerDuty has added new ways to keep internal stakeholders updated. Now you can standardise internal communications during incident response with HTML templates. For example, customise responses in a rich text editor and add images, screenshots, or graphs.
PagerDuty for Salesforce Service Cloud updates: Customer service agents are on the front lines during an incident and shouldn't be left in the dark. This is where PagerDuty Customer Service Operations comes in. With our deepening integration with Salesforce Service Cloud, PagerDuty for Customer Service Operations creates a real-time link between DevOps, ITOps and Customer Service teams with Salesforce Incident Objects and cases. This creates a single source of truth that brings all teams together.
CollabOps updates: CollabOps has become the way many teams work and communicate in real time. We've simplified our integration with Slack with a single connection management page.
Deliver speed and flexibility
When it comes to incident response, it often feels like nothing can happen fast enough. Time is often wasted swiveling between systems to get information, manually reviewing notifications, and digging through interfaces to find what you need. Helping teams focus lets them reach resolution faster.
Custom Fields on incidents: It starts with surfacing information where you need it most. Today, PagerDuty is announcing Custom Fields on incidents. Custom fields give responders access to critical contextual information from any surface, whether API, web, mobile app, or SMS. With more information pulled into custom fields, responders can triage and resolve issues faster.
PagerDuty mobile app updates: We've also redesigned the home screen of the PagerDuty mobile app. Responders' key priorities are now front and center, further accelerating incident resolution. With a single tap, responders can view details and take action. The carousel displays all the options responders need to understand an incident while on the go.
Terraform support for event orchestration: Event orchestration helps teams cut down on manual event processing by harnessing complex logic and rule nesting. We're seeing customers replace ten event rules with a single event orchestration, a 90% reduction in the number of rules to maintain. You can now configure orchestrations in Terraform to create, manage, and modify them at scale.
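Rule nesting is what lets one orchestration stand in for many flat rules. The Python sketch below illustrates the general idea only. It is not PagerDuty's rule format or its Terraform schema: the `Rule` class, the `evaluate` function, and the action strings are hypothetical, and the "first matching child wins" behaviour is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, str]


@dataclass
class Rule:
    """Hypothetical nested rule: a condition, its actions, and child rules."""
    name: str
    condition: Callable[[Event], bool]
    actions: List[str] = field(default_factory=list)
    children: List["Rule"] = field(default_factory=list)


def evaluate(rule: Rule, event: Event) -> List[str]:
    """Collect actions from this rule and from the first matching child at each level."""
    if not rule.condition(event):
        return []
    actions = list(rule.actions)
    for child in rule.children:
        if child.condition(event):
            actions.extend(evaluate(child, event))
            break  # assumption: the first matching child wins
    return actions


# One nested orchestration instead of a flat rule per service/severity pair.
orchestration = Rule(
    name="payments",
    condition=lambda e: e.get("service") == "payments",
    actions=["route:payments-team"],
    children=[
        Rule("critical", lambda e: e.get("severity") == "critical", ["priority:P1"]),
        Rule("default", lambda e: True, ["priority:P3"]),
    ],
)

critical_actions = evaluate(orchestration, {"service": "payments", "severity": "critical"})
warning_actions = evaluate(orchestration, {"service": "payments", "severity": "warning"})
```

Because shared routing lives on the parent rule, only the severity-specific behaviour has to be repeated in the children.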
An incident doesn’t end with resolution. Investing in culture, implementing best practices, and learning from what happened in previous incidents helps teams build resilience.
Service Standards: As customers mature their digital operations, they often want to standardise what ‘good’ looks like across teams. With Service Standards, teams can configure services according to best practices.
Next generation reports: Developing a more proactive operational posture starts with understanding how things are working today. From there, teams can identify opportunities for fine-tuning and improvement. Newly enhanced reports in PagerDuty help teams make better data-driven decisions.
First up: the Service Performance Report has new interactive visualisations, intuitive service drill-down capabilities, a new Response Effort metric that measures the engagement time needed to resolve an incident, and more filtering options to help prioritise your incidents.