NetApp's SRE team has expanded their monitoring scope to include internal AI developer tools, which bring new operational challenges on top of monitoring their existing critical infrastructure.
Lead SRE Dustin Sorge will demonstrate how the team has adapted their eight-year-old InfluxDB implementation to handle these resource-intensive workloads. He will cover the specific monitoring strategies they've developed for AI tools and the optimization techniques that enable faster trend detection and coordinated incident response across their hybrid environment.