AI/ML Execution Monitoring

Building a scalable monitoring solution for Capital One's ML/AI applications, processes and tools throughout the enterprise.

Context

Engineers need to know what's going on with their models and processes as they run (execute) to ensure they're meeting SLA and business obligations.

Execution monitoring is just that -- it helps modelers react when things go wrong and understand their systems better, in order to troubleshoot and make proactive improvements.

This type of monitoring required building an in-house solution that could scale to other ML applications, such as model scoring and system status alerting.

Impact

Defined scope from business needs, led user research, and developed the UX/UI design for an internal AI/ML monitoring tool. Collaborated with product partners to deliver a pilot design to 300+ model developers.

Strategically defined the experience of multiple features, like AI anomaly detection and automated notifications, for future releases.

Role

Lead Designer

Team

1 PM and 4 Tech

Time

6 Months

Discover

Objective  

Research and develop a UX/UI solution that lets users view all services and jobs running on Capital One's machine learning platform, see information about their status, and create dashboard views and alerts at different scopes.

Contribution

Translated business needs into a project brief, built a moderator/user guide, recruited model developers, and documented existing constraints.

Result

Aligned on project scope with partners, kicked off research, and established design objectives.


Define

Objective

Synthesize the collected interviews into a prioritized set of recommendations and a prototype.

Contribution  

Moderated and led 10 interviews, distilled the information into a research report, documented findings, and provided strategic recommendations.

Result

The research report informed prioritization conversations with partners.


Design

Objective

Create a scalable design and back-end system that allows for the incremental rollout of features.

Contribution  

Built an object-oriented UX (OOUX) framework and approach to collaborate with product and tech partners. Delivered multiple data requirements and a hierarchy of components.

Result

Prioritized a set of features, such as anomaly detection, email notifications, and a rules engine, along with the technical data requirements needed to support them.


Synthesize

Objective

Test the OOUX framework as a low-fidelity design and gather feedback from users on the data elements and information architecture.

Contribution 

Led 5 workshop sessions with model developers to gather feedback on the early prototype design, then synthesized the findings into a refined visual design.

Result

Developed a clear understanding of the UX/UI design and the necessary data elements.


Deliver

Objective

Refine the design, based on user feedback and design expertise, into a QA pilot.

Contribution

Built a high-fidelity, click-through prototype of the experience in Figma. Created a handoff document. Worked with product and tech to prioritize features based on data feasibility.

Result

Limited release of a central dashboard for execution monitoring of compute jobs to 300+ model developers.
