Yes, I've named all my design files something clever. No, I will not be sharing.

Simplifying Grafana's agent management through centralized monitoring

Impact: Enables our client to efficiently manage over 100,000 agents with 2.3 GB of in-memory data and a monthly storage cost under $20. Developed into Grafana Alloy.

Role: I conducted in-depth research and led the design efforts for this initiative, which were presented at a company-wide event.

Contribution

Research, Prototyping, Visual Research

Team

Melody Yu, Adrienne C., Robert Fratto, Erik Baronowski, Kristina Durivage, Paulin Todev

Timeline

June 2023 - August 2023 (3 months)

A few of the 100+ companies that run on Grafana

Overview

Managing many Grafana Agents is difficult because each one needs its own setup and monitoring. This makes deployments hard to scale, for both individual users and large companies.

How it currently works

Grafana Agent is a tool that runs on different computers or devices, collecting information about how they're performing and sending it to a central place.

In the current setup, agents collect data from devices and send it to a central location like OSS LGTM or Grafana Cloud for monitoring. However, to make changes or fix issues, you need to do so individually for each device.

The Problem

How can organizations efficiently and cost-effectively monitor the real-time state of over 100,000 Grafana Agents while minimizing storage costs?

13 Grafana users
1 B2B Client

To understand the needs of both individual users and companies using Grafana Agent, we conducted 60-minute one-on-one interviews with 13 Grafana users. We also asked one of our partnering clients about the challenges they face.

3 Days to design

Within a tight one-week deadline, my team and I designed and implemented the agent page. I used Grafana's established design system to keep the design consistent throughout. Feedback played a crucial role, beginning with insights from experienced UX designers.

Solution Phase

What are the details, changes, and improvements with the new design?

On-page configuration

Allows users to copy arguments and use a direct action button to quickly access and configure the agent's source file, streamlining the process and ensuring consistency.

Categorized agent status

Centralized view of all agents and their components, with filtering, search capabilities, and a hierarchical presentation of information, making it easier to handle and scale large deployments efficiently.

Filters for scalability

Categorizing agents by health status, along with "healthy" and "unhealthy" filter buttons for developers, prioritizes truly unhealthy agents for resolution, aiding large-scale deployments and cost savings.
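As a rough illustration of the filtering idea, here is a minimal sketch of how a list of agents could be grouped and filtered by health status. It is not the actual Grafana implementation: the Agent and Health types, the agent names, and the helper functions are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    # Hypothetical status values mirroring the "healthy" / "unhealthy" filter buttons.
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"


@dataclass
class Agent:
    # Hypothetical record for one agent shown on the page.
    name: str
    health: Health


def filter_by_health(agents, health):
    """Return only the agents whose status matches the selected filter."""
    return [agent for agent in agents if agent.health is health]


def count_by_health(agents):
    """Count agents per status, e.g. to label the filter buttons."""
    counts = {health: 0 for health in Health}
    for agent in agents:
        counts[agent.health] += 1
    return counts


# Made-up sample data, for illustration only.
agents = [
    Agent("agent-a", Health.HEALTHY),
    Agent("agent-b", Health.UNHEALTHY),
    Agent("agent-c", Health.HEALTHY),
]

unhealthy = filter_by_health(agents, Health.UNHEALTHY)
counts = count_by_health(agents)
```

Selecting the "unhealthy" filter in this sketch leaves only the agents that need attention, which is the behavior the filter buttons are meant to give developers working with large deployments.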

Added documentation

Documentation is now integrated directly into the agent page, providing immediate access to necessary information and reducing user confusion.

12 hours saved per issue

Our solution, now integrated into Grafana Alloy, enables efficient management of over 100,000 agents at minimal cost, handling 2.3 GB of in-memory data and keeping monthly storage expenses under $20. It also saves 12 hours per issue or downtime incident.

Learnings

Effective Cross-Functional Communication

  • Recognizing the importance of clear and efficient communication when collaborating with diverse teams, ensuring everyone is aligned and informed.

Storytelling for Impact

  • Mastering the skill of storytelling to convey ideas in a compelling and engaging manner, igniting interest and buy-in from others.

Seeking Help

  • Recognizing the importance of asking the right questions when joining a team with limited engineering background, ensuring a quick understanding of the context and scope to ramp up productivity efficiently.