[1] NoNameFile             2024

Simplifying Grafana's agent management through centralized monitoring

Impact: Enables our client to efficiently manage over 100,000 agents with 2.3 GB of in-memory data and a monthly storage cost under $20. Developed into Grafana Alloy.

Role: I conducted in-depth research and led the design effort for this project during Grafana's company-wide hackathon; the work was presented at a company-wide event.

Contribution

Research, Prototyping, Visual Research

Team

Melody Yu, Adrienne C., Robert Fratto, Erik Baronowski, Kristina Durivage, Paulin Todev

Timeline

June 2023 - August 2023 (3 months)

A few of the 100+ companies that run on Grafana

Context

What are agents?

Grafana Agent is a tool that works on different computers or devices, collecting information about how they're doing and sending it to a central place.

In the current setup, agents collect data from devices and send it to a central location like OSS LGTM or Grafana Cloud for monitoring. However, to make changes or fix issues, you need to do so individually for each device.

Problem

How can organizations efficiently and cost-effectively monitor the real-time state of over 100,000 Grafana Agents while minimizing storage costs?

To understand the needs of both individual users and companies that use Grafana Agent, we conducted 60-minute one-on-one interviews with 13 Grafana users. We also asked one of our partnering clients about the challenges they face.

The Challenge

3 Days to Design!!

Within a tight one-week deadline, my team and I undertook the task of designing and implementing the agent page. Leveraging Grafana's established design system, I ensured consistency throughout the process. Valuable feedback played a crucial role, beginning with insights from experienced UX Designers.

Solution

…the Final Experience ⤵

What are the details, changes, and improvements with the new design?

On-page configuration

Allows users to copy arguments and use a direct action button to quickly access and configure the agent's source file, streamlining the process and ensuring consistency.

Categorized agent status

Centralized view of all agents and their components, with filtering, search capabilities, and a hierarchical presentation of information, making it easier to handle and scale large deployments efficiently.

Filters for scalability

Categorizing agents by health status, along with "healthy" and "unhealthy" filter buttons for developers, prioritizes truly unhealthy agents for resolution, aiding large-scale deployments and cost savings.
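To make the filtering idea concrete, here is a minimal Python sketch of categorizing agents by health status and narrowing the list by filter and search. It illustrates the concept only and is not the code behind the agent page; the `Agent` record, its `name` and `healthy` fields, and the `filter_agents` and `summarize` helpers are all hypothetical names chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Agent:
    """Hypothetical record for one agent shown on the page."""
    name: str      # assumed display name of the agent
    healthy: bool  # assumed health flag reported by the agent


def filter_agents(agents: Iterable[Agent],
                  healthy: Optional[bool] = None,
                  search: str = "") -> List[Agent]:
    """Keep agents matching the health filter and a case-insensitive name search.

    healthy=None means no health filter is applied.
    """
    needle = search.lower()
    return [
        a for a in agents
        if (healthy is None or a.healthy == healthy)
        and needle in a.name.lower()
    ]


def summarize(agents: Iterable[Agent]) -> Dict[str, int]:
    """Count agents per health category, as a filter button might display."""
    counts = Counter("healthy" if a.healthy else "unhealthy" for a in agents)
    return {"healthy": counts["healthy"], "unhealthy": counts["unhealthy"]}


# Made-up sample data, for illustration only.
agents = [
    Agent("edge-eu-1", True),
    Agent("edge-eu-2", False),
    Agent("edge-us-1", True),
]

# The "unhealthy" filter surfaces only the agents that need attention.
unhealthy = filter_agents(agents, healthy=False)
```

With the "unhealthy" filter applied, a developer sees only the agents that need attention, which is what keeps a list of thousands of agents workable.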

Added documentation

Documentation is now integrated directly into the agent page, providing immediate access to necessary information and reducing user confusion.

Reflections

Things I've learnt from this project.

Looking back on my summer at Grafana, I've gotta say, it was a wild ride that taught me way more than I expected. Three major takeaways stick out in my mind.

Cross-functional communication skills: Working with diverse teams taught me the critical importance of clear, efficient communication. I learned to adapt my communication style to ensure everyone, from engineers to product managers, was aligned and informed. This skill became essential in driving projects forward and fostering collaboration.

Good storytelling creates big impact: I discovered the power of storytelling in design presentations. Beyond showing mockups, I learned to craft compelling narratives that engaged stakeholders and secured buy-in for new ideas. This approach transformed how I present designs, making concepts more relatable and exciting for diverse audiences.

Embrace the learning curve: Joining the team with a limited engineering background of my own highlighted the importance of asking the right questions. I learned that seeking help is a strength, not a weakness. By efficiently gathering context and understanding the project scope, I was able to ramp up quickly and contribute effectively despite my initial knowledge gaps.