Impact: Enables our client to manage over 100,000 agents efficiently, with 2.3 GB of in-memory data and a monthly storage cost under $20. This work was developed into Grafana Alloy.
Role: I conducted in-depth research and led the design effort for this initiative, which was presented at a company-wide event.
A few of the 100+ companies that run on Grafana
Managing many Grafana Agents is difficult because each one needs its own setup and monitoring. This makes deployments hard to scale, for individual users and large companies alike.
How it currently works
Grafana Agent is a tool that runs on different computers or devices, collecting information about how they are performing and sending it to a central place.
In the current setup, agents collect data from devices and send it to a central location like OSS LGTM or Grafana Cloud for monitoring. However, to make changes or fix issues, you need to do so individually for each device.

How can organizations efficiently and cost-effectively monitor the real-time state of over 100,000 Grafana Agents while minimizing storage costs?
13 Grafana users
1 B2B Client
To understand the needs of both individual users and companies using Grafana Agent, we conducted 60-minute one-on-one interviews with 13 Grafana users. We also asked one of our partnering clients about the challenges they face.
3 Days to design
Within a tight one-week deadline, my team and I designed and implemented the agent page. I used Grafana's established design system to keep the design consistent throughout. Feedback played a crucial role, beginning with insights from experienced UX designers.

What are the details, changes, and improvements with the new design?
On-page configuration
Allows users to copy arguments and use a direct action button to quickly access and configure the agent's source file, streamlining the process and ensuring consistency.
Categorized agent status
Centralized view of all agents and their components, with filtering, search capabilities, and a hierarchical presentation of information, making it easier to handle and scale large deployments efficiently.
Filters for scalability
Agents are categorized by health status, and "healthy" and "unhealthy" filter buttons let developers focus on the agents that are truly unhealthy and need resolution. This supports large-scale deployments and saves cost.
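To make the filtering behavior concrete, here is a minimal sketch of how grouping by health status and combining a status filter with a name search could work. It is an illustration only, not the production implementation: the `Agent` record, its fields, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical record; these fields are illustrative, not Grafana's schema.
    name: str
    healthy: bool


def group_by_health(agents):
    """Bucket agents under 'healthy' and 'unhealthy' keys."""
    groups = {"healthy": [], "unhealthy": []}
    for agent in agents:
        groups["healthy" if agent.healthy else "unhealthy"].append(agent)
    return groups


def filter_agents(agents, status=None, query=""):
    """Return agents matching an optional status filter and a name search."""
    query = query.lower()
    return [
        a for a in agents
        if (status is None or a.healthy == (status == "healthy"))
        and query in a.name.lower()
    ]


# Example: show only unhealthy agents whose name contains "edge".
agents = [Agent("edge-01", True), Agent("edge-02", False), Agent("core-01", False)]
unhealthy_edge = filter_agents(agents, status="unhealthy", query="edge")
```

Keeping the status filter optional means the same function can serve both the unfiltered view and the "healthy" and "unhealthy" buttons.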
Added documentation
Documentation is now integrated directly into the agent page, providing immediate access to necessary information and reducing user confusion.
Our solution, integrated into Grafana Alloy, enables efficient management of over 100,000 agents at minimal cost, handling 2.3 GB of in-memory data and keeping monthly storage expenses under $20. It also saves 12 hours per issue or downtime, ensuring smooth operations.
Effective Cross-Functional Communication
Recognizing the importance of clear and efficient communication when collaborating with diverse teams, ensuring everyone is aligned and informed.
Storytelling for Impact
Mastering the skill of storytelling to convey ideas in a compelling and engaging manner, igniting interest and buy-in from others.
Seeking Help
Recognizing the importance of asking the right questions when joining a team with limited engineering background, ensuring a quick understanding of the context and scope to ramp up productivity efficiently.