If you’ve ever attempted to cultivate a backyard vegetable garden, flower garden for your balcony, hydroponic tower, or other system to grow plants, you’ll understand that it’s not a matter of tossing down some seeds in soil and letting nature take its course. Without proper monitoring and management, ecosystems may grow out of control, or simply wither and die off. And although there are plenty of edible weeds in the wild, the majority of us would likely prefer well-curated tomatoes and cucumbers. Any great ecosystem, be it a garden or an internal developer platform, requires continuous care and feeding to flourish.
Tending the garden
Nipping problems in the bud (no pun intended) requires vigilant monitoring and crop management. By constantly monitoring our gardens, we can identify and address anomalies quickly, before an invasive species or insect infestation spirals out of control and decimates the harvest. With monitoring tools and processes that enable fast feedback loops, a good gardener or engineer can cultivate a well-functioning system that is tuned to produce the best possible end product, whether that is strawberries or software.
Just as a gardener is responsible for tending their greenhouse ecosystem, platform engineers are critical for maintaining and managing a balanced application ecosystem. They need to monitor metrics such as throughput, response time, and latency to identify potential issues and make decisions on component optimization and resource allocation. Monitoring also allows them to detect and address performance bottlenecks and to understand resource utilization patterns. Where gardeners tend their plant beds with weeding, watering, pruning, and pest control, platform engineers manage the planning, scheduling, budgeting, and security of internal developer platform components.
Policy enforcement and governance are a crucial part of the application development and delivery process. Keeping wildlife like rabbits and deer out of your garden is important in maintaining a safe place for plants to grow. Platform engineers keep their ecosystems safe by making sure that applications and services are delivered optimally in a secure and stable way, and that resources are allocated efficiently. They must also maintain device and software updates, along with secure configurations. Any changes should be documented and monitored to ensure they are kept up to date, and that any undesirable changes are identified quickly. Platform teams and engineers must find the delicate balance between enforcement and innovation, monitoring and balancing enterprise security standards, regulatory requirements, and developer experience to ensure they can deliver great software more quickly and securely.
Five benefits of monitoring and managing your internal developer platform
- Improved system performance – By monitoring for bottlenecks, spotting trends, and identifying potential issues well before they impact performance, engineering teams can implement proactive strategies to improve availability, reliability, scalability, and response times. Regular maintenance, such as patching and upgrading hardware and applications, also contributes to greater efficiency and a better user experience. All of this puts organizations in a better position to reduce system disruptions, anticipate future needs and, ultimately, improve profitability.
- Cost reduction – Optimizing and managing an application platform properly offers organizations several ways to cut costs. Keeping an eye on the performance and use of the platform can reveal areas for optimization, leading to better resource utilization and lower hardware and license fees. Proactively tracking application health can surface prospective issues quickly, avoiding the costly consequences of service outages and downtime. Moreover, thorough monitoring and oversight allow IT teams to see where resources can be redistributed to reduce spend.
- Scalability – Through monitoring, platform engineers can scale more efficiently by using data to replicate or adjust resources according to performance and capacity demands. This helps streamline manual scaling and compliance management processes, and provides in-depth insight into application performance, resource utilization, and user experience across multiple servers and services.
- Enhanced security – Security is a primary concern in platform engineering. By providing guardrails for developers through golden paths and by tracking application activity, engineers can detect malicious or unauthorized behavior, making it easier to respond quickly and mitigate potential damage. Monitoring can also provide insights into which security measures may need to be implemented or improved.
- Improved feedback loops – Developers can keep their applications running at their best and continually monitor user engagement by leveraging feedback loops that collect usage data on performance in production. With this centralized control in hand, engineers can make modifications to their apps with speed and efficiency, and then see the results instantly. Automated alerting and notifications will also help engineers quickly detect and repair any issues that may arise.
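To make the scalability point concrete, here is a minimal sketch of how monitoring data can drive a scaling decision. It assumes a simple proportional rule (the same shape as the formula documented for the Kubernetes Horizontal Pod Autoscaler); the function name, the CPU figures, and the replica limits are illustrative assumptions, not part of any VMware product.

```python
import math


def desired_replicas(
    current_replicas: int,
    observed_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Return a replica count that moves the observed metric toward its target.

    Proportional rule: desired = ceil(current * observed / target),
    clamped to the range [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * observed_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical readings: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, observed_metric=90.0, target_metric=60.0))
```

The clamp matters in practice: without an upper bound, a single bad reading could request far more capacity than the budget allows.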
Getting started: planting the seeds of success
Here are some quick ways to start monitoring and managing your application platform for more optimal results:
- Secure your deployment environment – Start by securing the server and network configuration of the application platform and make sure the system is updated with the latest security patches. You can also implement security practices such as controlling access to the platform, monitoring for suspicious activity, and staying up to date on threats and vulnerabilities.
- Establish system baselines – After assessing your platform’s system components, establish baseline measurements of hardware and software configurations, as well as usage statistics. Next, define a baseline of acceptable performance metrics, which could include average uptime, system response times, network throughput, or any other relevant metrics. Then tune the system to meet the defined baseline by adjusting system configurations, fine-tuning hardware, adding or removing resources, or optimizing code.
- Establish alerting rules – Identify the events that you want to trigger an alert, and determine the best way to receive them. Use the platform-specific tools available in your software to create the alerting rules. Depending on the platform, you may need to define the conditions for when the alert will be triggered (such as a threshold for a certain metric, or an event name or type). Monitor these new rules for accuracy, and adjust as needed.
- Monitor application performance – Within the application architecture, identify the components, how they are connected, and their overall flow. Monitor the application layer and capture performance metrics, such as request latency, errors, and response times. Configure the monitoring system to alert you when certain thresholds are breached.
- Automate processes – Take time to explore monitoring and management automation solutions: cloud-based services, open source tools, and enterprise-level products are all viable options. After finding the best fit for your needs, begin setting up your monitoring and management system by configuring the monitoring agents, collecting data, storing logs, and establishing alerts. Once you have a basic system in place, review the performance of your automation periodically to ensure accurate data and efficient operation.
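The baseline and alerting steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a replacement for a real monitoring tool: the `Baseline` and `AlertRule` types, the 1.5x tolerance, and the latency samples are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass(frozen=True)
class Baseline:
    """Summary of a metric observed during normal operation."""
    average: float
    p95: float


@dataclass(frozen=True)
class AlertRule:
    """Fires when a reading for `metric` exceeds `threshold`."""
    metric: str
    threshold: float

    def breached(self, value: float) -> bool:
        return value > self.threshold


def build_baseline(samples: list[float]) -> Baseline:
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(samples, n=20, method="inclusive")[-1]
    return Baseline(average=mean(samples), p95=p95)


def rule_from_baseline(metric: str, baseline: Baseline, tolerance: float = 1.5) -> AlertRule:
    # Assumed policy: alert when a reading exceeds the baseline p95 by `tolerance` times.
    return AlertRule(metric=metric, threshold=baseline.p95 * tolerance)


# Made-up request latencies (milliseconds) gathered during normal operation.
normal_latency_ms = [100.0, 110.0, 95.0, 105.0, 120.0, 98.0, 102.0, 115.0, 108.0, 99.0]
baseline = build_baseline(normal_latency_ms)
rule = rule_from_baseline("request_latency_ms", baseline)

for reading in (112.0, 400.0):
    status = "ALERT" if rule.breached(reading) else "ok"
    print(f"{rule.metric}={reading} -> {status}")
```

Deriving the threshold from the measured baseline, rather than hard-coding it, keeps the rule easy to revisit when you adjust it for accuracy later.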
Tools for tending your platform ecosystem
VMware offers solutions to help you monitor and manage your processes and optimize your platform. These solutions include VMware Tanzu Application Platform, VMware Tanzu for Kubernetes Operations, and VMware Aria. Together, these services enable platform engineers to quickly identify and address any issues, ensuring that their platform is secure, stable, and efficient.
Tanzu Application Platform helps platform engineers manage, monitor, and build secure software supply chains. It provides tools for deploying containerized applications, and simplifies and accelerates application development through its integrated developer portal, integrated development environment (IDE) plug-ins, and command line client. Tanzu Application Platform includes tools for monitoring and managing container-based applications and can schedule them to run as workloads in your Kubernetes clusters. It automates the deployment of applications, scans them for vulnerabilities, and audits and logs each of these activities. Tanzu Application Platform also integrates with VMware Tanzu Service Mesh to ensure secure connectivity and networking for applications, giving platform engineers full visibility into their applications and services so they can quickly identify and address any issues.
Tanzu for Kubernetes Operations simplifies container management at scale with tools, automation, and data-driven insights. These enable platform teams to boost developer productivity, secure applications and data, and optimize infrastructure and workload performance for any Kubernetes cluster, running on any cloud. It provides a robust policy engine to simplify multi-cloud, fleet-wide Kubernetes cluster management and brings consistent control over cluster, security, and data protection policies, while giving developers a choice regarding where workloads run. It also enables end-to-end encrypted connectivity with built-in service discovery, traffic routing, and load balancing to enhance application availability, resiliency, and scalability across multiple clusters and clouds. Platform engineers can easily apply policy, or view and understand the baseline health of attached or provisioned clusters running in any cloud.
As part of Tanzu for Kubernetes Operations, VMware Aria Operations for Apps provides continuous, in-depth observability across modern cloud applications. Combined with the other VMware Aria Operations offerings, it paints the full picture of a platform’s infrastructure performance, capacity requirements, network traffic, and dependencies, along with smart troubleshooting with logs. VMware Aria Automation brings a full workbench for setup and self-service configuration with governance and integrated security (through secure hosts and secure clouds). The brand new VMware Aria Guardrails provides preemptive control through templatized policy configuration and continuous drift monitoring, geared towards public clouds.
Finally, VMware Aria Cost helps businesses optimize and control their cloud spend and improve their FinOps practice across cloud environments. With Aria Cost, you can easily understand cloud usage and cost for custom groupings such as applications, Kubernetes clusters, engineering teams, or lines of business; optimize cost by eliminating zombie resources, rightsizing infrastructure, and making data-driven decisions around reservations and savings plans commitments; and control your environment at scale via policy-driven automation.
Monitoring and management are key components of platform engineering and should be addressed as part of a healthy platform ecosystem. They help platform engineering projects finish on time and within budget, and they save companies time and money down the line. In addition, they encourage collaboration among team members, improve the quality and accuracy of the final product, and facilitate proper documentation of the entire process.
More ways to learn
You can learn more about our vision for developer platforms by watching two recent PlatformCon talks: Michael Coté's "7 Lessons from 7 Years of Running Platforms" and Bryan Ross's "Platform-as-a-Superpower." You can also find out how VMware can help with your app and platform monitoring and management goals at tanzu.vmware.com. Last, check out the resources below for more information on monitoring and management: