The joint Pivotal-client team celebrates the first release to market, helping to serve the citizens of New South Wales, Australia.
The answer to the simple question “Which problem do we solve first?” often determines the success of any product. The response, however, is more complex—especially when we are talking about strangling an existing monolith. If we dial up the challenge further, with the pressures of the government sector and the innate complexity of the payments domain, then the answer to the original question can almost sound improbable.
At VMware Tanzu Labs Sydney (formerly Pivotal Labs), we found a way to address this problem methodically. The biggest differentiator among rewrite approaches is the shape of the return on investment (ROI) curve over time. The success of such an initiative is highly correlated with the speed at which ROI is realized: the earlier the sponsor sees a return, the longer the initiative can run toward its intended end state.
A real-life case study
Late in 2018, the Tanzu Labs Sydney team started an engagement with a government agency whose payments platform had been operational since 2015. It enabled the collection of payments over online, staff-assisted, kiosk, and IVR channels. In its first couple of years, the platform delivered net savings of $3 million per year. However, as the platform was scaled across the government, issues arose that turned it from a cost-saver into a cost center. Our discovery projected a net loss of over $10 million per year by 2022.
Our mission was to modernize the platform in a way that could both consolidate and transform the customer payments experience, while optimizing costs for the government. However, we identified problems across the legacy platform that made it unsuitable for achieving this vision.
The key issues were:
- Multiple cloned channel systems all sharing a common database.
- Performance issues when handling the increased load.
- Financial write-offs caused issues in the reconciliation of payments with the bank statements and requests.
- System outages resulting in the loss of more than 200,000 productive hours every year.
- High maintenance and enhancement costs.
- Tight coupling with the transaction systems, resulting in multiple cloned copies of the same payments system running in production.
- Lack of monitoring and alerting, resulting in delayed knowledge of any issue.
- Poor integration with different payment gateways, resulting in missed payments.
- Lack of a safety net, resulting in huge regression cycles.
- A high percentage of manual involvement in reconciliation.
Our strategy
After understanding the problems, the team decided to build a modular, scalable, and robust platform that improves agency onboarding, reduces the cost of change, improves operational efficiency, and provides analytics for continuous learning. The key metric to improve is the rate of straight-through processing, with the goal of reducing cost by reducing manual intervention.
Our approach
Key to strangling the monolith is coming up with a framework that will enable the team to prioritize the work. The framework should identify the independent modules that will generate the maximum return in the shortest duration with the least resistance to implement.
The first step is to carve out the modules, which are solutions to achieve an outcome and, in the process, address the problems discovered. Each module is a distinct business capability that can be enhanced, scaled, and deployed independently.
The second step is to build a prioritization process that will enable the team to pick the most impactful module to build. For this we defined four assessment criteria:
- Return on investment
- Cost of development
- Challenges to go live
- Learning potential/risks
Return on investment
The return on investment for each module is the difference between its benefits (costs saved or avoided, and any change in revenue) and its ongoing cost of ownership. When we calculated the value, we did a quick assessment of each module against the four criteria listed above. The module that scored highest on ROI and learning potential, and lowest on cost and resistance, was chosen first. We have continued to use this approach to prioritization throughout the initiative, and it continues to serve us well.
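The assessment described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual scoring model: the 1-to-5 scale, the equal weighting, the module names, and every score below are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleAssessment:
    """Quick assessment of one candidate module (hypothetical 1-5 scores)."""
    name: str
    roi: int                 # higher is better
    learning_potential: int  # higher is better
    development_cost: int    # lower is better
    go_live_challenges: int  # lower is better

    def priority(self) -> int:
        # Reward ROI and learning potential; penalize cost and go-live
        # resistance. Equal weighting is an assumption of this sketch.
        return (self.roi + self.learning_potential
                - self.development_cost - self.go_live_challenges)


def net_return(annual_benefits: float, annual_cost_of_ownership: float) -> float:
    """ROI as defined in the text: benefits minus ongoing cost of ownership."""
    return annual_benefits - annual_cost_of_ownership


def prioritize(candidates: list[ModuleAssessment]) -> list[ModuleAssessment]:
    """Order candidates from highest to lowest priority."""
    return sorted(candidates, key=lambda m: m.priority(), reverse=True)


# Hypothetical modules and scores, for illustration only.
candidates = [
    ModuleAssessment("online-payments", roi=4, learning_potential=3,
                     development_cost=4, go_live_challenges=3),
    ModuleAssessment("back-office", roi=5, learning_potential=4,
                     development_cost=2, go_live_challenges=2),
    ModuleAssessment("kiosk-payments", roi=2, learning_potential=2,
                     development_cost=3, go_live_challenges=4),
]
ranked = prioritize(candidates)
print([m.name for m in ranked])
```

Re-running the same ranking after each release, with refreshed scores, is one way to keep the prioritization continuous rather than a one-off exercise.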
Based on the ROI assessment above, the team first looked into the back-office pain points and then steadily moved to build a new online payments-processing solution. The back-office processing module went live first. After that, a small team continued to incrementally enhance the module while the rest of the team went to the next module.
The platform now has new recording and reconciliation functions for back-office processing. The recording function provides a channel-agnostic recording mechanism across any processing system, enabling centralized investigation, reconciliation, and analytics across the entire platform. The reconciliation function is likewise agnostic to the channel and to the calling and processing systems, enabling seamless reconciliation for the finance team across the entire platform.
The online payments module was the second module to be prioritized. The online payments module can collect payments through card, PayPal, and BPAY (an electronic bill payment system in Australia), and enables agencies to have an easy and consistent integration pattern to help fulfill transactions. This has been built in a way that simplifies the addition of new payment methods like ZIP (an Australian digital wallet), NPP (an industry-wide payments platform for Australia), and mobile wallets.
Great performance results
In addition to the automation practices we use to build the platform, such as test-driven development, CI/CD, and trunk-based development, we also wanted to put the new platform through rigorous security and performance testing. The payments platform underwent an independent assessment before it went live, covering stress, load, and security testing. We are pleased to report great results from these tests.
Resilient – Can handle more than 6 million requests in a day
The payments platform handled a peak load of 75 transactions per second for a duration of 20 minutes, processing 100% of the requests in less than 2 seconds with zero failures.
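The relationship between the tested peak rate and the daily figure in the heading can be checked with simple arithmetic. The sketch below assumes the "6 million requests in a day" figure is the peak rate extrapolated over 24 hours; the source does not state how that number was derived.

```python
SECONDS_PER_DAY = 24 * 60 * 60

peak_tps = 75            # peak load from the test, transactions per second
test_duration_min = 20   # how long the peak was sustained in the test

# Requests actually served during the 20-minute peak window.
requests_in_test = peak_tps * test_duration_min * 60

# Requests per day if the peak rate were sustained for 24 hours
# (an extrapolation assumed for this sketch, not a measured result).
requests_per_day_at_peak = peak_tps * SECONDS_PER_DAY

print(requests_in_test)          # 90000
print(requests_per_day_at_peak)  # 6480000
```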
Secure – Has zero vulnerabilities to exploit
Independent security experts conducted penetration tests and found zero vulnerabilities in the application.
Scalable – Can scale linearly with load
The application has been designed to scale by increasing the number of instances linearly with increasing load.
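Under linear scaling, the number of instances needed grows in proportion to the load. A minimal sketch of that sizing rule follows; the function name and the per-instance capacity of 25 transactions per second are hypothetical values chosen for illustration, not figures from the platform.

```python
import math


def instances_needed(load_tps: float, capacity_per_instance_tps: float,
                     min_instances: int = 1) -> int:
    """Instances required if total capacity grows linearly with instance count."""
    if capacity_per_instance_tps <= 0:
        raise ValueError("capacity_per_instance_tps must be positive")
    return max(min_instances, math.ceil(load_tps / capacity_per_instance_tps))


# Hypothetical per-instance capacity of 25 TPS.
print(instances_needed(75, 25))   # 3
print(instances_needed(150, 25))  # 6: doubling the load doubles the instances
```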
Robust – Confident and continuous production deployments
The system has very high automated test coverage, as well as automated, zero-downtime deployments. This enables the team to deploy to production continuously, adding features at any time of day with confidence.
Expected ROI and benefits
The new online channel and back-office modules are expected to achieve an additional financial benefit of more than $10 million per year for the government. They are also expected to improve straight-through processing rates from the current 85% to over 95%.
To recap, by taking an ROI-based approach to strangling the monolith, the team ensured that the delivered product added value immediately. The modules were prioritized continuously against a consistent set of criteria. And each module, once completed, shared a common metric to ensure that all of them drive the collective success of the platform.
This ROI-based approach also helped ensure that the pain points that were causing issues in the legacy system didn't carry over to the new one. The new platform is expected to generate $40 million per year in benefits for the government when completed in its entirety, a phenomenal return on the investment made to develop it.
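To see what the straight-through processing target means for manual work, the short calculation below converts the rates into the share of payments that still need intervention. It assumes every payment that is not processed straight through requires manual handling, which is a simplification made for this sketch.

```python
def manual_share(stp_rate: float) -> float:
    """Share of payments needing manual handling, given the STP rate (0 to 1)."""
    if not 0.0 <= stp_rate <= 1.0:
        raise ValueError("stp_rate must be between 0 and 1")
    return 1.0 - stp_rate


current = manual_share(0.85)  # current rate from the text
target = manual_share(0.95)   # lower bound of the expected rate
reduction = (current - target) / current

print(f"{current:.0%} -> {target:.0%}, a {reduction:.0%} reduction in manual work")
```

Moving from 85% to 95% therefore cuts the manually handled share from about 15% to about 5%, a reduction of roughly two-thirds, and more if the rate exceeds 95%.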