JPMorgan Asset and Wealth Management’s Technology Resiliency team wins the 2017 AFTA for the best infrastructure initiative, taking the crown from last year’s victor, Northern Trust.
The resiliency initiative was, in part, a move to streamline a process the firm constantly has to perform, says the asset manager’s head of production management, Joe Pedone.
“We’ve leveraged some cutting-edge automation techniques to take that manual work, and basically shifted it to the left and made it highly automated and low-resource intensive,” Pedone says. “We’ve reduced up to 80 percent of the work required to execute a resiliency event, which is a significant amount of time. The good news is it’s not just saving on labor but we can be resilient faster.”
The project automated many of the tasks associated with resiliency tests, such as confirming that runbooks are up to date, executing the failover to the secondary data center, and collecting evidence into a document repository. The process draws on a microservices library, so that when programmers need to perform a task such as switching the domain name system (DNS) over to the secondary server, they do not need to write a new script and can instead use a pre-written one from the library.
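As a rough illustration of the idea of a library of pre-written tasks, the following Python sketch registers a few reusable steps and chains them into a failover test. It is not JPMorgan's implementation: the registry, the task names (verify_runbook, flip_dns, archive_evidence), and the in-memory state dictionary are all hypothetical stand-ins, and a real system would call actual DNS and document-repository services.

```python
from typing import Callable, Dict, List

# Hypothetical registry standing in for the library of pre-written tasks.
TASK_LIBRARY: Dict[str, Callable[[dict], str]] = {}


def task(name: str):
    """Register a function in the library under the given name."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        TASK_LIBRARY[name] = fn
        return fn
    return register


@task("verify_runbook")
def verify_runbook(state: dict) -> str:
    # Assumption: a boolean flag models "the runbook is up to date".
    if not state.get("runbook_current"):
        raise RuntimeError("runbook is out of date")
    return "runbook verified"


@task("flip_dns")
def flip_dns(state: dict) -> str:
    # Assumption: changing a dictionary entry models the DNS switch.
    state["active_site"] = state["secondary_site"]
    return f"DNS now points at {state['active_site']}"


@task("archive_evidence")
def archive_evidence(state: dict) -> str:
    # Copy the evidence gathered so far into the (in-memory) repository.
    state["repository"].extend(state["evidence"])
    return f"archived {len(state['evidence'])} evidence records"


def run_failover_test(state: dict, steps: List[str]) -> List[str]:
    """Run the named library tasks in order, recording each result as evidence."""
    for name in steps:
        result = TASK_LIBRARY[name](state)
        state["evidence"].append(f"{name}: {result}")
    return state["evidence"]


if __name__ == "__main__":
    demo_state = {
        "runbook_current": True,
        "active_site": "dc-primary",
        "secondary_site": "dc-secondary",
        "evidence": [],
        "repository": [],
    }
    for line in run_failover_test(
        demo_state, ["verify_runbook", "flip_dns", "archive_evidence"]
    ):
        print(line)
```

The point of the design is that a test is described as a list of task names, so a new failover test reuses existing steps instead of requiring a freshly written script.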
The process used to be manually intensive, Pedone says, and given how often JPMorgan has to run resiliency events, it entailed a lot of tedious work. The firm normally runs failover tests almost every week for each application it offers, moving from a primary data center to a secondary one.
“Every critical system at JPMorgan is required to have high levels of stability and resiliency. A lot of the work that’s done in the resiliency program has traditionally been manual,” Pedone says. “We’re also required by our regulators to validate and test that this resiliency works at a minimum of an annual basis and in most cases multiple times a year. It generates an incredible amount of labor from preparing for that work, conducting the testing, and doing the validation.”
One reason the asset manager’s resiliency program moved so quickly was the decision to create a team whose sole focus was building out the automation program. Pedone says development of the initiative took around three to four months, with the first live, fully automated failover test occurring in the first quarter of 2017.
Ultimately, he notes, automation is a good strategy for firms looking to improve their resiliency systems, and he says it is important that it becomes a priority for the industry as a whole.