JPMorgan’s chief data officer wowed CEO Jamie Dimon with a video depicting the bank’s databases as a solar system. It set him up to lead one of the largest Wall Street data projects.

Rob Casper JPMorgan

  • As JPMorgan’s chief data officer, Rob Casper is involved in one of the largest data projects on Wall Street today.
  • His role, and the fact that it even exists, shows how important data is to Wall Street’s plans to hold on to customers and markets, despite the generational upheaval being brought about by technology.

Rob Casper stands up and takes a piece of laminated paper from behind his desk.

Standing in his 39th floor office in JPMorgan’s glass-walled midtown Manhattan headquarters, the bank’s chief data officer wants to make a point.

The card he holds lists iterations for the name of Long Term Capital Management, the hedge fund that collapsed more than 20 years ago. Full or partial acronyms, hyphens between different words, three- and four-letter abbreviations of the same word. The different titles take up the card’s front and back.

They’re just a sample of the names that Casper’s former firm, Morgan Stanley, used to refer to the hedge fund in its computer records. The variations made it difficult in 1998 to calculate how much the bank might lose after Russia’s debt default sent the fund spiraling down. Wall Street bailed out LTCM in part because each firm didn’t know how much it would lose.

"You’d come in one day and say we found another $50 million in exposure, and your bosses would say, ‘Why didn’t you see this earlier,’" Casper said. "And the reason was, in this case, management was spelled ‘MGT,’ instead of ‘MGMT’ or ‘management’ in other places."

The experience proved formative for Casper, and laid the foundation for the work he’s now doing at JPMorgan. In what is one of the largest data projects on Wall Street today, Casper is consolidating the bank’s data architecture across hundreds of thousands of databases acquired during decades of acquisitions. He’s working with the firm’s machine-learning experts to clean the data and make it easier to use. And he’s developed a system of governance around how new data gets handled.

His role, and the fact that it even exists, shows how important data is to Wall Street’s plans to hold on to customers and markets, despite the generational upheaval being brought about by technology. In the coming years, the ability to find signals in the reams and reams of information owned and collected by banks may mean the difference between the industry’s winners and losers.

Casper brings an unlikely background to the task. He started his career as a lawyer, at Cravath, Swaine & Moore, the white-shoe law firm, representing the International Swaps and Derivatives Association on derivatives matters. Three years later, he joined Morgan Stanley, where he stayed until 2014. A stint as GE Capital’s first data chief preceded his 2016 move to JPMorgan. He has no technical training.

"I like the methodical approach that good data governance requires," he said, adding that in corporate America many people don’t take the time to address one issue before moving on to a second. "It’s really hard to execute a complex program like data governance. Thinking like a lawyer has helped in that regard."

At JPMorgan, Casper has taken a hard look at the hundreds of thousands of databases the bank maintains and begun to sort out which ones are integral to the bank’s functioning and which ones should be decommissioned. The bank maintains 390 petabytes of data storage. It would take 745 million floppy disks to store one petabyte of data.
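The article does not say how the floppy-disk figure was derived. As a rough check, one set of assumptions that reproduces it is a binary petabyte (2^50 bytes) and a 1.44 MB floppy counted as 1.44 × 2^20 bytes. Both constants below are assumptions rather than figures from the bank.

```python
# Assumed unit definitions (not stated in the article).
PETABYTE_BYTES = 2**50          # binary petabyte
FLOPPY_BYTES = 1.44 * 2**20     # "1.44 MB" floppy, counted in binary megabytes

# Floppy disks needed to hold one petabyte under these assumptions.
disks_per_petabyte = PETABYTE_BYTES / FLOPPY_BYTES

# Scale up to the 390 petabytes the bank is said to maintain.
total_disks = 390 * disks_per_petabyte

print(f"{disks_per_petabyte:,.0f} disks per petabyte")
print(f"{total_disks:,.0f} disks for 390 petabytes")
```

Under these assumptions the per-petabyte count lands at roughly 745 million, and the bank’s full storage works out to roughly 291 billion disks.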

He’s done so in a novel way. He asked staff to identify the most proximate databases for their needs, the ones they pull data from and those they send data to. Their answers will be collected and stitched together into a map of how JPMorgan’s databases are connected. He expects to find orphaned databases that he can decommission.
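The survey-and-stitch idea can be sketched as a small graph exercise. This is a minimal illustration, not JPMorgan’s actual tooling: the database names, the survey format, and the `build_links` and `find_orphans` helpers are all hypothetical.

```python
from collections import defaultdict

def build_links(survey):
    """Stitch per-team answers into one undirected map of database links.

    survey maps a database name to (sources, destinations): the databases
    it pulls data from and the ones it sends data to.
    """
    links = defaultdict(set)
    for db, (sources, destinations) in survey.items():
        links[db]  # register the database even if it reports no neighbours
        for other in list(sources) + list(destinations):
            links[db].add(other)
            links[other].add(db)
    return dict(links)

def find_orphans(inventory, links):
    """Return inventoried databases that nothing reads from or writes to."""
    return sorted(db for db in inventory if not links.get(db))

# Invented example data.
inventory = ["trades", "clients", "risk", "legacy_crm"]
survey = {
    "risk": (["trades", "clients"], []),
    "trades": ([], ["risk"]),
    "legacy_crm": ([], []),
}

links = build_links(survey)
orphans = find_orphans(inventory, links)
print(orphans)
```

Here `legacy_crm` has no links in either direction, so it is the only candidate flagged for decommissioning. A real exercise would presumably also need to confirm that no unsurveyed system depends on it.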

In April 2018, Casper presented a video to JPMorgan’s management team to explain his work. The video shows the passage of time, starting with a cloud that looks like a solar system. As the video plays, lines begin to form between some of the stars, which grow in size and brightness. As the stars with the most links get larger, the solar system disappears into the background, leaving a network of connected stars linked by brightly colored lines.

The stars are databases and his video serves as a representation of how they communicate. CEO Jamie Dimon and copresident Daniel Pinto immediately got it, Casper said.

That’s given him the budget and mandate to pursue what might otherwise seem rather mundane.

But its effect on JPMorgan may be huge. Shutting down databases ticks a number of helpful boxes: It saves money that was going to maintain them, improves customer-data safety by shrinking the number of hacker targets, gives the bank more control over its tech infrastructure, and helps to quickly identify and fix outages, according to Casper. JPMorgan spends nearly $11 billion a year on tech, and any money it can save could fall to the bottom line or be reinvested in innovation.

It also means wrangling the data into a form that can more easily be used for artificial intelligence and machine learning initiatives, he said. Casper is already working with Manuela Veloso, who heads the bank’s artificial-intelligence research. And he’s planning to design a search function for employees (a prototype already exists) to more easily query JPMorgan’s databases.

"Everybody loves artificial intelligence and machine learning, but without good data governance the benefits are likely compromised," Casper said, referencing the Wall Street buzz around AI and the promise of using it to predict client trades or future deals. "You need to make sure you have the best data for your machine-learning efforts to be successful."

One downside of shrinking the number of databases is that it makes those that remain more important, and increases the risk if one of the key databases goes down, Casper said.

More broadly, he’s established a governance framework that involves 51 taxonomies for the bank’s data. Employees are now applying those taxonomies and English words to databases, replacing what may seem like random numbers and letters with easily understood descriptors. They’re also examining hundreds of applications across the bank to ensure that the data they’re using matches up with the reference data.

It all goes back to an approach Casper developed at Morgan Stanley. He finally came up with a way of keeping all those LTCM accounts straight, taking steps to boil them down to a common denominator: the legal entity. The framework categorizes third-party account relationships, organizing legal entities into corporate hierarchies based on greater than 50% ownership, and can then be expanded to look at the people tied to those entities, as well as security issuers, derivatives or other product types, or vendor relationships.
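A legal-entity model of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not the patented system: the entity IDs, record labels, ownership stakes, and exposure amounts are all invented, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical record labels mapped to one legal-entity ID (all values invented).
ALIAS_TO_ENTITY = {
    "LTCM": "E1",
    "LONG TERM CAPITAL MGT": "E1",
    "LONG-TERM CAPITAL MGMT": "E1",
    "LT CAPITAL FUND A": "E2",
}

# (owner, owned, ownership fraction); invented figures.
OWNERSHIP = [("E1", "E2", 0.80), ("E1", "E3", 0.30)]

def parent_map(ownership):
    """Keep only stakes above 50%, which define the corporate hierarchy."""
    return {owned: owner for owner, owned, share in ownership if share > 0.5}

def ultimate_parent(entity, parents):
    """Walk up the hierarchy to the top-level entity, guarding against cycles."""
    seen = set()
    while entity in parents and entity not in seen:
        seen.add(entity)
        entity = parents[entity]
    return entity

def exposure_by_parent(records, aliases, parents):
    """Sum exposure per top-level legal entity, whatever label a record used."""
    totals = defaultdict(int)
    for label, amount in records:
        entity = aliases[label.upper().strip()]
        totals[ultimate_parent(entity, parents)] += amount
    return dict(totals)

# Exposure records in $ millions, keyed by whatever name the record used.
records = [("LTCM", 50), ("Long Term Capital Mgt", 20), ("LT Capital Fund A", 5)]
totals = exposure_by_parent(records, ALIAS_TO_ENTITY, parent_map(OWNERSHIP))
print(totals)
```

All three records roll up to the same parent entity, so exposure that would otherwise be scattered across differently spelled accounts is counted once, in one place. The 30% stake in `E3` falls below the majority threshold and is left out of the hierarchy.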

The approach proved so groundbreaking that Casper and Jeff McMillan, now Morgan Stanley’s chief analytics and data officer, received a patent for their work, called "system and method for managing financial account information."

The model allows banks to more accurately measure their exposure to trading counterparties, as in the LTCM example. It also makes it easier to share information about an entity with various business units, removing some of the friction from the on-boarding process. And it makes complying with regulations easier since some authorities demand an ability to trace data from a specific cell in an Excel model back to its original source, while others expect lenders to monitor their international clients for anti-money-laundering controls.

"It was anything but rocket science; it was literally just having a very disciplined data model around legal entities and their relationship to each other," Casper said. The approach has "truly stood the test of time."
