Exclusive: CIO Mark Steed on Mastering Artificial Intelligence for Portfolios

Video Transcript:

Christine Giordano: Hi, I'm Christine Giordano, Editor-in-Chief at Markets Group. My next guest is Mark Steed. He's Chief Investment Officer of one of the fastest growing funds in the world. He's heading Arizona's $21 billion Public Safety Personnel Retirement System, or PSPRS, and he's been supplementing his investment office's capabilities with artificial intelligence. In 2018, Mark started as CIO and quickly grew the $10 billion fund to the $21 billion that it is today, which made it, at the time, one of the fastest growing funds in the world.

We're talking about 2021-2022. During this interview, we'll review what he's seeing, some scenarios in which artificial intelligence software can be used effectively, the challenges and risks around it, and best practices from his perspective as an investor. Mark Steed was known early in his career as a rising star. He's known for his innovations and his focus on new and old solutions that work for portfolios. Welcome, Mark.

Mark Steed: Thank you. Happy to be here.

Giordano: Can we start from the top with the fundamentals of artificial intelligence?

Steed: Sure. It's a pretty broad area, as you might suspect. I feel like there are terms that get thrown around in casual conversation. We'll say AI, or machine learning, or big data, or predictive analytics, and these are all part of the same family. AI generally refers to models, and you have machine learning, which is really the algorithm part of it. You can't really have AI without the machine learning, and so it's a subset. That's where I think we and a lot of other groups spend their time. Generally, when we step back and we refer to this space, we're talking really, I find, about two or three different kinds of fundamental models.

We're talking about supervised learning, so that's where you train a machine on a set of labeled data, and then you're asking it to make accurate predictions about what that is. You're looking for it to basically make decisions off of a pretty well-known, well-defined causal relationship. An example of that would be you show it images of skin lesions, and you want it to draw a conclusion as to which one of these lesions might be representative of cancer, and you have a whole library of data sets that you can train it on, and you know what the labels look like, and what the output looks like.

Those are supervised learning models, and that's one big category of this whole paradigm. Then you have unsupervised learning models, and with unsupervised learning, you're basically dealing with unlabeled data, if that makes sense. You've got input examples, and you don't really have corresponding outputs, so here's a bunch of information, and I don't really know what I'm looking for, and we want to discover patterns or structures or relationships, and we want the model to do that on its own.

The primary goal of unsupervised learning, then, is to discover the hidden patterns, structures, and relationships in data that might not be obvious. There are some key differences. Supervised learning really requires labeled data and a clear indication of what you're trying to solve for; the real objective is mapping the inputs to the outputs. With unsupervised learning, what you're trying to do is discover patterns and structures in the data that you might not be able to recognize.

Supervised learning traditionally is used for prediction tasks, and unsupervised learning is used for discovery and exploration tasks. Those are really the two main models I find where most people spend their time. Then you have this other category or paradigm that's called deep learning, and deep learning really just means you're working with richer, bigger data sets, but deep learning can be used in either supervised models or unsupervised models. That's maybe, in my simplistic world, how I lay it out.
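The split Steed describes can be sketched with a toy scikit-learn example: a classifier learns from labeled points (supervised), while k-means finds groups in the same points with no labels (unsupervised). The data here is invented purely for illustration.

```python
# Toy illustration of the two paradigms: synthetic data, not a real model.
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

# Supervised: labeled examples -- inputs X with known outputs y.
X = [[1, 1], [1, 2], [8, 8], [9, 8]]
y = [0, 0, 1, 1]  # known labels the model learns to map inputs onto
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[9, 9]]))  # -> [1]

# Unsupervised: same inputs, no labels; the model discovers the structure.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # two groups, found without any labels
```

The supervised model maps inputs to a known output; the unsupervised one has no "right answer" to learn, only structure to uncover, which is exactly the distinction drawn above.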

Giordano: What are you seeing from within investment offices as far as its applications are concerned?

Steed: It's an interesting question because oddly enough, for as much as everybody talks about it, it's really hard to figure out what everybody's doing with it because I think my experience is a lot of the people that are talking about it don't really understand it all that well at a really technical level. They can describe what's going on, but they don't really want to describe it at a technical level because frankly, in most industries, you're dealing with executives and leaders who just weren't trained in the vernacular of AI, of analytics.

It's heavy math, heavy stats, heavy computer programming, and these aren't really languages that a lot of leaders grew up learning. They grew up in business or marketing or finance. Everybody's trying to play catch-up, and I think that is partly why adoption has been slow: you do need executives to be able to speak the language and understand what's going on. It's really hard to tell what everybody's doing with it. Lots of groups are talking about it. Another reason why I think they're pretty quiet about it is because they consider these algorithms fairly proprietary.

There's, I think, some truth to that, but I think the groups people talk about, at least in finance and institutional investing, are really using both supervised and unsupervised models. They're doing this to ideally make better decisions. They're using a lot of it for predictive tasks. Lots of quantitative trading models use some of these supervised and unsupervised learning methods. They're basically taking reams of information about security prices and every other thing you can imagine, interest rates, maybe geospatial data, and they're trying to feed all of this into a model that will tell them, "Hey, this security is attractive to buy or sell or go long or go short."

There's all sorts of models like that. There's lots of them that are using it for what's called sentiment analysis. They're scanning earnings calls, transcripts, things like that, and trying to figure out if the tone of the CEO or whoever is constructive, optimistic, or pessimistic. Ideally, you're trying to uncover something that may not be obvious to the human eye. I think those are a lot of the main ways that we see groups using these models.
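The sentiment-analysis idea can be illustrated with a deliberately simple word-list scorer. Production systems use trained language models rather than hand-built lexicons; the word lists and sample sentences below are made up for the sketch.

```python
# Toy version of the earnings-call tone scan described above.
# The lexicon is invented; real systems learn sentiment from data.
POSITIVE = {"strong", "growth", "confident", "improving", "record"}
NEGATIVE = {"headwinds", "decline", "uncertain", "weak", "impairment"}

def tone_score(transcript: str) -> float:
    """Return (positive - negative) / total words: >0 reads optimistic."""
    words = transcript.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(len(words), 1)

print(tone_score("We delivered record growth and remain confident"))  # positive
print(tone_score("We face headwinds and uncertain demand"))           # negative
```

Even this crude version shows the shape of the task: turn unstructured speech into a number a model can act on.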

They're really doing things that humans were trying to do before, or were doing with various levels of success, and trying to do them with more accuracy, and then identifying things that maybe a human might not catch and bringing those to the forefront. That's why we see a combination of supervised and unsupervised learning. A lot of times it is hard to tell what groups are doing, just because of the asymmetry between leadership, the outward-facing group, and the people building the models, and then just the proprietary nature of some of the models.

Giordano: This stuff is fascinating to me. I remember a case study where they analyzed earnings reports, and the more flowery or bigger vocabulary the executive used, the worse the stock was going to do. I was curious to know, what in your sphere have you seen as perhaps some interesting case studies that you've used?

Steed: I think there are case studies like that. One that I think is probably, it's not so much a particular case study, but an application maybe. I feel like this discipline is really expanding in the liquid markets, your stocks and your bonds. That's because that information is fairly well-structured. You have security prices, you've got a lot of history, long time series of security prices. Then you have a lot of other publicly available information that you can conveniently download in spreadsheets or access.

Now, groups are moving towards unstructured data. Stuff that is not in an Excel spreadsheet or a CSV file. It's in a PDF, or it's an image of something. You're tracking jets flying across the country and where the CEOs are flying to, and trying to glean some information from that. I think that's becoming more and more like table stakes. If you're not doing that and you're an active fund, then you're probably going to be behind. At the same time, I don't see a lot of value to it, because more and more groups are doing it.

It's just becoming one of those things, it's like personal hygiene. You don't really get any points for doing it, but if you don't, it's really going to affect you. I think where the applications are getting really interesting is in the private markets, because I see a lot of opportunity for value there. It's hard to get information. The information just isn't as accessible. It's often proprietary. If you're a private equity fund, you're not posting a lot of information about your firm or your underlying companies on the web for people to access and analyze.

If you're a private lender, same thing. I see more and more interest in applying some of these tools to the private markets. They can be supervised or unsupervised. Effectively, they're trying to figure out if you're a private equity company, what makes a better portfolio company investment? Who are the candidates on your team who are probably better promotion candidates? What makes a good partner? They're trying to be real rigorous and structured in their way of thinking and trying to really systematize their logic. I think those applications are pretty interesting.

Giordano: Can you tell us some nitty gritty secrets on what might make a good performer or characteristic (that indicates a good performer)?

Steed: A lot of this stuff is interesting. A lot of this stuff is really just in beta at these firms. It's something that we have learned to incorporate in our own due diligence process when we're talking to GPs, because what we really want to see is a commitment to the space and learning to be more rigorous and objective in the analysis. I've said this a lot, that humans make models better, and models also make humans better. What we're trying to do is remove all the subjectivity to the extent that you can. You're never going to be able to remove it 100%.

Because models can be wrong, you have to have that human judgment piece, but humans can be wrong too, and we miss things so you have to have the model. I think everybody, we see firms using this in various ways, but it's also very particular to the firm. If you're a distressed debt investor, there might be a certain set of qualities that are better suited to that field than somebody who, let's say, does venture capital or middle market buyouts. What we look for are just groups that are starting to acknowledge this and moving in that direction towards understanding the data science or having data scientists or trying to improve their own understanding and fluency in this space.

Right now, I think we'd probably be a little uncomfortable if anyone just said, "Hey, we have this model, and we're just letting it run and tell us what to do, and we're just following it." That doesn't really make people very comfortable, but I think what we're seeing now and what we're doing, and what other groups are doing is we have our own human judgment, we're tracking that. We're getting good about that, and we're tracking our own decisions and the quality of those decisions. Then we're also running models here and tracking those.

We're just going to look over time to see, are we close, or was one of us right more often? I think that's the discipline you have to create. If we're using the old baseball analogy, and we're asking, are we in the second inning or the third inning? I feel like we're just getting dressed to go out onto the field. I really think we're at very early stages here. We don't see too much widespread adoption yet. Lots of groups are looking at it, developing it.

Giordano: In your experience, what are some of the things that AI can pick up that a human cannot?

Steed: Yes, it's really interesting. We all know that the AI can pick up things that humans can't. Maybe an example of this would be what's called like an isolation forest. An isolation forest is an unsupervised machine learning algorithm that's used for anomaly detection. Probably, the most common application of that is credit card companies using something like this to identify potential fraud. This algorithm is really effective in just identifying rare events or outliers in larger data sets.

The idea behind an isolation forest is that anomalies are easier to isolate than normal data points because they have like really unique characteristics that distinguish them from the majority of the data. If you think about asset management, this can be really interesting because I might want to find, if you're a public pension plan and you're wanting to only work with the best general partners, whether it's buyout funds or venture capital funds, you really want to find those groups that are going to be outliers.

Just because they were one historically doesn't necessarily mean that they will be one going forward. That again could be an input into the model, past outperformance. What you're trying to do is identify, hey, I want to just invest with groups that are in the top decile or going to be in the top decile of their peer group. I'm going to take all this data I have, some of which I have a sense for the correlation. There's a lot of stuff that I might not be thinking of.

Might be the number of partners that they have, or the partner to portfolio company ratio, or the salaries. There's all these things that might be more indicative of an outlier than I think. You have this isolation forest, you can feed the data in, and it's basically going to separate each data point until it's isolated all of them. Then basically the thinking is, if you think about all of these data points as leaves on a growing branch, the ones that are closest to the trunk are probably the anomalies, because you isolated them really quickly.

This is an example of where you can use machine learning to maybe enhance the manager selection process. There's lots of other ways to do that, but that might be one way that these models can identify things that humans might miss, because we all sort of have a vague idea. With past performance, it's probably better that you did well than didn't, because if you did well, then it's going to be easier for CEOs and management teams to want to bring you in as an investor if you're a private equity firm, or you're going to get your pick of the best candidates if you've done well.

There is some persistence to that, but there are also some other things that might matter that investment teams aren't looking for. I think that's an area where you could have an algorithm like this, like an isolation forest, that you would use to supplement your own due diligence. It might highlight some things that you weren't aware of. Now, they have problems, like all of these models do, but this would be a way that I think an investment team could actually improve the underwriting process.
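A minimal version of the screen Steed describes, using scikit-learn's IsolationForest on invented fund profiles. The features here (partner count and a partner-to-portfolio-company ratio) are hypothetical, chosen only to echo the examples above.

```python
# Sketch of the isolation-forest idea: score made-up "fund profiles"
# and flag the one that isolates quickly. Illustrative, not due diligence.
from sklearn.ensemble import IsolationForest

# Features per fund: [number of partners, partner/portfolio-company ratio].
# Most profiles cluster together; the last row is deliberately unusual.
funds = [[5, 0.50], [6, 0.60], [5, 0.55], [6, 0.50],
         [5, 0.60], [6, 0.55], [5, 0.50], [20, 3.00]]

# contamination = expected share of anomalies (1 of 8 here).
model = IsolationForest(contamination=0.125, random_state=0).fit(funds)
flags = model.predict(funds)  # -1 = anomaly (isolated quickly), 1 = normal
print(flags)
```

The tree metaphor above maps directly onto the algorithm: points that take few random splits to isolate sit "close to the trunk" and score as anomalies.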

Giordano: What should you be careful not to do when using this?

Steed: The problem with, for example, isolation forests is that you need a pretty large data set. That, I think, is a big gating item for a lot of investors, and certainly a lot of pension plans. Pension plans just aren't structured to be scraping tons of data and storing it in a way that you can use to build models. You do need huge data sets for a lot of both supervised and unsupervised learning. Now, this particular example won't necessarily apply to all models, but all models have weaknesses. I think that's what you have to be aware of.

Also, another problem with isolation forests is you really need the outliers to be unique. To the extent you've got a lot of randomness in the underlying data set, the characteristics that distinguish or might predict future performance may just be fundamentally random. They may not be there. You do need there to be something exceptional that a model can pick up. Otherwise, it's just going to say, look, it's just random. There seems to be no identifiable factor, which by itself is actually also revealing. That's also very helpful, because if you're an investment team, you just think, wow, okay, look, there's a lot of subjectivity in the underwriting process as it is.

If you're a pension plan looking at a private equity fund or a venture fund, you've got a lot of on-the-one-hand, on-the-other-hand kinds of comments. Then if you have a model that's telling you, "We don't have a lot of clarity here either," that's useful. That's useful because it helps you approach your own underwriting with a degree of skepticism. In some worlds, I think it's just more appropriate to express uncertainty than it is to express certainty. Every model has weaknesses like that.

I think it's important to be able to identify what they are. The risk for an investor is that a vendor could approach you and a vendor could say we use machine learning to help you make better decisions. If you want to make better decisions about underwriting private equity funds, well then, here you just feed all your data into this model. We tell you of all the funds you're looking at, we think this one's going to probably be the outperformer. If you don't know, if you're not fluent as a group as to what the risks are of that model, you can make some bad decisions and really burn a lot of social capital. I think it's important to really be aware of those.

Giordano: How do you tap out what the risks are?

Steed: I think part of that is just understanding there's a handful of shared issues. One is knowledge of the specific models. If you're using a neural net, for example, without having to explain what a neural net is, you just need to know there's a bunch of steps in the process. You have the architecture. You've got an input layer, and a hidden layer, and an output layer. Just as an example, if you're a casual observer and somebody says, "Hey, we're using a neural network," and there's a kind of neural network called an MLP, a multilayer perceptron, "and we're using an MLP, and we have an input layer, a hidden layer, and an output layer."

If you're the casual observer, that might be enough for you, but someone who's more trained on these models might say, "Well, how many hidden layers do you have?" "Four." "Why four, why not eight, why not twelve?" Then you have to know that the next step is forward propagation. When you do that, you apply an activation function, and that's to introduce a level of non-linearity. You can use what's called a sigmoid activation, or a rectified linear unit activation. Both of those can have significant impacts on the output.
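The choices Steed lists, how many hidden units and which activation, can be seen in a bare-bones forward pass. The weights below are arbitrary illustrations, not a trained model; the point is that identical inputs and weights produce different outputs under sigmoid versus ReLU.

```python
# Minimal forward pass through one hidden layer, showing how the
# activation-function choice changes the output. Weights are made up.
import math

def sigmoid(x):     # squashes to (0, 1); can saturate for large |x|
    return 1 / (1 + math.exp(-x))

def relu(x):        # "rectified linear unit": max(0, x)
    return max(0.0, x)

def forward(x, w_hidden, w_out, activation):
    hidden = [activation(sum(w_i * x_i for w_i, x_i in zip(w, x)))
              for w in w_hidden]
    return sum(w_o * h for w_o, h in zip(w_out, hidden))

x = [1.0, -2.0]
w_hidden = [[0.5, 0.1], [-0.3, 0.8]]  # one hidden layer, two units
w_out = [1.0, 1.0]

# Same inputs and weights, two activations, two different answers --
# which is exactly why "which activation?" is a fair diligence question.
print(forward(x, w_hidden, w_out, sigmoid))
print(forward(x, w_hidden, w_out, relu))
```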

Without delving into the nuances of the model, you just have to know that they all have these idiosyncrasies, and they're more appropriate for certain use cases than others. That's why I think as a group, you've got to be fairly fluent in the pluses, the advantages and disadvantages of the individual models. That's one. Two is, people have to recognize that a lot of these models, especially if we're talking about like large language models now, just lack a fundamental understanding of the way the world works. There's a logic that's missing.

They're working strictly on probabilities. I did this one time: I asked ChatGPT to write my biography. It said that I graduated from Harvard, when I didn't. What it's doing is working off probabilities: it sees the words "chief investment officer" and "institutional management," and it was probably drawing the conclusion, based on probabilities, that I graduated from Harvard. You could say that if you didn't actually know, that might be a safe assumption.

I think that's an issue with a lot of large language models: they just lack that understanding. We also don't really have, in the grand scheme of things, enough data in most cases to make these models really robust. I think the industry recommendation is somewhere in the range of a million to a hundred million observations to train a neural net or have high confidence that it's going to yield robust results. A lot of investors don't have that information. I think that can cause some anomalies in the outputs.

Giordano: I think Mike Cembalest referred to one of these language models as saying that you could have hay for breakfast...

Steed: Yes.

Giordano: In my own experience with transcripts, it's actually put false words into people's mouths when we’ve asked it to transcribe recordings, and actually eliminated data from transcripts. To that point, it makes things so much easier and there must be that temptation to set it and forget it. Where do you decide how much to check over what it does? Because, if you're going to do the work anyway, it's going to be the same amount of time put in, whether AI does it or you do it, whether or not you're checking over it thoroughly. How do you decide?

Steed: I think you have to start somewhere. If you're running a model, I don't recommend going cold turkey: "Hey, this is now going to be automated. We're going to use algorithms or whatever to handle this." I think you can identify tasks and say, "Look, here are some low-risk tasks." For example, you might say, "Hey, we've got all these PDFs, we get quarterly statements from the private equity funds that we invest in. What we'd really like to be able to do is pull a lot of the information that's in those PDFs about the underlying companies and stick it into a CSV file."

You can have somebody look over that, but that's a fairly low-risk function from that standpoint. That's using something like robotic process automation, which is a subset of AI, but a little bit different because you're not running algorithms. That's a fairly low-risk task. I think as an entity, you can start and say, "Well, what are all the things that we do? What are the ones where, if there was a mistake, it would be fairly low risk?" That's very different from making an investment purchase decision.
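That PDF-to-CSV task might look like the sketch below. The PDF-to-text step is assumed to have happened already (a library such as pypdf can extract page text); the line format and field names here are entirely hypothetical.

```python
# Sketch of the low-risk extraction task: parse already-extracted PDF
# text into CSV rows. The companies, fields, and line format are invented.
import csv
import io
import re

extracted_text = """\
Company: Acme Widgets  Revenue: 12.5  EBITDA: 3.1
Company: Beta Logistics  Revenue: 40.0  EBITDA: 9.8
"""

pattern = re.compile(r"Company: (.+?)\s+Revenue: ([\d.]+)\s+EBITDA: ([\d.]+)")

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["company", "revenue", "ebitda"])
for line in extracted_text.splitlines():
    m = pattern.match(line)
    if m:                          # skip lines that don't match the template
        writer.writerow(m.groups())
print(out.getvalue())
```

A human spot-checking the resulting CSV is exactly the low-risk review loop described above.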

You can even say a sale decision might be less risky than a purchase, because if we have a false negative, for example, we sell something that maybe we shouldn't have, that's not as bad as investing in something that we shouldn't have, really, in the grand scheme of things, because all we have to worry about is the things that we actually are invested in. My recommendation is to look at all the things that you do and put them on a risk spectrum, where if this thing made a mistake, it would be very bad.

Now, the important thing, though, is that there are mistakes probably already happening with just the human involvement. The question is how accurate you expect a human to be versus how accurate you expect the algorithm or the model to be. I think part of the best practices is just asking, where's that trade-off? This is something that Tesla does with their self-driving cars. They can say, "Hey, these self-driving cars can drive for so many million miles before they get into an accident. By the way, here's how many millions of miles humans can drive before they get into an accident. It's a lot less."

I think that's the objective analysis, but there's also this subjective piece, which is that people just feel more comfortable if a human's looking at things. Even if the model is more accurate, you still need to have a human looking at it. That's why pilots are flying the plane mostly on autopilot, but you wouldn't feel great if there was no pilot in the cockpit, even though their perceived involvement might not be significant. It's a bit of a moving target for each organization, but I think you have to step back and say, "Well, what do we do? What are the high-risk functions? What are the low-risk ones?"

If you can use AI to streamline some of the low-risk ones, I think it's low-hanging fruit and you can build that culture and the fluency. Then I think with the high-risk stuff, you have to run that in incubation to just get a sense for, is it more accurate? At least as we're watching it, we're actually not going to use it. We're just going to run it in training here, and then be real clear about what do we do when we put it into practice live. I think having that framework and understanding is really important.

Giordano: How did you learn so much about it? Do you have any advice for people who might want to?

Steed: I just got really curious about it back at the end of 2008, on the back of the global financial crisis. I just thought, hey, there have got to be better ways to make decisions absent just tons of historical data. It was very much thinking about the risk: how do you avoid losing a lot of money? We all have these models that are using historical data sets, but the future is different, and it's not always linear. What are the rules of the road? That's just how I got interested. I started looking around and doing some research. I started using models and coding in Python and R, and you really just start with the linear stuff.

Linear regressions and logistic regressions, which help you identify binary outputs. Good investment, bad investment, good employee, bad employee, whatever, but that's assuming that you have linear relationships. I just started looking at things that were non-linear. That's where you get into some of the supervised learning and neural nets and these random forests and things like that. The great thing about it is there's just lots of tools out there online. Kaggle's a really good one. You've got all sorts of courses through Coursera and all these other online platforms that can teach you coding. I think you just have to approach it with an open mind and be flexible.
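The "binary output" starting point Steed mentions can be sketched with a logistic regression on toy data. The features (say, revenue growth and leverage) and the good/bad labels are made up purely to show the mechanics.

```python
# Logistic regression for a binary call: good (1) vs bad (0) investment.
# Features per row: [revenue growth, leverage multiple] -- invented data.
from sklearn.linear_model import LogisticRegression

X = [[0.30, 1.0], [0.25, 1.5], [0.02, 6.0],
     [0.01, 5.5], [0.28, 2.0], [0.03, 6.5]]
y = [1, 1, 0, 0, 1, 0]  # toy labels: 1 = good investment, 0 = bad

model = LogisticRegression().fit(X, y)

# Predict for two new candidates: high growth/low leverage, and the reverse.
print(model.predict([[0.27, 1.2], [0.02, 6.2]]))    # -> [1 0]
print(model.predict_proba([[0.27, 1.2]])[0, 1])     # confidence for "good"
```

The probability output is what makes this the natural first step: it forces the same "how confident are you?" framing Steed applies to human analysts.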

Then, if you're in a leadership position and you want all your employees to broadly come around to this, I think what's really important is to create a curriculum for your team and encourage where you can, or make it mandatory, that they develop these skill sets. I think the organizations that are really front-footed about that will be better off in the future. They'll know when to apply these models. They'll know the risks of the models, and they'll understand that there are mistakes everywhere, but I think you'll make fewer of them. Hopefully, the good decisions will outweigh the bad decisions if you're front-footed and clear about stuff like this.

Giordano: Do you have any favorite data sets or points of comparison?

Steed: I don't really, I don't. I'm a big fan of, look, if you want to practice, you can practice on Kaggle or one of these other websites; they'll have reams of data and libraries you can go through and access. You can access a lot of good economic data from the St. Louis Federal Reserve, but there's all sorts of libraries out there that you can access just to practice. That's what I would recommend. Longer term, I think organizations need to get really good, and this is what we're trying to work on right now, at wrangling their unstructured data.

We're swimming in PDFs, and there's a lot of information in those PDFs. We're extracting all the information we can from them and putting it in a system that we can use to build algorithms off of. I think building your own data sets is really important, because you're trying to outperform peers or do relatively better. You've got to be doing things differently than everybody else. I think that involves having your own data and building your own algorithms. Yes, I wish I had a better answer for that one. I think the training data sets out there through some of these online competitions are pretty good.

Giordano: Fascinating. As far as where you're going next with this, what are your plans?

Steed: I think it's going to take us some time to capture all the data that we want. The hard thing about data is, if you're investing in funds or even direct positions, you're extracting all of the unstructured data and trying to make it structured so you can use it, but you're also creating your own variables along the way. All of that assumes you're looking backwards based on what happened. You're saying, "Well, it'd be really great if we had a variable for this thing." That's not something you would have known at the time.

In this process, we're trying to build out and expand our data. We're also trying to be careful not to overfit, and where we can, to be very careful about what information we actually would have known, so that as we're building algorithms to help support decisions, they're as pure as they can be. I think for us, we'll continue to systematize a lot of our investment logic. Then we'll see where it goes from there. I think a year from now, things will look a lot different than they do today. You never know. I think wrangling all of our data is going to take us some time. I suspect that will take us the next year or two.

Giordano: That makes sense. How big is your team, your investment team?

Steed: At any point in time, we have the investment management team, which is about 13 or 14 people. We have an operations team, which also includes two data scientists, which is obviously important for what we're doing. Then we have three people in legal. Maybe 20 to 22 people at any point in time.

Giordano: How many are trained in this?

Steed: Right now, it's myself and the two data scientists. The team is coming up the curve pretty quickly.

Giordano: Would you say it's added a certain amount to your fund or is it just part of the whole?

Steed: No, tough to say. It's a good question in terms of being able to quantify the value add. Right now, the way that decisions work at my shop is you have to be very specific about what your recommendation is. If it's a manager, you have to say, "Look, this manager is going to outperform its benchmark by X% within this period," one year or two years. You have to say how confident you are that that's going to happen, 60%, 70%. We catalog that, and we go back and review everyone's decisions to see, for all of those decisions in which you were 70% confident, were you 70% right?

That's really important to us because it does a few things. It's a more egalitarian approach to decision making. Whether you use a model to arrive at your conclusion or just your gut, it doesn't really matter, although you have to support the rationale one way or the other. What we have right now is we're tracking everybody. Some of them are more qualitative in their decision making. Some of them are more systematic, using models to make a forecast, and we're tracking that. It's going to take us, I'd say, a year or two longer before we have enough data to start to see where our models are helping us and when our human intuition is better.
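The decision log Steed describes, stated confidence versus realized accuracy, can be kept with a few lines of Python. The entries below are invented examples; the mechanics are the point.

```python
# Minimal calibration tracker: for each call, log the stated confidence
# and whether it came true, then compare per confidence bucket.
from collections import defaultdict

decisions = [  # (stated confidence, outcome: True = the call was right)
    (0.7, True), (0.7, True), (0.7, False), (0.7, True),
    (0.6, False), (0.6, True),
]

buckets = defaultdict(list)
for conf, correct in decisions:
    buckets[conf].append(correct)

for conf, outcomes in sorted(buckets.items()):
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {conf:.0%} -> actually right {hit_rate:.0%} "
          f"({len(outcomes)} calls)")
```

Here the 70%-confident calls were right 75% of the time, slightly underconfident; run over years of decisions, this is how "were you 70% right when you said 70%?" gets answered, for humans and models alike.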

It's tough to quantify right now, but at least at the start, I thought we had a pretty good baseline, and it's fair. If people are allergic to models, that's okay. If people are allergic to others' gut instincts, that's okay. Our goal is to get to truth. In order to do that, you need to de-bias that conversation and just track who's being more accurate right now. We need to at least be able to say that, and how much more accurate, say, an algorithm is versus a human. That's an important first step. Then we can get to what we think the impact is on our AUM. We haven't gotten there yet. I don't think we have enough sample size to make that conclusion.

Giordano: Thank you, Mark Steed, for joining us today. It's a changing world out there and very happy to hear from you. Thank you for helping us visualize the future. Are there any trends that we're forgetting about or any key takeaways here?

Steed: No, the good thing is I think the industry is well-covered at this point. There's lots of interest in it. I think, look, with the advent of Anthropic and ChatGPT and some of these other tools, they're very robust, but I think people just have to be careful that they still have weaknesses. I think we're all aware of that, but we do have to fight against that evolutionary bias to just be able to say, hey, if we can explain something, we can control it and models make us feel like we can explain things.

There is a bias to just set it and forget it and follow these models because it just makes us feel comfortable, but I think we all have to be pretty suspicious. I'm optimistic. I think these changes will be good for everybody. I'm optimistic that we'll create the safeguards that we have to, but it's very exciting. There are some risks, but I'm optimistic for the future for sure.

Giordano: Great. Good to hear it. Thanks again.

Steed: No problem.