David Gerth: Kelly Ranking the FanGraphs Top 100 Prospects

This article was submitted to the FanGraphs Community Blog, but not picked up yet.

With the Top 100 Prospects list out here at FanGraphs and the SABR Analytics Conference coming up soon, I thought it would be exciting to release a Top Prospects list using the methodology I presented at last year’s SABR conference. For those who were not in attendance (I assume this is most readers, and I recommend you go this year), my presentation was on how to use a version of the Kelly Criterion to rate prospects in the draft. The Kelly Criterion is a formula that takes in the probabilities of certain events and their corresponding payoffs and returns the fraction of a bankroll to risk on the wager in order to maximize the geometric growth rate of that bankroll. This is an important concept in investing and advantage gambling, and there is a lot of literature on the subject.

Here, the setting is a bet on a five-sided weighted die that represents a player’s possible outcomes, with weights that represent how likely we are to roll each outcome. On the Top 100 list, in addition to the scouting report, each player has a distribution of outcomes with a probability attached to each event. The five categories are “Bust”, 40/45 FV, 50/55 FV, 60/65 FV, and 70+ FV, and we will let each of these categories be a face of the die.

The payoffs are a bit tricky, and there are multiple ways of setting them. I opted for a simple payoff structure that assumes a standard $9M/WAR. The payoff for a given FV is then WAR/year × $M/WAR × 6, where the 6 stands for the six cost-controlled years a team has.

From the Prospect Week Primer earlier in the week, we have the following table.

FV	WAR/Year	Payoff ($M)
40/45	1.0	40.5
50/55	2.5	135
60/65	5.0	270
70+	7.0	378

Therefore, for each player, we get the Kelly Function:

f(x) = p_Bust*log(1 - x) + p_40/45*log(1 + 40.5x) + p_50/55*log(1 + 135x) + p_60/65*log(1 + 270x) + p_70+*log(1 + 378x)

Here x is the percentage of bankroll staked, and we maximize f by finding the x that gives the highest value.

Ironically, while this is a math-based exercise, I had to use the eye test to gauge the correct probabilities for each outcome.
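As a rough illustration of the maximization, here is a minimal Python sketch. The payoffs are the ones from the table above; the probabilities in `example_probs` and `example_bust` are hypothetical placeholders (not a real player's distribution from the list), and the ternary search is just one simple way to find the maximum of a concave function, not necessarily how the rankings below were computed.

```python
import math

# Payoffs per unit staked, taken from the Payoff column of the table above.
PAYOFFS = {"40/45": 40.5, "50/55": 135.0, "60/65": 270.0, "70+": 378.0}


def kelly_growth(x, p_bust, probs):
    """Expected log growth f(x) when staking fraction x of the bankroll."""
    total = p_bust * math.log(1.0 - x)  # a bust loses the whole stake
    for tier, p in probs.items():
        total += p * math.log(1.0 + PAYOFFS[tier] * x)
    return total


def kelly_fraction(p_bust, probs, tol=1e-9):
    """Maximize the concave function f on [0, 1) with a ternary search."""
    lo, hi = 0.0, 1.0 - 1e-12  # stay below 1 so log(1 - x) is defined
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if kelly_growth(m1, p_bust, probs) < kelly_growth(m2, p_bust, probs):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical distribution for illustration only; it sums to 1 with the bust.
example_bust = 0.15
example_probs = {"40/45": 0.35, "50/55": 0.35, "60/65": 0.10, "70+": 0.05}

x_star = kelly_fraction(example_bust, example_probs)
print(f"Kelly size: {x_star:.3f}")
```

With a 15% bust probability the result lands a little under 0.85, which is the kind of value that shows up in the tables below.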

Here are the top 25 prospects after applying the Kelly Criterion to the given probability distributions and the payoff matrix calculated above.

Name	Pos	Team	Kelly	FG Rank
SPENCER TORKELSON	1B	DET	0.924	5
JULIO RODRIGUEZ	RF	SEA	0.899	4
ADLEY RUTSCHMAN	C	BAL	0.879	1
JEREMY PENA	SS	HOU	0.874	30
BOBBY WITT JR	SS	KCR	0.849	2
RILEY GREENE	RF	DET	0.849	6
GRAYSON RODRIGUEZ	SP	BAL	0.849	3
VIDAL BRUJAN	2B	TBR	0.848	55
AUSTIN MARTIN	CF	MIN	0.848	56
STEVEN KWAN	CF	CLE	0.848	57
SHEA LANGELIERS	C	ATL	0.848	70
GERALDO PERDOMO	SS	ARI	0.848	83
JOSH H SMITH	2B	TEX	0.848	89
ALEK THOMAS	LF	ARI	0.823	23
JOSH JUNG	3B	TEX	0.798	9
GABRIEL MORENO	C	TOR	0.798	10
TRISTON CASAS	1B	BOS	0.798	16
NICK YORKE	2B	BOS	0.798	29
BRYSON STOTT	SS	PHI	0.798	34
TYLER SODERSTROM	1B	OAK	0.798	36
CRISTIAN PACHE	CF	ATL	0.798	72
GREG JONES	CF	TBR	0.798	77
GEORGE KIRBY	SP	SEA	0.798	28
SHANE BAZ	SP	TBR	0.773	11
HENRY DAVIS	C	PIT	0.772	22

You’ll notice that the top risers are the perceived “safe” player types such as Steven Kwan and Geraldo Perdomo, who have a lower-than-average probability of busting. This is one of the consequences of Kelly. For example, if we compare two wagers on which we have the same 5% edge, one a -800 favorite and the other a +650 underdog, Kelly prescribes betting more on the favorite than on the underdog, even though the edge is the same.
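To put numbers on the favorite versus underdog comparison, here is a small Python sketch. It assumes “edge” means expected profit per unit staked, which is my reading of the term here; under that assumption the Kelly fraction for a simple win/lose bet, (b*p - q)/b, reduces to edge divided by the net odds b. The function names are made up for illustration.

```python
def american_to_net_odds(american):
    """Net profit per unit staked for American odds such as -800 or +650."""
    if american < 0:
        return 100.0 / -american
    return american / 100.0


def kelly_binary(american, edge):
    """Kelly fraction for a win/lose bet.

    Assumption: edge is the expected profit per unit staked,
    i.e. edge = p*b - (1 - p), where b is the net odds.
    """
    b = american_to_net_odds(american)
    p = (1.0 + edge) / (1.0 + b)  # win probability implied by that edge
    q = 1.0 - p
    return (b * p - q) / b


favorite = kelly_binary(-800, 0.05)
underdog = kelly_binary(+650, 0.05)
print(f"favorite: {favorite:.3f}, underdog: {underdog:.4f}")
```

Under that definition of edge, the favorite gets a stake of about 40% of bankroll while the underdog gets less than 1%.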

To better separate true ability from just being farther along the development curve, I separated the list into two groups, those whose highest level reached was AA or higher, and then those who had not reached AA yet. I then computed the average Kelly staking size for the two groups, recorded in the table below.

Group	Average Kelly Size
AA and above	0.720
Below AA	0.620

Then, I subtracted the average Kelly size of each player’s group from that player’s Kelly size to get a “level adjusted” Kelly size. The top 25 from this method are below, with the full list in the GitHub link at the end of the article.

Name	Pos	Team	Highest Level	Kelly	Level Adjusted Kelly	FG Rank
SPENCER TORKELSON	1B	DET	AAA	0.924	0.204	5
JULIO RODRIGUEZ	RF	SEA	AA	0.899	0.179	4
NICK YORKE	2B	BOS	A+	0.798	0.178	29
TYLER SODERSTROM	1B	OAK	A	0.798	0.178	36
ADLEY RUTSCHMAN	C	BAL	AAA	0.879	0.159	1
JEREMY PENA	SS	HOU	AAA	0.874	0.154	30
HENRY DAVIS	C	PIT	A+	0.772	0.152	22
BOBBY WITT JR	SS	KCR	AAA	0.849	0.129	2
RILEY GREENE	RF	DET	AAA	0.849	0.129	6
GRAYSON RODRIGUEZ	SP	BAL	AA	0.849	0.129	3
FRANCISCO ALVAREZ	C	NYM	A+	0.748	0.128	7
ANTHONY VOLPE	SS	NYY	A+	0.748	0.128	12
CORBIN CARROLL	CF	ARI	A+	0.748	0.128	14
VIDAL BRUJAN	2B	TBR	MLB	0.848	0.128	55
AUSTIN MARTIN	CF	MIN	AA	0.848	0.128	56
STEVEN KWAN	CF	CLE	AAA	0.848	0.128	57
SHEA LANGELIERS	C	ATL	AAA	0.848	0.128	70
GERALDO PERDOMO	SS	ARI	MLB	0.848	0.128	83
JOSH H SMITH	2B	TEX	AA	0.848	0.128	89
JACK LEITER	SP	TEX	R	0.748	0.128	24
PATRICK BAILEY	C	SFG	A+	0.747	0.127	76
ALEK THOMAS	LF	ARI	AAA	0.823	0.103	23
JOSH JUNG	3B	TEX	AAA	0.798	0.078	9
GABRIEL MORENO	C	TOR	AAA	0.798	0.078	10
TRISTON CASAS	1B	BOS	AAA	0.798	0.078	16
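The level adjustment in the table above can be sketched in a few lines of Python. The group averages are the two values reported earlier; the assumption that AA, AAA, and MLB make up the “AA and above” group, and the function names, are mine.

```python
# Group averages as reported in the "Average Kelly Size" table.
GROUP_AVERAGE = {"AA and above": 0.720, "Below AA": 0.620}

# Assumption: these are the levels that count as "AA and above".
UPPER_LEVELS = {"AA", "AAA", "MLB"}


def level_group(highest_level):
    """Map a player's highest level reached to one of the two groups."""
    return "AA and above" if highest_level in UPPER_LEVELS else "Below AA"


def level_adjusted_kelly(kelly, highest_level):
    """Player's Kelly size minus the average Kelly size of their group."""
    return round(kelly - GROUP_AVERAGE[level_group(highest_level)], 3)


# A few rows copied from the table above: (name, highest level, Kelly).
rows = [
    ("SPENCER TORKELSON", "AAA", 0.924),
    ("NICK YORKE", "A+", 0.798),
    ("JACK LEITER", "R", 0.748),
]

for name, level, kelly in rows:
    print(name, level, level_adjusted_kelly(kelly, level))
```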

After applying the change, we have mostly the same names, though with a slightly different ordering near the top of the list and a few new players. One noticeable thing is that there are only two pitching prospects in this top 25. This is roughly in line with the original top 100 list, but it’s worth noting that pitchers have a relatively riskier profile due to injury risk, and it is therefore hard to rank a pitching prospect highly.

To wrap up, Kelly bet sizing favors players who have a low risk of busting, which results in “safer” prospects being rated higher than toolsier players with a higher risk of not making the major leagues. In fact, there is a one-to-one correspondence between bust percentage and Kelly size. This is why Torkelson was the #1 prospect; he has the lowest chance of being a bust. There are some implications here for farm system building. One is that it is important to have significant depth in a farm system. It is not guaranteed that both Adley Rutschman and Grayson Rodriguez become All-Stars, and in fact it is unlikely. Relying on the best possible outcome happening is not a sustainable way to build a team.
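The link between bust percentage and Kelly size can be checked numerically. The sketch below is only an illustration: it reuses the payoffs from the table earlier in the article, and `SHAPE`, the split of the non-bust probability across the four FV tiers, is a hypothetical choice. Because the payoffs are so large relative to the stake, the optimal size should sit just under one minus the bust probability.

```python
PAYOFFS = [40.5, 135.0, 270.0, 378.0]  # Payoff column of the table in this article


def growth_slope(x, p_bust, probs):
    """Derivative f'(x) of the expected log growth."""
    slope = -p_bust / (1.0 - x)
    for p, b in zip(probs, PAYOFFS):
        slope += p * b / (1.0 + b * x)
    return slope


def kelly_fraction(p_bust, probs, iterations=60):
    """Bisection on f'(x); f is concave, so f' crosses zero at most once."""
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if growth_slope(mid, p_bust, probs) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical split of the non-bust probability across the four FV tiers.
SHAPE = [0.45, 0.35, 0.15, 0.05]

results = []
for p_bust in (0.10, 0.15, 0.20, 0.25):
    probs = [(1.0 - p_bust) * s for s in SHAPE]
    x = kelly_fraction(p_bust, probs)
    results.append((p_bust, x))
    print(f"bust={p_bust:.2f}  kelly={x:.3f}  1-bust={1 - p_bust:.2f}")
```

In this setup the Kelly size falls as the bust probability rises and stays within about a percentage point of one minus the bust probability.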

From a draft standpoint, it makes sense to allocate most of the draft pool to players with a low chance of busting. One reason why Nick Madrigal is a great draft pick, even though he may not have the best career, is that he had a low chance of busting because his hit tool was so strong. His best-case scenario is not as valuable as others’, but that’s OK because we are getting a quality major leaguer for little money. This isn’t to say that drafting high schoolers or high-risk players should not happen. In fact, a team with good coordination between its draft department and player development (PD) may have an advantage over a team that just uses analytics. Such a team can identify high schoolers who look risky from a model standpoint but have traits that PD thinks it can develop well. Those players carry a lower risk than the model says, which makes them more valuable.

David Gerth

Tuesday, March 29, 2022
