As someone who does a good amount of programming and math, but also spends a lot of time scouting players in person, I think a lot about how the two skills interact. The best mental model I have come up with so far is to think of each group as a base model in a stacking ensemble, a common technique in machine learning that answers the question: “Given multiple machine learning models that are skillful on a problem, but in different ways, how do you choose which model to use (trust)?” (Machine Learning Mastery). I think this best sums up the question at hand.
For stacking to work best, you want to
combine models that are individually skillful but whose predictions are at least
somewhat uncorrelated with each other. In this framework, the two models we have are the
scouting reports and a statistical model. Both have predictive power, but
scouts tend to be more optimistic on toolsy players, whereas models tend to prefer
safer players. A stacking algorithm finds value in both and figures out what
weight to give each model. Even if a model has a better overall track record of
picking major leaguers than a scout, if the model has blind spots that the
scout is able to cover, stacking will pick up on that and return a projection
that accounts for those blind spots. It is a mutually beneficial relationship,
and one I think about when scouting live. If I know a model is good at projecting
tool X but struggles with tool Y, I will spend my time at the park prioritizing
tool Y so that I can provide value. I know that if a proper stacking
model is in place, both opinions will be weighted appropriately.
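For anyone curious what this looks like in code, here is a minimal sketch of a stacking ensemble in scikit-learn, in the spirit of the Machine Learning Mastery article linked under Further Reading. Everything in it is an assumption for illustration: the data is synthetic, and the two base models (a random forest standing in for the scout's eye, a ridge regression for the stats projection) are just stand-ins.

```python
# A minimal stacking sketch with scikit-learn. The synthetic data and the
# two base models are stand-ins: a flexible, high-variance model for the
# scout's eye and a conservative linear model for the stats projection.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic "players": features in, future value out.
X, y = make_regression(n_samples=500, n_features=10, noise=15.0, random_state=0)

# Two skillful-but-different base models, like a scout and a stats model.
base_models = [
    ("toolsy_eye", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("safe_stats", Ridge(alpha=1.0)),
]

# The meta-learner never sees the raw features; it is trained on
# out-of-fold predictions from the base models, so it learns how much
# weight each one deserves rather than memorizing their training error.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=LinearRegression(),
    cv=5,
)

print("stacked R^2:", cross_val_score(stack, X, y, cv=5).mean())
```

The key design choice is that the meta-learner is fit on out-of-fold predictions, so the weight each base model earns reflects how it generalizes, not how well it fits its own training data.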
It's not groundbreaking or controversial that eye-test
opinions and model predictions should both be listened to. Where I think this
conversation at times loses nuance is in the default assumption that
we should weigh each group equally. That is a bad idea, and it shows a lack of
understanding of each group’s skillset. If I know I struggle with a certain
demographic, I do not want my opinion weighed equally with a more skilled
predictor. On the flip side, if I have a history of projecting something well,
I want my opinion weighted more heavily overall. Committing to back-testing, and
to knowing what your scouting, player development, and analytical
groups do well and where they struggle, is an important exercise, even if it is
a difficult one: it makes sure everyone feels validated and produces
the best possible prediction.
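To make the back-testing point concrete, here is a small sketch comparing a default 50/50 average against weights learned from past outcomes. The simulated error profiles (an optimistic, noisy scout; a steadier model that shrinks everyone toward the mean) are assumptions chosen only to mirror the biases described above.

```python
# A back-of-the-envelope back-test: compare a naive 50/50 average of two
# opinion sources against weights learned from past outcomes. Everything
# here is simulated, and the biases are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
n = 1_000
outcome = rng.normal(5.0, 2.0, n)                      # e.g., career value

# Assumed error profiles: the scout is noisy and optimistic on upside,
# the model is steadier but shrinks everyone toward the mean.
scout = outcome + rng.normal(1.0, 1.5, n)
model = 0.7 * outcome + 1.5 + rng.normal(0.0, 0.8, n)

X = np.column_stack([scout, model])
past, future = slice(0, 800), slice(800, n)            # old picks vs. new class

equal_weight = X[future].mean(axis=1)                  # weigh each group equally
meta = LinearRegression().fit(X[past], outcome[past])  # weights from back-testing
stacked = meta.predict(X[future])

print("equal-weight MAE:", mean_absolute_error(outcome[future], equal_weight))
print("learned-stack MAE:", mean_absolute_error(outcome[future], stacked))
print("learned weights (scout, model):", meta.coef_)
```

On simulated data like this, the learned weights should beat the naive average, since the meta-learner can correct each source's bias and lean on whichever one is more reliable. Real back-testing against actual outcomes is the only way to find those weights for real scouting and modeling groups.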
Further Reading:
https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/