Wednesday, June 1, 2022

Research Plans

I wanted to use this entry as an evergreen page for the research I am working on and the topics I am thinking about. Going into job interviews last year, I was not really sure what to expect. I had spent a lot of time working on more abstract ideas and live scouting reports, which are nice, but most lower-level jobs require a high level of technical competence, which I was lacking. So I have shifted away from that stuff and have been spending most of my free time trying to get competent in various machine learning packages (XGBoost, CatBoost, etc.), Bayesian statistics, and programming in Stan. The latter is fun and rewarding but takes a lot of time to get competent with, whereas the machine learning stuff is a little less interesting to me but pretty easy to bootstrap. That being said, I haven't put aside my scouting (you can see various scouting reports here already, though they are more flowery than what a real scouting report looks like), and I am still doing more in-depth research. In no particular order, here is my list of current projects:

1) Bayesian Stuff Model: There's no shortage of Stuff+ models nowadays, but I wanted to try to make something that was fully Bayesian and that also returned an estimated distribution of swing results (given a swing, it will be a swing and miss (S&M), a ground ball (GB), a fly ball (FB), etc.) rather than a single number. I like the distributions more than a unitless number because I think it's more instructive for seeing what happens if you add 1 MPH of velocity to your fastball. A standard Stuff+ model will say something like "Stuff+ went up 5," which isn't that informative in my opinion. The model I am designing says that if your fastball velocity increases by 1 MPH, your swinging strike rate increases by X%, your ground ball rate decreases by Y%, etc. This is done in Stan, which is very time intensive from both a programming and a run-the-model standpoint, and since I want to run it through a Shiny app, it's pretty tricky. This is my white whale project.
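
To make the "add 1 MPH and see how the outcome rates move" idea concrete, here is a minimal Python sketch. It is not the Stan model itself. It assumes a simple multinomial-logit (softmax) form with velocity as the only predictor, and the "posterior draws" are made-up random numbers standing in for what a fitted model would return. Every name and every coefficient value below is hypothetical and chosen only for illustration.

```python
import numpy as np

OUTCOMES = ("swing_and_miss", "ground_ball", "fly_ball")

def outcome_probs(intercepts, slopes, velo_mph, center=93.0):
    """Softmax over outcome logits, one row of probabilities per posterior draw."""
    logits = intercepts + slopes * (velo_mph - center)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    expd = np.exp(logits)
    return expd / expd.sum(axis=-1, keepdims=True)

def velo_bump_summary(intercepts, slopes, velo_mph, bump=1.0):
    """Per-outcome change in probability from adding `bump` MPH.

    Returns {outcome: (mean change, 5th percentile, 95th percentile)}.
    """
    delta = (outcome_probs(intercepts, slopes, velo_mph + bump)
             - outcome_probs(intercepts, slopes, velo_mph))
    lo, hi = np.percentile(delta, [5, 95], axis=0)
    return {name: (delta[:, i].mean(), lo[i], hi[i])
            for i, name in enumerate(OUTCOMES)}

# ASSUMPTION: fake "posterior draws" with invented means and spreads.
# A real analysis would use the draws from the fitted Stan model instead.
rng = np.random.default_rng(0)
n_draws = 2000
intercepts = rng.normal([0.0, 0.9, 0.6], 0.05, size=(n_draws, 3))
slopes = rng.normal([0.0, -0.08, -0.05], 0.02, size=(n_draws, 3))
intercepts[:, 0] = 0.0  # swing and miss is the reference category
slopes[:, 0] = 0.0

summary = velo_bump_summary(intercepts, slopes, velo_mph=94.0)
for name, (mean, lo, hi) in summary.items():
    print(f"{name}: {mean:+.3%} (90% interval {lo:+.3%} to {hi:+.3%})")
```

The point of working draw by draw is that each outcome gets an interval on its change rather than a single number, which is the main thing a fully Bayesian version adds over a point estimate.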

2) Integrating Domestic Amateur and International Draft Portfolios: I really like considering the domestic amateur draft as a subset of your overall amateur pool. For instance, if you draft three risky high schoolers with your first three picks, this doesn't happen in a vacuum. Your entire international class is also high variance, and thus your youth intake for the season is very high risk. Every team has roughly the same allotment of international talent, so maybe it's fine to treat domestic and international separately, but I'm curious to see if there's something here that could inform how to draft domestically given an international pool, and vice versa.
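
A toy way to see the portfolio point: if prospect outcomes are treated as independent (a strong simplifying assumption), their variances add, so the risk of the domestic picks stacks on top of the risk already carried by the international class. The sketch below uses invented standard deviations in arbitrary value units. None of the numbers come from real data, and the function name is hypothetical.

```python
import math

def portfolio_sd(prospect_sds):
    """Standard deviation of total value, ASSUMING independent prospects (variances add)."""
    return math.sqrt(sum(sd ** 2 for sd in prospect_sds))

# Hypothetical per-prospect standard deviations, in arbitrary value units.
international_class = [4.0] * 5   # high-variance international signings
risky_high_schoolers = [5.0] * 3  # three risky prep picks
safer_college_picks = [2.0] * 3   # three lower-variance college picks

sd_with_hs = portfolio_sd(international_class + risky_high_schoolers)
sd_with_college = portfolio_sd(international_class + safer_college_picks)

print(f"intake SD with prep picks:    {sd_with_hs:.2f}")
print(f"intake SD with college picks: {sd_with_college:.2f}")
```

If outcomes within a class are correlated (for example, shared development staff), the covariance terms would need to be added, which would likely make the combined intake even riskier than this independent version suggests.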

