I'd call it a reach... on error
Some time around mid-season 2019, I was talking with the man, myth, and College Park legend Michael "Mustache Mike" Duberstein. We were talking about Hanser Alberto, and Mike says to me that he is pretty sure that Hanser Alberto leads the league in reaching on error. Naturally, I dismissed the notion and pointed to the small sample size and recency bias, as I pulled up Baseball-Reference. As I would find out, none of the major sports sites keep public-facing leaderboards for reaching on error.
Then, a few months later, I happened to go to the three games in an Orioles series at Fenway Park. In the first game, it just so happened that Hanser Alberto reached on error against Nathan Eovaldi. Not only that, but earlier in the week, during a late-night work session, I was streaming a West Coast Orioles game, where, sure enough, Hanser reached on an error. The next day, Mike happened to meet me for the game, and I told him that I was convinced, and that I was eventually going to find the data. The next day, as I had at that point come to expect, Hanser Alberto reached on error again, this time against Matt Barnes. I texted Mike and promised him that I would eventually get to the bottom of it.
Over the last couple of weeks, I've been building out and validating a PostgreSQL database using the mlbgameday package for the R language. For the uninitiated, these databases carry basic information about every quantifiable part of an MLB game, from the pitch, to the at-bat, to the inning, to the game, and codify it. It's the platform that drives, for instance, the MLB At Bat app and other third-party game feeds. It turns out it's really good for sabermetrics dorks as well.
Given this wealth of data and, most importantly, newfound confidence in its completeness, I could finally set out to discover the truth about our pal, Hanser.
Hanser Alberto: reachin' machine
The first thing that I wanted to discover was, well, how often does Hanser Alberto get on base because the fielder made an error? The answer, unsurprisingly, is a lot. He reached on error 11 times through 550 plate appearances in 2019, tied for second with one other guy. He had 160 hits, 125 of them singles, and, for reference, was walked 16 times and hit by a pitch 4 times. My point is this: Hanser Alberto's reaches on error come to almost 10% of his singles total, and to more than half of his walks and hit-by-pitches combined, the traditional non-hit ways of getting to first base. These facts are impressive alone but even more so when you consider his performance against the field:
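For anyone who wants to check my arithmetic, here is a minimal sketch of those ratios. The counts are the ones quoted above; the variable names are just mine for illustration.

```python
# Counting stats quoted in the post for Hanser Alberto's 2019 season.
plate_appearances = 550
singles = 125
walks = 16
hit_by_pitch = 4
reached_on_error = 11

# RoE measured against his singles and against his "free pass" total (BB + HBP).
roe_vs_singles = reached_on_error / singles
roe_vs_free_passes = reached_on_error / (walks + hit_by_pitch)

print(f"RoE as a share of singles:  {roe_vs_singles:.1%}")
print(f"RoE relative to BB + HBP:   {roe_vs_free_passes:.0%}")
```

So the 11 reaches on error work out to a little under 9% of his singles and 55% of his walks and hit-by-pitches.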
The orange one is our man, Hanser. As is evident, he's outpacing more than 250 other players on his reach-on-error (RoE) count, and trailing only one. But of equal interest is the rate at which he reaches on error, RoE per plate appearance. By that measure, it's Hanser vs. all, and he's crushing it. He reaches in 2.00% of his 550 PAs, and the next player with at least 500 PAs in 2019 is Elvis Andrus, whose 11 RoE in 648 PAs make for a 1.70% rate. His clip is morbid, which we can see by looking again at all the players with 300 or more PAs:
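The rate comparison is simple enough to sketch in a few lines. This is only an illustration with three players, not the query I ran against the database: the `roe_leaderboard` helper is a made-up name, and the RoE count of 7 for Curtis Granderson is my back-calculation from a 1.93% rate in 363 PAs rather than a number pulled directly from the data.

```python
# (name, plate appearances, reaches on error) for a few 2019 hitters.
# Granderson's count of 7 is inferred from his rate and PA total (an assumption).
players = [
    ("Hanser Alberto", 550, 11),
    ("Elvis Andrus", 648, 11),
    ("Curtis Granderson", 363, 7),
]


def roe_leaderboard(rows, min_pa):
    """Return (name, RoE per PA) pairs for hitters with at least min_pa PAs,
    sorted from the highest rate to the lowest."""
    qualified = [(name, roe / pa) for name, pa, roe in rows if pa >= min_pa]
    return sorted(qualified, key=lambda item: item[1], reverse=True)


for threshold in (500, 300):
    print(f"Minimum {threshold} PA")
    for name, rate in roe_leaderboard(players, threshold):
        print(f"  {name:<18} {rate:.2%}")
```

Raising or lowering `min_pa` is the whole trick: the threshold decides who gets to be compared with Hanser in the first place.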
Again, Hanser is in orange, and his rate shares an otherwise very lonely bin with Curtis Granderson's 1.93% rate (in 363 PAs).
As one might imagine, there is a definite interplay between getting on base and reaching on error, which we can see in another chart, one that honestly makes Hanser look like a freak of nature:
Here we see 2019 batters from the same dataset as the last two plots, and again Hanser stands out. He's not a crazy on-base percentage guy, by any means, but I think that this graph points to the notion that Hanser Alberto's value to a team is fundamentally different from that of other players at the same OBP, because his RoE count sets him apart from them.
This can be looked at in terms of an "augmented" on-base rate. Consider the fact that, in the statistics we typically look at, RoEs are simply not counted. If you reach on error, you're on base, but you get no credit for it in the OBP calculation. So we can look at the difference between traditional on-base events per plate appearance and on-base events including RoE per PA:
How to think about this graph? Well you can't negatively reach on error, so you can only be above and to the left of this plot. The more above-and-to-the-left you are, the more hidden value you might have via RoEs. Of course, Hanser Alberto is a RoE legend. But you can also divine from this plot where Alberto's effective OBP is: around 20 points higher than anyone realizes. This is really cool.
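Here is a rough sketch of the two rates for Hanser, using the counting stats quoted earlier (160 hits, 16 walks, 4 hit-by-pitches, 11 RoE, 550 PAs). The `on_base_rate` function is a name I made up for illustration, and it deliberately uses plate appearances as the denominator rather than the official OBP denominator.

```python
def on_base_rate(hits, walks, hit_by_pitch, plate_appearances, reached_on_error=0):
    """Times on base per plate appearance.

    With reached_on_error left at 0 this is the traditional rate;
    passing the RoE count gives the "augmented" rate.
    """
    times_on_base = hits + walks + hit_by_pitch + reached_on_error
    return times_on_base / plate_appearances


# Hanser Alberto, 2019, from the numbers quoted in the post.
traditional = on_base_rate(160, 16, 4, 550)
augmented = on_base_rate(160, 16, 4, 550, reached_on_error=11)

# A "point" is one thousandth, as in ".330 vs. .350 is 20 points".
hidden_points = (augmented - traditional) * 1000

print(f"traditional: {traditional:.3f}")
print(f"augmented:   {augmented:.3f}")
print(f"hidden:      {hidden_points:.0f} points")
```

By this accounting he goes from roughly .327 to roughly .347, which is where the 20 points come from.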
All of this, to me, is insane. In some sense, Hanser Alberto is getting on base in a way that most accounting methods can't see, 2% of the times he steps up to the plate! How is nobody counting this!? It's, as said earlier, about 20 points of OBP that's invisible, pushing him from .330 and slightly below average to .350, a considerably above-average mark:
What's really nuts is how many runs and wins he brings to the table. Using FanGraphs's wOBA page, we can stitch together an estimate of how many runs and how many wins Hanser produced by RoE this season. If we assume that reaching on error is equivalent to drawing a walk, we find that Hanser's RoEs generated about 4.77 runs above average and, therefore, about 0.46 wins above replacement. But reaching on error, unlike walking, is dynamic: batters can advance freely, often farther than they would on a single. If we weight instead using the run value of a single, we find that his RoEs were worth 6.14 runs, or 0.60 wins above replacement. That's a solid 0.6 WAR that might be hiding in plain sight, purely from Hanser's ability to induce defensive chaos.
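The last step, going from runs to wins, is just a division. The sketch below starts from the two run totals quoted above and assumes roughly 10.3 runs per win; that constant is my assumption for illustration (the post doesn't state the exact figure used), and `runs_to_wins` is a made-up helper name.

```python
ROE_COUNT = 11

# Assumed conversion factor (not stated in the post): about 10.3 runs per win.
ASSUMED_RUNS_PER_WIN = 10.3

# Run totals quoted in the post for the two weighting choices.
runs_if_weighted_like_walk = 4.77
runs_if_weighted_like_single = 6.14


def runs_to_wins(runs, runs_per_win=ASSUMED_RUNS_PER_WIN):
    """Convert a run total into wins using a flat runs-per-win factor."""
    return runs / runs_per_win


for label, runs in [
    ("walk weight", runs_if_weighted_like_walk),
    ("single weight", runs_if_weighted_like_single),
]:
    per_roe = runs / ROE_COUNT  # implied run value of a single reach on error
    print(f"{label:<14} {runs:.2f} runs, {per_roe:.3f} per RoE, {runs_to_wins(runs):.2f} wins")
```

Under that assumption the two totals come out to about 0.46 and 0.60 wins, matching the figures above; a different runs-per-win value would shift both proportionally.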
In all, Hanser is quite an outlier. The next natural question, to me, is how does he do it? Take a look at his Baseball Savant page. You'll find that he's almost comically bad at hitting the ball hard (in the first percentile for average exit velocity and percentage of hard-hit balls), mediocre in his expected wOBA and expected slugging percentage (20th percentile or just below), then 88th percentile in expected batting average, and, last but far from least, 51st percentile in sprint speed.
I think these metrics paint the picture of a player who is just a different type of baseball player than almost everyone else in the league. Remember that xwOBA and xSLG are based on the batted ball profile of a player. Hanser lies in a region of the baseball universe that, it seems, might just be untested. It's always possible that this season was a fluke, that he'll regress to the mean. But one can't help but think that this guy might actually have a unique set of skills. I'd point to another page on Savant to argue that this is in fact the case: his spray chart against the shift and without it.
Hanser, we saw, is not a fast guy. Instead, to generate hits (and generate errors), he sprays the ball everywhere, to either field. The only place he doesn't seem to hit the ball is wherever the defense is. Is this perhaps a glance inside the mind of Mike Elias and Sig Mejdal? Eh, maybe. But it does show us that a type of player we don't see too often in the modern MLB can secretly accrue quite a bit of value and be, as Mike Duberstein said to me as we watched him reach on error yet again, one of the most fun players to watch at the very same time.
this is the hard part... ↩︎
Note that I use on-base rate, i.e. the number of times on base divided by plate appearances, not on-base percentage, which has a specialized denominator that doesn't compare well to other per-PA rate stats, especially ones that include outlier events like RoEs. ↩︎
I guess you could count some TOOTBLANs as negative reaches-on-error? ↩︎