The Wall Street Journal recently rattled off a variety of statistics regarding the performance of the replacement officials. The metrics chosen, however, are grossly superficial, inherently misleading, and in one instance factually incorrect.
(Please, Florio, tell us what you really think.)
First, the “audit” reviews the issue of replay review. Coaches have thrown the red flag 29 times, an 11 percent increase over last year. But only 31 percent of the calls have been overturned — down from 52 percent in 2011 and 42 percent in 2010. The article then claims that a “challenge call sends the play to an upstairs booth, where it’s reviewed by an official who isn’t a replacement.”
With that statement alone, the entire article should be ignored. Anyone who watches any amount of NFL football knows that, when the red challenge flag is thrown, the decision is made on the field by a referee. And since the referee is now a replacement, the “audit” ignores the reality that the percentage of calls overturned could be skewed by the fact that the replacement referee who reviews the call may be getting it wrong, too. (The “audit” also offers no stats at all regarding replays initiated by the booth, a procedure that applies in the final two minutes of each half, after any scoring play, and after any turnover.)
Second, the “audit” says that, on average, the games are six minutes slower. While explaining that the difference is not “apocalyptic” (what a relief), it ignores the reality that one or two games that last a ridiculously long time make the entire system look bad. That’s happening this year, all too often.
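To see how an average can hide the problem, here is a minimal sketch using invented game lengths (in minutes). None of these figures come from the Journal or the league; they are assumptions chosen so that the average gap works out to six minutes while the typical game doesn't change at all.

```python
from statistics import mean, median

# Hypothetical game lengths in minutes; invented for illustration, not real NFL data.
last_year = [185, 186, 187, 188, 189, 190, 191, 192]
this_year = [185, 186, 187, 188, 189, 190, 210, 221]  # two marathon games

# The average moves by six minutes...
print("Average difference:", mean(this_year) - mean(last_year))

# ...but the typical (median) game is no longer at all...
print("Median difference:", median(this_year) - median(last_year))

# ...because the whole gap comes from the longest games.
print("Longest game difference:", max(this_year) - max(last_year))
```

In this made-up sample, the six-minute average says nothing about the two games that ran far past the rest, and those are the games people remember.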
Third, the “audit” looks at the number of flags thrown. Through two weeks, the yellow flag has flown 470 times. It’s a difference of 11 from last year. (The article doesn’t say whether it’s 11 more or 11 fewer.) Throwing around terms like “consistency,” this portion of the “audit” creates the false impression that, if the total number of flags is the same, the replacements must be doing an equivalent job.
Anyone with a functioning brain should be offended by that one. Comparing raw data on the number of penalties sheds no light on whether the right decisions are being made. The “audit” tells us nothing about flags that were mistakenly thrown — or about penalties that mistakenly weren’t called.
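The point can be shown with a quick sketch. The two crews and every number below are hypothetical, and the `CrewTally` class is just an illustrative helper, not anything the league or the Journal uses. Both crews throw the same number of flags, yet one makes far more mistakes.

```python
from dataclasses import dataclass


@dataclass
class CrewTally:
    correct_flags: int  # flags thrown on real fouls
    phantom_flags: int  # flags thrown when no foul occurred
    missed_fouls: int   # real fouls that drew no flag

    @property
    def flags_thrown(self) -> int:
        # This is the only number a raw penalty count can see.
        return self.correct_flags + self.phantom_flags

    @property
    def errors(self) -> int:
        # Wrong calls plus missed calls; invisible in the raw count.
        return self.phantom_flags + self.missed_fouls


# Invented numbers for illustration only; not real NFL data.
crew_a = CrewTally(correct_flags=14, phantom_flags=1, missed_fouls=2)
crew_b = CrewTally(correct_flags=8, phantom_flags=7, missed_fouls=9)

for name, crew in (("A", crew_a), ("B", crew_b)):
    print(f"Crew {name}: {crew.flags_thrown} flags, {crew.errors} errors")
```

Under these assumed numbers, a flag count would call the two crews identical. Only a play-by-play review would show the difference.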
Fourth, the “audit” claims that the replacements “punish,” with an increase in pass interference and holding calls, along with a spike in personal fouls from four to 21. (It’s a little hard to believe that through two weeks of the 2011 season there were only four personal fouls called.) This glosses over the fact that the replacements aren’t calling illegal contact with receivers (a rule that doesn’t exist at lower levels of the sport), and that defensive backs apparently are continuing the forbidden pushing, pulling, and/or shoving after the ball is in the air — which is when pass interference can be called.
Also, the increase in holding penalties could be a product of the fact that holding can be called on every play. Astute officials call holding only when it occurs in the vicinity of the ball carrier; stuff away from the play doesn’t matter, and rarely is called by the regular officials. Replacements may be calling it beyond the area of the man with the football.
Fifth, the “audit” claims that the replacements are ignoring certain types of penalties, like illegal shift and illegal man downfield. Again, this kind of basic “analysis” tells us nothing. Maybe teams are engaged in fewer illegal shifts. Or maybe they’re sending fewer illegal linemen down the field.
All told, it was a useless exercise that makes the replacements look better than they really are. And it’s hard not to wonder whether the Wall Street Journal came up with the various categories on its own, or whether the NFL instigated the exercise. (Indeed, unless someone at the Journal did all of the counting of penalties by hand, the raw data could have come from only one place.)
Here are the two primary truths that this trumped-up “audit” ignores. First, the only way to know how the replacements are performing is to review the play-by-play grades that are generated by the league based on a review of coaching tape, and to compare those grades to the grades from the first two weeks of 2011. After the first week of games, during which the replacements looked the part, acted the part, and sounded the part, multiple sources told PFT that officiating errors averaged more than 30 per game. In contrast, the regular officials averaged mistakes in the single digits.
Second, the replacement officials in Week Two failed the eyeball test. So regardless of meaningless stats like those in the Journal article or relevant numbers like the actual grades generated by the league office, when the product coming through the TV looks and sounds different, people notice.
Then again, people still don’t care. And until the NFL feels a pinch to the bottom line, the NFL won’t care, either.