The NFL Combine

            Growing up, watching as much of the Combine as I could (we didn’t have cable), I would always look through the numbers of the top players and try to see where I would measure up. This was the ultimate place to display your athletic wares to the world. The tests would rank you and your peers “objectively” for the first time. No more trying to compare strength of opponents; it was every man versus the timer (hand-timing still allows for some error).

            As a high school and college football player, I loved the idea of going to the NFL Combine. I was never very confident in much that I did, but I knew that if I got to run the 40-yard dash at the NFL Combine, I could make every scout and coach turn to one another with big eyes, as if to say, “Did you just see that?!” I wanted the opportunity to shut Mike Mayock up. To be fair, I’ve never met the guy (and he could be wonderful), but he seems like the ultimate Debbie Downer when he is talking about prospects at the Combine.

            During the Fiesta Bowl, my final game at Stanford (on January 2nd), I sustained a high ankle sprain. I would miss any All-Star games. This may have been a blessing in disguise, as I’m not a big fan of risking injury in games like that (see previous post). It would be a race to heal, rehab, and train in order to get ready for the Combine on February 22nd. A few weeks went past, and it was clear my ankle was not going to be ready. I talked to many people who advised me every way possible: run - it will make the coaches think you’re tough (but your time will be terrible); don’t run - you don’t want to post a terrible time (but the coaches will think you’re soft); do some drills - “if you can do this, why can’t you do all of the drills, young man?”; and so on… I ultimately decided to just bench press and do all the non-physical requirements.


            The decision not to run in my case was a good one. Unfortunately, I still had to deal with the worst parts of the Combine:

1. Medical Evaluations

The first thing you do is go to the hospital. You are asked repeatedly about the injuries you have sustained since EVER. Every injury you’ve ever had is scanned in as many ways as you can think of. They also do a bunch of basic tests that everyone must undergo (blood test, EKG, urine test, etc.). This sucks, but it’s not the worst of the medical evaluations.

The worst is when you have to spend a whole separate day listening to doctors evaluating the scans. This is the portion that people probably refer to as the meat market portion. On this day you’re given a basic physical, you go over your medical history again, and then you’re led into one of the 6 or so different rooms that divide up the NFL doctors. Rather than having one report that will go to all the teams, each team’s doctors must see you for themselves. Yes, you’re right, that does sound terribly inefficient, doesn’t it? As annoying as the process is, I can understand how teams might not feel comfortable drafting someone and paying them lots of money based on the assessment of another team’s doctors.

2. Cognitive Testing

Remember scantron tests from school (SAT or ACT)? If you go to the NFL Combine, you’ll never forget them. I think I spent 6 straight hours filling in scantron answers to questions like, “What would your coach say about your work ethic?” It’s not fun. You know a test is boring when they intentionally put in unrelated questions to ensure that you’re still paying attention.

One of the tests asked, “When was the last time you threw a baseball during one of your football games?”

I was confused, so I got up and quietly asked the person giving the tests what was going on. He then explained that it was included to make sure guys aren’t just filling in all the same answer to finish the test. Nobody else got up to ask a question…

I went into the Combine excited. I left feeling sleep-deprived, grumpy, and thankful that, after 3-4 days of little sleep and horribly boring, monotonous tasks, I didn’t have to try to do the on-field drills that I had previously dreamed about doing.


Odds and Ends

1. The True Value of the NFL Combine

The true value of the Combine for NFL teams comes in the form of medical evaluations and meetings with the players (especially those who have had off-the-field issues).

Last year, in my meeting with the Pittsburgh Steelers, I told them that I believed the Combine was overrated as a tool to measure future success of NFL prospects… and I was willing to back it up. I told Mr. Colbert, the GM of the Steelers, that I would be happy to send him a copy of a paper that I wrote for a class that looked at the Combine as a predictor of future success. This is what I sent him (it’s a little long and boring… it was a school paper): 

2 Replies to “The NFL Combine”

  1. From what I’ve heard about the cognitive testing the NFL uses, it sounds questionable. I think it would be interesting to give a neuropsych battery to a group of players and track them longitudinally to see what, if any, scores predict outcomes. Assessing areas such as processing speed, memory, language, and visuospatial abilities seems much more relevant (again, from what I pick up in the media) and highly reliable and valid.

  2. I read a good portion of your paper. Clearly there are a lot of faulty assumptions drawn from standardized tests. However, I could predict a litany of things from tests that do not correspond directly to the behavior in question. For example, I could give someone the NART, a word reading task, and predict with a good deal of accuracy what their FSIQ is. If, however, FSIQ is measured by the WAIS, it entails no activity that is very similar to word reading at all. Indeed, just giving someone the vocab subsections of the WAIS would generate a score that correlated .80 (1.0 being perfect) with FSIQ. FSIQ, in turn, has been shown to predict many things such as performance in school. In fact, the SATs tend to correlate highly with FSIQ. What is more, John Gottman is famous for being able to predict a couple’s divorce with 94% accuracy based upon whether or not the husband accepts influence from his wife. It seems to me that marriage is a great deal more complex than that.

    If a test is reliable, it’s not a question of whether or not it is measuring something. Rather, the question is what is it measuring? Thus, we have the different forms of validity which highlight a test’s utility (e.g., content, construct, criterion). A test may be highly reliable and valid for measuring depression, but if used for measuring anxiety will lose some of its power. What is more, if that same depression measure is used to measure something more unrelated, say personality, it may lose its validity altogether.

    The issue may not be standardized testing per se, but rather the measures used and subsequent interpretations drawn from them. For example, it doesn’t sound like the Wonderlic is measuring intelligence, really, but novel problem solving, which is not the same thing. Perhaps that would correlate highly with situations where a novel, split-second decision is necessary; but when we engage in habitual patterns of behavior we rely less on conscious processing and more on unconscious/subcortical neurological activity. Driving is the perfect example of this. We generally start driving when we are around 16 and at first it is ALL cortical and conscious. We are hyper aware of everything. Yet, with time and practice the behavior becomes more second nature and virtually everyone can relate to a time when they “zoned out” and suddenly realized they had been driving for a significant period of time without being really “aware”. Subcortical structures, namely the basal ganglia, are particularly active at these times and the body requires less of conscious awareness or what some call the “dynamic core”.

    On the other hand, just because there are some or many false positives does not mean it is not a useful tool. For example, while there are many players with incredibly fast 40 times that did not pan out as players, there are probably no WRs in the modern NFL who ran a 5-second 40 and were successful. Also, one “test” could just as easily be considered an “item” within a larger measure which could generate a score that predicted success. A multiple regression analysis can incorporate different measures for explaining variance, which in turn highlights to what extent you can predict an outcome.
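    To make the multiple regression point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are randomly generated (not real Combine results), and the names forty_time, bench_reps, and career_score are hypothetical stand-ins for two "items" and an outcome. It fits ordinary least squares and compares the share of variance explained (R squared) by each measure alone against both together.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(predictors, outcome):
    """Fit least squares with an intercept; return the share of variance explained."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]  # leading 1.0 is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    coef = solve(xtx, xty)  # normal equations: (X'X) coef = X'y
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot

# Made-up data: 200 hypothetical prospects, two measures, one noisy outcome.
rng = random.Random(0)
forty_time = [rng.gauss(4.6, 0.15) for _ in range(200)]
bench_reps = [rng.gauss(20, 5) for _ in range(200)]
career_score = [
    -10 * (f - 4.6) + 0.2 * (b - 20) + rng.gauss(0, 2)
    for f, b in zip(forty_time, bench_reps)
]

r2_forty = r_squared([forty_time], career_score)
r2_bench = r_squared([bench_reps], career_score)
r2_both = r_squared([forty_time, bench_reps], career_score)
print(f"40 time alone: {r2_forty:.2f}")
print(f"bench alone:   {r2_bench:.2f}")
print(f"both together: {r2_both:.2f}")
```

    On the same sample, the combined model can never explain less variance than either measure alone. Whether the extra variance holds up on new players is a separate question that would need real data.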

    At any rate, as far as intelligence goes, perhaps it would be interesting to measure the more generally agreed on domains of intelligence, which I suggested in my previous post, and see how those correlate with different kinds of performance.

