I've also been seeing a lot of blowback and objections to this study, in many cases on the grounds that naming conventions have changed over the years. From 1953 to 1978, all storms were given female names (e.g., Barbara, Florence, Carol) obviously because hurricanes are very interested in keeping their fingernails well-manicured and enjoy watching Say Yes to the Dress with their girlfriends while drinking skinny margaritas, thus conforming more to female than male stereotypes. I'll just go with that... From 1979 on, they began alternating female and male names (e.g., Gloria, Juan, Kate). Pre-1953, I don't even know what they were doing (e.g., Easy, King, Able)... adjectives? improper nouns?
Anyway, in this study, the researchers had volunteers rate the perceived masculinity/femininity of each name. This masculinity/femininity index (MasFem in their dataset, which you too can download here!) was then thrown into a negative binomial regression with some other shit relating to how strong the hurricane was, and out pop some significant coefficients on the MasFem index and its interactions with how strong the storm is. Declare victory!! Publish paper!!!!!!!!!!!!
But, hold it there, cowboy/cowgirl. What if the effects we are seeing are due to the fact that earlier storms both tended to have female names and also tended to kill more people, even after accounting for their severity, due to the fact that people in the 60s and 70s were dirty hippies, too stoned from their marijuana cigarettes to take cover? Or the technology to issue early warnings wasn't as good. Whatever.
One quick and dirty way to address that concern would be to throw some additional variables into their model that allow for different effects of storm severity pre- and post-1979. I just made an extra indicator variable for whether we were talking about the "early" or "late" hurricanes and re-ran the model, now including my early indicator, its interaction with the two severity variables, and everything else that was in the authors' original model 4 in Table S2. We get this...
[Table of coefficient estimates; columns: Estimate, Std. Error, z value, Pr(>|z|). The rows were not recovered.]
Yeah, so the "significance" of the MasFem index almost entirely disappears under this scenario. Only a ghost of it remains in its interaction with a variable representing the damage caused by the hurricane-- one of those severity-related variables I keep mentioning.
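To make the "early indicator plus interactions" idea concrete, here's a minimal Python sketch of how those extra columns could be built before refitting. The column names (year, min_pressure, damage) and the toy rows are made up for illustration; they are not the names or values in the authors' dataset, and the negative binomial fit itself is not shown.

```python
def add_early_terms(rows, cutoff_year=1979):
    """Return copies of the rows with an 'early' flag and its severity interactions."""
    augmented = []
    for row in rows:
        # 1 for storms before the naming convention changed, 0 otherwise.
        early = 1 if row["year"] < cutoff_year else 0
        new_row = dict(row)
        new_row["early"] = early
        # Interactions let the severity effects differ between the two epochs.
        new_row["early_x_min_pressure"] = early * row["min_pressure"]
        new_row["early_x_damage"] = early * row["damage"]
        augmented.append(new_row)
    return augmented


# Toy rows, invented for illustration only (not real hurricane data).
toy_storms = [
    {"year": 1965, "min_pressure": 950, "damage": 1200},
    {"year": 1985, "min_pressure": 960, "damage": 800},
]
augmented_storms = add_early_terms(toy_storms)
```

For late storms the interaction columns are all zero, so the model falls back to the ordinary severity effects; for early storms the interaction coefficients act as adjustments on top of them.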
This analysis is pretty unsatisfying, though. If you look at the model, it assumes that pre-1979, the effect of wind speed and damage on the number of deaths caused by a hurricane is different from the effect post-1979. There's nothing magical about 1979 that would make us believe that the relationship between wind speed and the number of deaths should suddenly change. The only thing that changed in that year was the naming convention. I'll note that the authors mention that they tried including a linear trend in time in their model but it wasn't significant. But there are a couple of problems with this. First, the time trend should have been interacted with the severity variables. Second, who says the trend is linear? At that point we get into polynomial trends and so forth, and we're starting to run a little low on degrees of freedom.
Instead, let's approach this in an unconventional but more satisfying way that respects the real process as we understand it. The authors are trying to test the hypothesis that, all else being equal, storms with feminine names kill more people than storms with masculine names. Critics point out that the masculinity/femininity of the name is highly correlated with time, and that the time period during which the storm occurred probably does matter. The quick and dirty model with the "early" variable gives some indication that this is true. Also, I'm not showing it here for brevity, but the linear time trend interacted with one of the severity variables was highly significant when I added that into the model instead of the "early" indicator.
If it were indeed the case that the explanatory power of the femininity index is driven entirely by the time period's naming convention, then the estimates obtained for the authors' model using the real data should be roughly equivalent to estimates obtained using the same model applied to a dataset in which the MasFem indices are simulated according to the naming convention of the time but otherwise random.
I simulate alternate versions of history in which each hurricane in the dataset is randomly assigned a new name, the only caveat being that the name it is assigned must be from the correct epoch, i.e. a hurricane from 1965 may be reassigned the name of a hurricane from 1975 but not the name of a hurricane from 1985, as 1965 is pre-1979 and 1985 is post-1979. This creates a new dataset in which MasFem is conditionally independent of the number of deaths given the epoch. Thus, in each of these alternate histories, any detected relationship between MasFem and the number of deaths (a significant non-zero coefficient on MasFem) is only present because the death toll is related to the epoch of the hurricane and the femininity of the name is also related to the epoch of the hurricane. This creates a simulated null distribution against which we can test whether the estimated coefficients from the real data are likely different from what we would expect if the death toll were conditionally independent of MasFem given the epoch and the associated naming convention of the time in which the hurricane took place. If you're familiar with graphical model representations of dependence, we are re-sampling new datasets from a model in which the epoch influences both MasFem and the death toll, with no direct link between the two.
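The renaming step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis: the column names (year, masfem, deaths) and the toy rows are hypothetical, and instead of moving names around it shuffles the MasFem scores directly, which amounts to the same thing since the score travels with the name.

```python
import random


def shuffle_masfem_within_epoch(rows, cutoff_year=1979, rng=None):
    """Return copies of the rows with MasFem scores permuted within each epoch."""
    rng = rng or random.Random()
    # Group row positions by epoch: True = pre-cutoff, False = cutoff or later.
    positions = {True: [], False: []}
    for i, row in enumerate(rows):
        positions[row["year"] < cutoff_year].append(i)

    shuffled = [dict(row) for row in rows]
    for idx in positions.values():
        scores = [rows[i]["masfem"] for i in idx]
        rng.shuffle(scores)  # permute scores among storms of the same epoch only
        for i, score in zip(idx, scores):
            shuffled[i]["masfem"] = score
    return shuffled


# Toy rows, invented for illustration only (not real hurricane data).
toy_storms = [
    {"year": 1960, "masfem": 9.1, "deaths": 12},
    {"year": 1965, "masfem": 8.4, "deaths": 3},
    {"year": 1970, "masfem": 7.7, "deaths": 25},
    {"year": 1980, "masfem": 2.0, "deaths": 5},
    {"year": 1985, "masfem": 9.5, "deaths": 8},
    {"year": 1990, "masfem": 1.5, "deaths": 1},
]
alternate_history = shuffle_masfem_within_epoch(toy_storms, rng=random.Random(0))
```

Repeating this many times and refitting the model on each shuffled dataset would give one set of coefficients per alternate history. Everything except MasFem (year, severity, death toll) stays attached to its original storm.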
From these simulations, we can calculate a kind of empirical p-value that tells us whether the effect the authors find using the original model would be surprisingly large relative to what we should expect if it were true that the relationship between how deadly a hurricane is and how feminine its name is were controlled entirely by the epoch in which the hurricane occurs. The distributions of the three coefficients that form the basis for the conclusion that name femininity is related to the number of deaths are shown below.
Each histogram shows the empirical distribution of the estimated coefficients under the assumption that the relationship between deadliness and femininity only exists because both are related to the time period of the hurricane. The yellow dot shows the estimated coefficient in the real data. Here, we see that the coefficient on femininity is pretty similar to what we would expect under our null hypothesis-- in fact, in 35% of the simulated datasets, the estimated effect of femininity was larger than that discovered in the original paper. That is, if it's true that deadliness is conditionally independent of femininity given epoch, we would expect to estimate an even stronger relationship almost 35% of the time. The other two coefficients, those on femininity interacted with minimum pressure and femininity interacted with total damage, show slightly more promising effects. In each of these cases, the estimated effect in the real data is in roughly the 88th percentile. Again, not especially convincing.
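For what it's worth, the "35% of simulated datasets" and "88th percentile" style of number is just a count. Here's a minimal Python sketch, assuming you already have one simulated coefficient per alternate history; the function name and the toy numbers are made up for illustration and are not results from the analysis. It compares signed values, which is one reasonable reading of "larger".

```python
def share_larger(simulated_coefs, observed_coef):
    """Fraction of simulated coefficients strictly larger than the observed one."""
    larger = sum(1 for coef in simulated_coefs if coef > observed_coef)
    return larger / len(simulated_coefs)


# Toy numbers, invented for illustration only (not the post's results).
toy_simulated = [0.1, 0.2, 0.3, 0.4]
toy_observed = 0.25

empirical_p = share_larger(toy_simulated, toy_observed)
# Approximate percentile of the observed coefficient within the null distribution.
percentile = 100 * (1 - empirical_p)
```

A small share means the real coefficient sits far out in the tail of the simulated null distribution; a share like 35% means the null produces something bigger about a third of the time.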
Taking all of this on its own, I'd happily conclude that we shouldn't start giving menacing names to hurricanes-- like DeathScourgeMonster-- as I've seen suggested around the internet. This analysis, however, is only one piece of evidence among a larger body of evidence presented in the paper. The other studies they present in the paper seem more sound and do support the possibility that people underestimate the danger associated with female-named hurricanes. Taken all together, I'd actually be more modest than most of the statistical witch hunters I've run across on the topic and say that I think the jury is still out on this one. They may be on to something.... maybe.
-- This analysis brought to you by the letter J, for James Johndrow, who helped write this post.