To obtain a ballot you stood in a line devoted to a portion of the alphabet. Last names beginning with A-D, E-K, L-R, and S-Z each had their own line. I stood waiting in my line (L-R) behind 7 or so people, and in the entirety of my time in line not more than a single person entered any of the other lines.
Perhaps the nice volunteer handing out ballots for L-R was just very slow. But another possibility is that the way names were grouped created an imbalance of line lengths. I can think of lots of last names starting with L-R, but fewer for E-K, for example. This could be important because many people may be discouraged from voting if lines are too long.
To test whether or not the groupings they chose were reasonable, I downloaded some of the 2010 census data on names. The Census Bureau has compiled a file of last names and how many times they occur. This file only includes names that have appeared nationwide at least 100 times, so extremely rare names are not represented.
I counted the number of individuals with last names beginning in each of these four groupings, and the proportions of names falling in each grouping are shown in the bar plot. As you can see, there is a clear overrepresentation of people whose last names start with L-R, and an underrepresentation of names starting with S-Z.
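The counting itself is straightforward. Here is a minimal Python sketch of the method, using only the ten most common 2010 surnames in place of the full census file; with so few names the proportions won't match the full-file result, so this just illustrates the grouping step:

```python
# Group surname counts by first letter into the four polling-line
# buckets (A-D, E-K, L-R, S-Z) and report each bucket's proportion.
# The real analysis would read every (name, count) row from the
# census surname file; these ten rows are stand-ins.
from collections import Counter

sample = {"SMITH": 2442977, "JOHNSON": 1932812, "WILLIAMS": 1625252,
          "BROWN": 1437026, "JONES": 1425470, "GARCIA": 1166120,
          "MILLER": 1161437, "DAVIS": 1116357, "RODRIGUEZ": 1094924,
          "MARTINEZ": 1060159}

groups = [("A-D", "ABCD"), ("E-K", "EFGHIJK"),
          ("L-R", "LMNOPQR"), ("S-Z", "STUVWXYZ")]

totals = Counter()
for name, count in sample.items():
    for label, letters in groups:
        if name[0] in letters:
            totals[label] += count
            break

grand = sum(totals.values())
for label, _ in groups:
    print(f"{label}: {totals[label] / grand:.3f}")
```

The full file yields one bar per bucket; the imbalance is visible as soon as the four proportions are printed.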
But these census data are nationwide; I was unable to find data specific to Durham county (North Carolina). It is possible that the distribution of names in Durham differs from the nationwide one, because of the ethnic composition of the area.
For example, in California where the Asian population is very high, I would expect the frequency of names beginning with S-Z to be very high, because there are many Chinese surnames that begin with W, X, Y, and Z. Durham county, and especially my precinct, is a largely black community. If the last names of blacks in Durham tend to begin with L-R more than the national average, this could magnify the already existing bias.
Whether or not this is true is not an easy question. The 2010 census data show that some names are very indicative of ethnicity (see the table reproduced here). For example, if your last name is Yoder, the chance you are white is 98.1%. There are many such names that correlate closely with ethnicity for whites, Asians, and Hispanics. But among blacks the names are not as telling. Besides Washington (89.9%), Jefferson (75.2%), and Booker (65.6%), there is no name in the dataset where the probability of being black, given that you have that name, is over 60%. This is not at all true for the other ethnicities. This means that black names are perhaps more likely to represent the national average than are those of other ethnicities.
Perhaps most interestingly to me, this was the kind of problem that would be easier addressed in an analog fashion than a digital one. If I only had a Durham county phone book, I could simply count the pages of names A-D, E-K, L-R, and S-Z. This would have controlled for geography and demography, and would have probably been faster too!
Most people seem to love the music they grew up with. Not me. I spent junior high, high school, and most of college in the 1990’s, yet I strongly dislike the music of that era. Now this is not a post about whether or not I am right that the music of the 90’s was terrible (it was), but about how to quantify just how much I dislike that music, and maybe learn something about myself in the process. To Science!
I exported a playlist of all the music in my iTunes library, which at the time of export included 11080 songs (excluding classical and spoken recordings). The exported file contains all the information about each song in the library.
I plotted a histogram (shown in pink) where each bar represents the number of songs in my music library that were released that year. There are two major peaks: one in the past 5-10 years, the other around 1960, with a huge valley between them that bottoms out around 1990.
Notice there is another deep trough between 1941 and 1945, corresponding to the drastically reduced output of the recording industry during the Second World War.
I suspected those two big peaks were for different reasons. If we remove all the jazz from the collection (shown in blue), the size of the earlier peak is greatly reduced.
What I found a little surprising was that the real bottom of the trough occurs between 1988 and 1995. This is a bit earlier than I had guessed it would be. Maybe I dislike the late 1980’s and simply didn’t realize it.
If we look at only the jazz songs (shown in red) the peak around 1960 is quite strong, but it also becomes apparent that the peak in the late 1930s is due almost entirely to jazz recordings. In particular this is due to my fondness for stride piano. And apparently I didn’t much care for the jazz in the 1980’s.
1. Music ownership is equivalent to liking. Partly true. I don’t own things I don’t like and actively remove things I don’t like. But I don’t like everything I own equally. I don’t have ratings of every song, but I expect the distribution of music I “love” is a bit more flat, with great music from all eras.
2. There are no underexplored eras. You might say “Well, you just haven’t heard the really awesome music of the 90’s.” Possible, but unlikely. I’m very active about seeking out new music.
3. With the exception of WWII, the number of records released has generally increased over time. In addition to this, many older recordings have never been released in electronic formats. This creates an overrepresentation of recent eras.
4. Years are representative. Automatic music databases often will provide you with the re-release date of music instead of the original release date. This is important because it might otherwise bias the results, particularly if you have a lot of music that was originally released on LP. I am very diligent about making sure that the years of all my songs are correct.
The hardest part of doing this to your own music library is getting the data in the right format. Here is the R script I used to make the plots, preceded by a lot of comments on how to do some find-and-replaces to get the data so R can read it. Happy exploring!
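For anyone who would rather skip the R route, the core of the analysis is just counting songs per year. Here is a small Python sketch of that step; the inline text stands in for a real tab-separated iTunes playlist export, which includes a Year column among many others:

```python
# Count songs per release year from an iTunes playlist export.
# iTunes' "Export Playlist..." writes a tab-separated text file with
# a Year column; this short inline sample stands in for the real file.
import csv
import io
from collections import Counter

sample_export = ("Name\tArtist\tYear\n"
                 "Song A\tSomeone\t1959\n"
                 "Song B\tSomeone\t1959\n"
                 "Song C\tSomeone\t2008\n")

years = Counter()
for row in csv.DictReader(io.StringIO(sample_export), delimiter="\t"):
    if row["Year"]:                      # skip songs with no year tagged
        years[int(row["Year"])] += 1

print(sorted(years.items()))             # → [(1959, 2), (2008, 1)]
```

These per-year counts are exactly what gets fed to the histogram; filtering by genre (e.g. dropping or keeping jazz) is just an extra condition inside the loop.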
How much do you think about meat? Maybe you think a lot about the ethics of meat eating. Perhaps you are highly conscious of the healthiness of different types of meats. Or maybe you just daydream about how delicious meat is. However much you think about meat, it isn’t as much as Josh Miner does, I’d wager.
He has created a 4-way Venn diagram in which he looks at four properties of meat: deliciousness, healthiness, ethicality, and ‘realness’ (how unprocessed it is). Below is his diagram, complete with his examples of each of the 15 possible partitions of those four categories. One example I thought I could improve upon was for his Group #6: a meat that is delicious and real (not highly processed), but unhealthy and very unethical: foie gras! Here are Josh’s figure and examples, reproduced with his permission. Enjoy!
1: A real food that is unpalatable, unhealthy, and immoral (ex. fatty, tough, nasty piece of industrially-produced chicken)
2: An edible food-like substance that is delicious, but unhealthy and immoral (ex. McDonald’s chicken nugget)
3: An edible food-like substance that is unhealthy, unpalatable, but moral (this is a bit of a tough one — pre-cooked, highly-processed skinless hot dog made with local, grassfed beef but that is high in fat and sodium and that for whatever reason just doesn’t taste very good)
4: A real food that is unpalatable, unhealthy, but still moral (ex. the same fatty, tough, nasty piece of chicken from #1, except that the chicken itself was truly ‘free-range’ and raised on a small-scale farm located within a couple hundred miles of where it was purchased)
5: A real food that is unhealthy, but still delicious and moral (ex. same thing as #4, except now it is a nicely-prepared, deep-fried wing or thigh)
6: A real food that is delicious, but unhealthy and immoral (ex. fatty rib-eye steak from industrially-raised beef)
7: A real food that is delicious, good for you, but immoral (ex. a nice, grilled, industrially-raised chicken breast)
8: An edible food-like substance that is delicious and good for you, but immoral (nicely grilled highly-processed, low-fat, low-sodium chicken apple sausage made with industrially-raised chicken)
9: An edible food-like substance that is good for you but tastes bad and is immoral (ex: prison loaf or pink slime — and yes, to all you skeptics, both of these edible food-like substances will give you plenty of health-promoting macro- and micro-nutrients; if you were stranded on a desert island, you’d be lucky to have either of these as sustenance, especially if they were fortified with Vitamin C and could ward off scurvy)
10: An edible food-like substance that is delicious and moral but unhealthy (ex. hot dog from #3 except that this one tastes really good)
11: An edible food-like substance that is delicious, moral and healthy (ex. hot dog from #10 that is low in both fat and sodium — I had about 10 cases of these made for me using the trim from the last steer we bought and still have some of them in my freezer right now; they are a great option for the National School Lunch Program)
12: THE HOLY GRAIL: a real food that is delicious, moral and healthy (ex. just about any pasture-raised, humanely-slaughtered piece of meat that is well-prepared; probably the quintessential example would be something wild that you’ve shot and dressed yourself, like deer — bonus points if it is also a destructive invasive species like feral pigs)
13: A real food that is good for you and moral, but that doesn’t taste good (ex. the chicken from #4, except now it is an over-cooked, under-seasoned chicken breast)
14: A real food that is good for you, but doesn’t taste good and is immoral (ex. chicken from #13, except that the chicken was industrially-raised)
15: An edible food-like substance that is good for you and moral, but doesn’t taste good (ex. highly-processed, unseasoned, low-fat chicken apple sausage from happy, local, free-range chickens)
It is a well-established candy-scientific fact that peanut butter and chocolate are two great tastes that taste great together. But not all combinations are equal. I have long held the opinion that the regular size Reese’s Peanut Butter Cup is superior in tastitude to the miniature size. You might disagree, but maybe you have also noticed a difference.
Why should there be a difference? Maybe the shape plays a role: the miniatures are taller than the regulars, but are barely half the diameter. But could it also be that the chocolate to peanut butter ratio is different for these different sizes? To Science!
I obtained two regular and two miniature Reese’s peanut butter cups and weighed each using a scale I borrowed from a friend. I then carefully separated the chocolate from peanut butter using only my hands and a dull knife, weighing each portion separately. This was easier than you might think, and I believe this experiment could easily be done with kids to encourage curiosity. It involves both dissecting and candy!
Now I suspected there would be a difference from inspecting the nutritional information for the two sizes. For roughly the same serving size, the regular cups have more sodium and protein while the minis have more calories and sugars. Since peanut butter contains salt and protein I thought the regular cups might have more peanut butter than the minis.
Just how much more was pretty surprising: the regular cups are 46% peanut butter by weight, while the minis are only 33% peanut butter. The rest is chocolate. This might explain why some people prefer the regulars and others the minis: I like the creamier, saltier regulars more than the sweet minis.
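The calculation behind those percentages is just the filling’s share of the total weight. A tiny sketch, with made-up masses chosen to land near the reported figures (the actual measured weights aren’t given in the post):

```python
# Peanut-butter fraction by weight: filling mass over total mass.
# The gram values below are hypothetical, picked to illustrate the
# ~46% (regular) and ~33% (mini) splits reported above.
def pb_fraction(pb_grams, chocolate_grams):
    return pb_grams / (pb_grams + chocolate_grams)

print(round(pb_fraction(9.7, 11.3), 2))   # regular-like split → 0.46
print(round(pb_fraction(6.9, 14.1), 2))   # mini-like split → 0.33
```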
An interesting aside: while the regular cups are manufactured in Hershey, PA, the minis are made in Mexico.
Today the Supreme Court ruled unanimously (9-0) that police cannot install a GPS device in a car to track a suspect without first obtaining a warrant to do so. I’m always kind of satisfied when the Court has unanimous decisions; it makes me feel like the issue is well-settled. No disagreement. Done.
Yet it has seemed in recent years that so many decisions are 5-4 splits, that the Supreme Court has become politicized in a way it never was before. But is this really true? Is the High Court any more ‘divided’ than it has been in the recent past? Let’s look at some data.
I downloaded the Supreme Court case record from 1946-2010 from the Supreme Court Database and plotted the proportion of cases decided by majorities of different sizes, by year. If the court were ‘more divided’ now than before, we would expect more cases won by a majority of 5 than in the past. That is shown in purple, and unanimous decisions are in yellow.
Some interesting trends come from this. You can see that in the 90s there seemed to be quite a lot of cases decided unanimously. And since around 1990 or so there seem to be fewer cases decided by a majority of 6 (in red) than there had been before. And it does appear that the past 10 years or so have had slightly more decisions with a majority of 5 than the average for the past 60 years.
It is important to note that I haven’t ‘proved’ any of these conclusions in any sense. But by exploring the data in this way we can find patterns that help us to understand where to look more closely. Then we can perform an explicit statistical test. What patterns can you find from exploring these data?
UPDATE: Rethinking things, a ‘divided’ court should still be expected to decide unanimously sometimes: some cases are just very clear. An ‘undivided’ court should be one in which cases are decided by margins other than 5 or 9. If we plot the proportion of cases won by a majority of 5 or 9 as a function of time we get this plot. Here, there is a really clear trend: over the past 30 years the court has increasingly been either unanimous or narrowly divided.
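The tallying behind that updated plot can be sketched in a few lines of Python. The Supreme Court Database records each case’s majority size (its majVotes variable); a few hypothetical (term, majority) rows stand in for the real file here:

```python
# Per-term proportion of cases decided by a bare majority (5) or
# unanimously (9). Hypothetical rows stand in for the Supreme Court
# Database's per-case majVotes values.
from collections import defaultdict

cases = [(1990, 9), (1990, 5), (1990, 6), (2008, 5), (2008, 9), (2008, 9)]

by_year = defaultdict(lambda: [0, 0])   # year -> [5-or-9 count, total cases]
for year, maj in cases:
    by_year[year][1] += 1
    if maj in (5, 9):
        by_year[year][0] += 1

for year in sorted(by_year):
    narrow_or_unanimous, total = by_year[year]
    print(year, round(narrow_or_unanimous / total, 2))  # 1990 0.67, 2008 1.0
```

Plotting that per-term proportion against the year gives the trend line described above.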
When I got my cell phone over two years ago it came pre-installed with a demo version of a game called Jewel Quest 2. The demo version allows you to play the very first level, then closes the game. This first level, like most first levels, was very easy. Too easy for me to buy the full version. But I would often find myself with a few minutes to kill, so I have been playing the demo version of this game regularly for over two and a half years now. How and why would I do such a thing? Read on.
Jewel Quest 2 is a type of “match-three” game, based on the game Shariki. Bejewelled is another such game. The way the game works is simple. You swap the position of two adjacent items, and in doing so you must create either a row or column that contains three (or more) of the same item consecutively. When you do, those items disappear and everything above them falls down to fill their places, with new items appearing at the top. The level ends when you have made an item disappear from every space on the board or the time runs out, whichever comes first. You can try playing the game for free on the web here (this web version has 5 types of items instead of the 4 on my phone).
As I continued to play I would make the game more challenging by giving myself new restrictions on how I could play. First I decided I would try to complete the level as fast as I possibly could. Pure speed. After I tired of this, I changed my objective to completing the level in as few moves as possible. This required more planning, so that making one move would cause chain reactions on the board. I eventually tired of this and started instead placing limits on where I could make moves. Can’t make any moves in the top two rows. Then in the top four rows. And so on. The hardest version so far is that I can only make moves within rows three and four from the bottom (not in the bottom two rows nor in the top four). This is pretty difficult. The only way you can reach the bottom row of a column is to first get the item in the second row to match the item in the bottom row (remember, you can’t touch the second row directly), then move another of that item into the third row above them to complete the set.
Sometimes you get lucky and have a few columns like this at the start of the game. Let’s call it a ‘success’ when a column starts with the bottom two rows the same kind. In the screenshot above I started this game with 3 such successes. So I wondered what the chances were of starting with 0, 1, 2, 3, or more ‘successes’. The tricky part is that the starting board cannot have any 3-in-a-row (or column), because those would of course disappear immediately. So at the start of the game the contents of each space are not statistically independent of each other. If there are two in a row, the next one can’t be that kind. And this works both side to side and up and down.
What I think is happening is that each of the four kinds is equally likely, unless there are already 2 in a row or column preceding it. In that case, the kind in that space is drawn with equal probability from the remaining possible kinds. If I’m right about this, I can think of two ways to calculate the probability of having 0, 1, 2, 3, or more ‘successes’ at the start of the game. One way is to simply do the mathematical calculation. I won’t tell you about how this is done now, but I may in another post.
The other way is to write a computer program that draws a starting game board according to the rules I think it is using. The program is simple. It enters an empty space and says “This could be any of the four kinds.” But before drawing the kind of item for a space it asks “Are the two spaces above the same as each other? If so, don’t use that kind for this space,” and then asks the same question about the two spaces to the left. Then it draws the kind to put in my new space from the possible kinds remaining. It then moves to the next empty space and repeats, until the board is drawn. After you’ve drawn a board, the program counts the number of our ‘successes’ and stores that number, and does this over and over.
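That drawing procedure translates almost directly into code. A minimal Python sketch, assuming an 8x8 board and four item kinds (the board size is my guess; the no-3-in-a-row drawing rule is the one described above):

```python
# Simulate start boards drawn under the proposed rule: each cell is
# uniform over the four kinds, except that a kind is excluded if the
# two cells directly above (or directly to the left) already match.
import random

KINDS = 4
ROWS, COLS = 8, 8   # board size is an assumption, not from the game

def draw_board(rng):
    """Fill the board top-left to bottom-right, never completing 3-in-a-row."""
    board = [[None] * COLS for _ in range(ROWS)]
    for r in range(ROWS):
        for c in range(COLS):
            banned = set()
            if r >= 2 and board[r - 1][c] == board[r - 2][c]:
                banned.add(board[r - 1][c])
            if c >= 2 and board[r][c - 1] == board[r][c - 2]:
                banned.add(board[r][c - 1])
            board[r][c] = rng.choice([k for k in range(KINDS) if k not in banned])
    return board

def successes(board):
    """Count columns whose bottom two cells already match."""
    return sum(board[-1][c] == board[-2][c] for c in range(COLS))

rng = random.Random(1)
counts = [0] * (COLS + 1)          # counts[k] = boards with k successes
for _ in range(10000):
    counts[successes(draw_board(rng))] += 1
print([round(c / 10000, 3) for c in counts])
```

The printed list is the estimated probability of 0, 1, 2, … successes, which is exactly what the histogram below shows.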
This gives me the probability that I will get 0, 1, 2… successes if I play by the rules I laid out. They can be shown in a histogram for visualization. The tallest bars are for 1 and 2 successes, which means that these are the most likely outcomes under my rules. Likewise the bars are very small for 4, 5, or 6 successes, meaning they are very unlikely.
Now that I know what I should expect if I am right, I need some data from the actual game itself. I started the game on my phone 200 times and counted the number of ‘successes’ each time. This is much more boring than actually playing the game, but is the only way to test my theory. I can put the results into a histogram to compare visually to the simulations. Looks pretty close.
I’ve put the results into a table, where the outcomes of the real game are called “observed.” The simulations I ran gave me probabilities, and from them I know how many of each kind of success I expect if I play 200 times. I put these into the table called “expected.” Looking at the table, the observed and expected don’t look too different. This looks pretty good for my theory, but eyeballing it isn’t god enough. We need to do a statistical test.
The question we want to ask is “Are the results of the real game very different from what I would expect if I was right about the rules of the game?” Another way to phrase the question is to ask “Could my rules have produced the data we observed playing 200 games?” There are several statistical tests I can use to ask these questions, and I don’t have the space here to describe them. But the results of the tests showed that the data we saw from the game could have reasonably been produced by my rules. What it really says is that what we observed from the game is not different enough from our expectations that we can prove me wrong.
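One natural choice among those tests is a chi-square goodness-of-fit test, which compares the observed counts with the counts expected from the simulation. A sketch with hypothetical numbers (not the actual data from the 200 games):

```python
# Chi-square goodness-of-fit: sum of (observed - expected)^2 / expected
# across categories. These counts are hypothetical, chosen only to
# illustrate the test on 200 games.
observed = [30, 70, 60, 30, 10]            # games with 0,1,2,3,4+ successes
expected = [28.0, 72.0, 62.0, 28.0, 10.0]  # simulated probabilities x 200

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 3))   # → 0.406
```

With five categories the statistic is compared against a chi-square distribution with 4 degrees of freedom, whose 5% critical value is about 9.49; a value this small would not reject the proposed rules.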
While this means the evidence supports my theory, it does not prove I was correct. It is possible that the rules of the game are really slightly different, but that 200 games is not enough to prove I was wrong. The bigger the sample size gets, the more powerful the statistical tests become to distinguish subtle differences. This is an important fact about statistical tests: small samples do not always reflect the truth. This is one reason why political polls are often misleading (there are several other reasons as well).
Here’s one way I could have been wrong: Perhaps on this easy demo level you are more likely to get large clusters of the same kind of gem than you would expect at random. For example, if you had two diamonds in a row, we know the next one cannot be a diamond, but maybe the next next one is more likely to be a diamond, to make it easier for us to complete the move. This could change as the levels get harder. I could test this as well. I can easily simulate with my computer program what I should get, so I would just need to collect more data from the actual game on how the gems cluster.
I’ll leave that task up to you. I’m too busy creating harder versions of this demo level for myself.
I ate a lot of peanut butter as a kid. Specifically Jif peanut butter. I didn’t even try another brand until after college, and maybe even then begrudgingly. A friend told me a similar story about his daughter, also a Jif loyalist: When he suggested to her that she might like another brand better she became very angry and insisted that she didn’t want to even try another brand, even if it WAS better. Perhaps you know someone with such a brand-loyal attitude, even as adults.
Was I unreasonably stubborn? Did she overreact? Probably, if this was only about peanut butter. But could it be about something bigger? What if my sense of identity was tied to my peanut butter brand? Asking me to change my peanut butter is then asking me to change not just something I eat, but a part of me. Sound crazy? A study published in an upcoming issue of the Journal of Consumer Psychology asserts exactly that.
The authors argue that the brands we like become part of how we view ourselves. If that is true, we would expect that if someone were to criticize our favorite brands we might react in a way similar to if someone had criticized us directly. In the authors’ experiments, when people who felt strongly about a brand were confronted with negative information about it, their measured self-esteem dropped. They took it personally. (Wired magazine had a nice write-up about the article.)
This is a pretty cool notion, that how we value ourselves is determined in part by things completely outside ourselves. My guess is you should see similar patterns with not only brands we choose but also our favorite music, movies, and maybe our favorite local pizzeria.
But it gets even more interesting. How exactly do people react when their brands are shown to be inferior? When confronted with evidence of poor performance, people don’t just get sad then change their mind about the brand (“Oh, I guess Jif isn’t the best peanut butter after all…”). Instead they continue to rate their brand highly in spite of evidence to the contrary. They ignore the evidence.
That sounded familiar. A series of political science studies at the University of Michigan in 2005 and 2006 showed the same pattern: misinformed political partisans, when shown the facts, did not change their minds, but instead held more strongly to their misguided beliefs. We don’t like being wrong, so instead of folding our hand we double down.
These studies reveal something very interesting about the nature of self-esteem and our ability to be impartial when presented with new information. As scientists we try very hard to be completely objective, and I think we are pretty good at it. Unless maybe it hits a little too close to home.