
Predicting the Supreme Court from oral arguments


Trying to predict the decisions of the Supreme Court is probably as old as the Court itself. And with many high-profile cases being argued lately, the Court has been in the news a lot (see here, here, and here). People have developed statistical approaches such as {Marshall}+, and there is even a Fantasy SCOTUS league where anyone can make their predictions. Here’s a great story about some of these efforts and how well they do.

Commentators often stress how important oral arguments are in getting a sense of how the justices will vote. But statistical or machine learning approaches have focused on the case history rather than the oral argument. I’ve changed that, building what I believe to be the first machine learning Supreme Court predictor that uses oral argument features.

My model predicts outcomes correctly over 70% of the time and provides a probability for each prediction. If this were a spam filter, that accuracy would be terrible, but for a Supreme Court predictor, it’s pretty great.

And I’ll toot my own horn for a second and say that I did all this from start to finish in three to four weeks.

You can find my predictions of current and historical cases on my site. I haven’t had much time to make an awesome front-end, so you have to enter the docket number of the case. But there are links to the Supreme Court website to find the docket numbers. And hopefully my site doesn’t crash if it actually gets some traffic.

So far in this session I’ve correctly predicted the five decisions released. And here are some of my predictions for the high-profile cases argued so far:

– For yesterday’s Affordable Care Act case (14-114) my model predicts a 60% chance the ACA will be upheld.

– For the Arizona gerrymandering case (13-1314) I predict a 75% chance that Arizona will win and gerrymandering will continue.

– For the Abercrombie & Fitch headscarf case (14-86) I predict a 69% chance that the Equal Employment Opportunity Commission will win.

Perhaps the most impressive part of this is that my accuracy is so high while completely ignoring any information about the case itself other than the way the justices act during arguments. My model doesn’t know what kind of law it is, or which district court the case came from, or who the lawyers are. It only really knows how the justices ask their questions.


Here are some technical details about the model and features:

I downloaded all of the transcripts of oral arguments from 2005-present from the Supreme Court’s website, converted them from PDFs to text files, and then ran some natural language processing on the arguments to extract a set of features I thought might be important. More about those soon. I then built a machine learning classifier to predict the case outcome using only the features I extracted from the oral arguments of that case, and trained it on cases argued prior to 2013. I use a linear SVM classifier (although logistic regression also works well), evaluated with cross-validation. I then tested the model by trying to predict the cases from 2013-2014, and correctly predicted the outcome over 70% of the time.
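The train-on-earlier, test-on-later setup described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the results: it swaps in a tiny hand-rolled logistic regression (the alternative mentioned above) for the linear SVM so that it runs with only the standard library, and the cases, feature values, and function names are all invented for the example.

```python
import math

# Hypothetical toy records: (year argued, feature vector, outcome).
# outcome is 1 if the petitioner won, 0 otherwise. Every value here is
# invented for illustration and is not taken from real transcripts.
CASES = [
    (2010, [0.30, 2.0], 0),
    (2010, [-0.20, -1.0], 1),
    (2011, [0.10, 1.0], 0),
    (2011, [-0.40, -3.0], 1),
    (2012, [0.25, 0.0], 0),
    (2012, [-0.10, -2.0], 1),
    (2013, [0.20, 1.0], 0),
    (2014, [-0.30, -1.0], 1),
]


def temporal_split(cases, first_test_year):
    """Train on cases argued before first_test_year, test on the rest."""
    train = [c for c in cases if c[0] < first_test_year]
    test = [c for c in cases if c[0] >= first_test_year]
    return train, test


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear score w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def fit_logistic(rows, epochs=2000, lr=0.1):
    """Fit weights and a bias by plain batch gradient descent."""
    n_features = len(rows[0][1])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for _, x, y in rows:
            err = sigmoid(score(weights, bias, x)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(model, x):
    """Estimated probability that the petitioner wins."""
    weights, bias = model
    return sigmoid(score(weights, bias, x))


def accuracy(model, rows):
    hits = sum((predict_proba(model, x) >= 0.5) == bool(y) for _, x, y in rows)
    return hits / len(rows)


train, test = temporal_split(CASES, first_test_year=2013)
model = fit_logistic(train)
print("held-out accuracy:", accuracy(model, test))
```

Splitting by year rather than at random keeps the test honest: the model never sees a case argued after the ones it is asked to predict.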

I’m not predicting individual justice votes here, only the case outcome, because in practice a 9-0 loss is just as bad as a 5-4 loss.

The features I selected are the relative number of words each justice says to each side, the relative sentiment (positive/negative) of those words to each side, and the number of times each justice interrupts each lawyer, along with a few others. The interruptions are my personal favorite and I’ve never heard anyone suggest them as an indicative feature before.
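To make the word-count and interruption features concrete, here is a rough sketch of how they could be computed. It assumes the transcript has already been split into (side, speaker, text) turns, and it treats a lawyer's turn ending in "--" followed by a justice's turn as an interruption. That heuristic, the turn format, and all names below are assumptions for illustration, not necessarily how my code does it. Sentiment is left out to keep the sketch short.

```python
from collections import defaultdict

# The five justices tracked in this post.
TRACKED = ("SCALIA", "KENNEDY", "ROBERTS", "BREYER", "GINSBURG")
SIDES = ("petitioner", "respondent")


def is_justice(label):
    """True for labels like 'JUSTICE SCALIA' or 'CHIEF JUSTICE ROBERTS'."""
    return "JUSTICE" in label


def tracked_name(label):
    """Return the justice's surname if tracked, otherwise None."""
    name = label.split()[-1]
    return name if is_justice(label) and name in TRACKED else None


def extract_features(turns):
    """turns: list of (side, speaker_label, text) in spoken order, where
    side names the lawyer at the podium ('petitioner' or 'respondent')."""
    words = defaultdict(lambda: dict.fromkeys(SIDES, 0))
    cut_offs = defaultdict(lambda: dict.fromkeys(SIDES, 0))
    prev = None
    for turn in turns:
        side, label, text = turn
        justice = tracked_name(label)
        if justice is not None:
            words[justice][side] += len(text.split())
            # Assumed heuristic: the lawyer's previous turn trails off
            # with "--" right before this justice starts speaking.
            if (prev is not None and prev[0] == side
                    and not is_justice(prev[1])
                    and prev[2].rstrip().endswith("--")):
                cut_offs[justice][side] += 1
        prev = turn

    features = {}
    for justice in TRACKED:
        w = words[justice]
        total = w["petitioner"] + w["respondent"]
        # Ranges from +1 (all words aimed at petitioner) to -1 (respondent).
        features[justice + "_rel_words"] = (
            (w["petitioner"] - w["respondent"]) / total if total else 0.0
        )
        c = cut_offs[justice]
        features[justice + "_net_interruptions"] = (
            c["petitioner"] - c["respondent"]
        )
    return features


# Invented example turns, not from a real argument.
turns = [
    ("petitioner", "MR. SMITH", "The statute plainly covers this case because --"),
    ("petitioner", "JUSTICE SCALIA", "Where does the text say that?"),
    ("petitioner", "MR. SMITH", "In the second clause, Your Honor."),
    ("respondent", "MS. JONES", "We read the clause differently."),
    ("respondent", "JUSTICE SCALIA", "Why?"),
]
features = extract_features(turns)
print(features["SCALIA_rel_words"], features["SCALIA_net_interruptions"])
```

Expressing each feature as a difference between the two sides means a talkative justice does not look hostile to both lawyers at once; only the imbalance counts.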

The intuition here is that, in general, a justice who asks you more questions is trying to poke holes in your argument. If their questions are more negative, that is bad for you. And if they cut you off, they disagree. Anthony Kennedy seems to be the exception here: if he asks you more questions that is bad, but if he interrupts you more that is good!

I also only track five justices: Scalia, Kennedy, Roberts, Breyer, and Ginsburg. These are the five justices for whom I have complete data from 2005-present. I could use Justice Kagan, for example, but would have less data for her on which to train the model. These five comprise the ideological center of the court and one from each flank. Since the justices’ votes are highly correlated, this gives a lot of power to account for ideology in prediction without having to follow all nine justices.

The obvious next step is to use the predictions of {Marshall}+ as a prior probability and combine it with the oral argument data to give a posterior estimate of the outcome.
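One possible way to do that combination is to work in odds: multiply the prior odds by a likelihood ratio derived from the oral argument model. The sketch below assumes the oral argument model is reasonably calibrated and was trained on data with a known base rate of petitioner wins; dividing by the base-rate odds turns its output into an approximate likelihood ratio. The function names and the example numbers are hypothetical, and none of this has been implemented yet.

```python
def odds(p):
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def combine(prior, model_prob, base_rate):
    """Posterior probability from a case-history prior and an oral
    argument model output.

    prior      -- probability from the case-history model (the prior)
    model_prob -- probability from the oral argument model
    base_rate  -- share of positive outcomes in that model's training data
    """
    # Approximate likelihood ratio contributed by the oral argument.
    likelihood_ratio = odds(model_prob) / odds(base_rate)
    posterior_odds = odds(prior) * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: a 70% prior, an 80% oral argument prediction,
# and a 60% base rate in the training data.
print(combine(prior=0.7, model_prob=0.8, base_rate=0.6))
```

When the oral argument model outputs exactly its base rate, it adds no evidence and the posterior equals the prior; outputs above the base rate push the posterior up, and outputs below push it down.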

You can view, download, or modify my code from my GitHub page.
