College Football & Econometrics —> Fun with Matching

My friend Stephen Pettigrew, a PhD candidate in the Department of Government at Harvard, and I recently published a post on Deadspin’s Regressing blog. 

The question we ask is how different college football teams have performed when featured in the national spotlight of ESPN’s College Gameday. To answer it, we compare their outcomes in such games to otherwise comparable games. In other words, do any of the teams regularly featured on Gameday suffer from a “Gameday curse,” and do some of them even benefit from a “Gameday bump”?

To answer this question, we use the statistical method of matching on observables. The results we derive are displayed in the graph below and the full writeup is available here.
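For readers curious what matching on observables looks like mechanically, here is a minimal sketch. It is not the code behind the Deadspin analysis: the games are made up, the single observable (a pregame point spread) is an assumption chosen for illustration, and `matched_effect` is a hypothetical helper. Each Gameday game is paired with the most similar non-Gameday game, and the outcomes are compared.

```python
from statistics import mean

# Hypothetical games: (featured_on_gameday, pregame_point_spread, won).
# These rows are invented for illustration; they are not real results.
games = [
    (True, -7.0, 1),
    (True, -3.0, 0),
    (True, 2.5, 0),
    (False, -6.5, 1),
    (False, -3.5, 1),
    (False, 3.0, 0),
    (False, 10.0, 0),
]


def matched_effect(rows):
    """Average outcome gap between each featured game and its nearest match."""
    treated = [r for r in rows if r[0]]
    controls = [r for r in rows if not r[0]]
    gaps = []
    for _, spread, won in treated:
        # Nearest non-Gameday game on the observable (matching with replacement).
        _, _, matched_won = min(controls, key=lambda c: abs(c[1] - spread))
        gaps.append(won - matched_won)
    return mean(gaps)


effect = matched_effect(games)
print(f"Estimated Gameday effect on win rate: {effect:+.2f}")
```

A negative number here would point toward a “curse” and a positive one toward a “bump.” A real analysis would match on many observables at once and check how good the matches are.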

Ending Gerrymandering

The unpopular yet seemingly pervasive practice of gerrymandering may have met its match: programmer Brian Olson. Using the 2010 census data, Olson developed a program to automate the process of redrawing legislative districts (at both the state and federal levels). He improves on previous efforts by not simply drawing lines that minimize district size, but by making districts as compact as possible without putting one neighborhood (i.e. census tract)* in more than one district.

Here’s an example of the result: 

While this is an exciting development and I think state legislators would be wise to implement a version of Brian’s work, we (probably) shouldn’t view this as a panacea. As John Sides has argued, “gerrymandering is not what’s wrong with American politics.”

*According to the Census Bureau, “census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people.  A census tract usually covers a contiguous area; however, the spatial size of census tracts varies widely depending on the density of settlement.”

Looking at where our taxes go on tax day

A view of the spatial distribution of income in San Francisco.
Produced via Social Explorer. Data source: ACS 2010 5yr - Median household income in 2010 Inflation Adjusted Dollars.

What Can Finance Tell Us About Auburn’s Road to Pasadena?

There’s no way around it: Auburn’s journey to tonight’s game was unbelievable. Even the most optimistic Tiger fans weren’t expecting to make it to Pasadena this year. But here we are, only a few hours away from Gus Malzahn leading his team into the Rose Bowl with the National Title on the line. 

Given the crazy series of events that led to today, the question I’ve been asking myself is when people finally started realizing that Auburn was this year’s team of destiny. The “Prayer at Jordan-Hare” was about as miraculous as any fourth quarter heave in recent college football history, but that merely kept the hope of Pasadena alive and set up the biggest Iron Bowl ever. Of course, we all know how that transpired, but, again, even after “Kick Bama Kick,” Auburn still needed to take down a tough Mizzou team in the Georgia Dome and pray for a Michigan State upset of Ohio State in the B1G Championship game.

On December 7, the stars aligned yet again for the Tigers and both remaining pieces of the puzzle went Auburn’s way, setting up today’s matchup with Florida State. 

So let’s return to my question: which piece of the puzzle was the one that convinced the world that Auburn would be here today? One way to answer this would be to examine bets coming into Vegas on whether Auburn would make it to Pasadena and see where the surge is. Unfortunately, I’m not a bookie and thus don’t have access to such data (if you are and would like to share, please let me know!). However, there is an alternative data source that can offer basically the same insights.

TeamTix is a company that offers fans the right to reserve face-value tickets to bowl games (among other sporting events) if their team makes it to that game. Those in finance refer to this as a forward market: you pay today for the delivery of an asset tomorrow. In this case, that asset is just a reservation to buy tickets to see your team play in the big game.  Of course, that asset may be worth a lot if the team makes it to the game, but would be worthless if they don’t (making this also similar to a call option).

The price for making these reservations is driven by supply and demand, and the mechanics of this forward market are pretty straightforward. Just like you do when trading stocks, you place a bid when you think a team is undervalued (more likely to make the bowl game than their current price reflects) and offer to sell a reservation you hold when you think the team is overvalued (not as likely to make it). The phenomenon produced by everyone doing this at once is that TeamTix prices reflect the “market’s” interpretation of the probabilities of teams making it to the big game. In other words, schools that are seen as very likely to make it, like Alabama prior to the Iron Bowl, have high prices, while schools that are seen as dark-horse candidates have much lower prices. And these prices are fluctuating 24/7. Best of all, TeamTix sends a tweet every time a new purchase is made in their BCS Championship ticket marketplace.

So what are the moments that matter? There are two that induced by far the most movement in the market (see chart below).

First, on the day of the Tigers’ upset of Alabama, there were steady purchases of Auburn reservations on the forward market, with prices hovering around $80. Literally one minute after Chris Davis ran 109 yards (i.e. at 7:26 pm EST), purchases of Auburn tickets came flying in. Sixteen new orders were made within 10 minutes, pushing prices up to $140, and prices reached $180 by 10 pm that night.

From here, things were pretty calm on the Auburn ticket market, with prices actually falling over the next week (they hit a low of $102 on Friday night). Presumably, this can be attributed to Auburn fans (and speculators) realizing that an appearance in the title game hinged on Urban Meyer’s Buckeyes losing to Michigan State, an outcome many saw as unlikely.

On the day of the SEC Championship, the forward market for Auburn tickets saw a steady stream of orders, but prices remained fairly stable, climbing only $5 to $133 before kickoff. They stayed there for much of the game and didn’t really start rising substantially until the game was called at 8:06 pm EST. Over sixty orders came in over that hour and prices rose to $180. Of course, the real action was still to come, as the OSU-MSU game wouldn’t finish for a few more hours.

After Connor Cook’s 9-yard touchdown pass to Josiah Price gave MSU a 27-24 lead with 11:41 to go in the fourth quarter, action started again in the TeamTix marketplace. Still, prices didn’t change much until 8:40 PM, when the Spartan D stopped Braxton Miller on 4th-and-2. By the time Jeremy Langford ran in the game-sealing 26-yard touchdown, the price to reserve tickets to see Auburn play for the BCS title had soared to $480. Between the conclusion of the game just before midnight and the BCS’ announcement that Auburn had officially made it to Pasadena, 137 orders came in, with prices peaking at $700 that afternoon.
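To put those moves side by side, here is a small sketch that tabulates the prices quoted in this post and how much each one changed relative to the one before it. It rests on a simplifying assumption that is not from TeamTix: if a reservation is worth roughly the same amount whenever Auburn makes the game, then the ratio of two prices approximates the ratio of the market’s implied probabilities. The labels and the helper `relative_moves` are illustrative only.

```python
# Prices quoted in this post, in chronological order (USD per reservation).
price_path = [
    ("Iron Bowl day, before the final play", 80),
    ("Ten minutes after the Iron Bowl", 140),
    ("10 pm on Iron Bowl night", 180),
    ("Friday-night low", 102),
    ("SEC Championship kickoff", 133),
    ("After the SEC Championship", 180),
    ("Langford's touchdown", 480),
    ("Peak the following afternoon", 700),
]


def relative_moves(path):
    """Return (label, price, ratio to the previous price) for each later step."""
    moves = []
    for (_, previous), (label, price) in zip(path, path[1:]):
        moves.append((label, price, price / previous))
    return moves


for label, price, ratio in relative_moves(price_path):
    print(f"{label:<38} ${price:>3}  x{ratio:.2f} vs. previous")

# Under the constant-payoff assumption, this is roughly how much the
# market's implied probability grew from Iron Bowl day to the peak.
overall = price_path[-1][1] / price_path[0][1]
print(f"Start to peak: x{overall:.2f}")
```

Read this way, the two biggest proportional jumps line up with the two moments highlighted above: the end of the Iron Bowl and the end of the OSU-MSU game.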

The fat lady had sung and now fans were finally ready to put down their money on the upstart Auburn Tigers. We’ll find out tonight if that was money well spent. 

On the perils of extrapolation

The Bureau of Labor Statistics released the jobs numbers for the month of November, and many were pleased to see the unemployment rate fall from 7.3% to 7.0%. The stock market jumped as a result.

However, not everyone viewed the news through rose-colored glasses.

Wolfers makes an important point: we shouldn’t consider the jump from October to November that meaningful, given that the labor market was highly distorted by the government shutdown that lasted the first two weeks of October. By showing that employment actually dipped from September to November (according to “the household adjustment survey data adjusted to a payroll concept”), he implies that we should actually be quite worried about the labor market.

After seeing this tweet, I set out to see whether a longer pattern in the data would corroborate his conclusion. To do this, I downloaded the past 5 years of civilian employment data* from the FRED database and fit two loess (aka local regression) curves. The first (in blue) has a span** of 1/5, while the second (in dark green) has a span of 2/5. Otherwise, the two are identical.

This subtle difference turns out to be pretty important in evaluating the November jobs report. By fitting the blue loess curve to the previous 60 months of data (September 2009 - September 2013), we predict employment of 144,127,300. Our prediction using the second curve (the one with a span of 2/5) is 144,592,200, a difference of 464,900 jobs. Perhaps more importantly, the blue curve says the labor market is outperforming our expectations, while the second curve says the opposite. This important difference can be attributed solely to what seems like a minor modeling choice. This is a lesson for all of us working in data science to remember when interpreting data.
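To see how the span alone can move an extrapolated value, here is a minimal, self-contained sketch of a local linear regression with tricube weights. It is an illustrative stand-in, not the code behind the figure: the 60-month series is synthetic (a trend plus a slow wave), the units are arbitrary, and `loess_predict` is a hypothetical helper written for this example.

```python
import math


def loess_predict(xs, ys, x0, span):
    """Local linear fit evaluated at x0, using the nearest span * n points."""
    n = len(xs)
    k = max(3, math.ceil(span * n))  # number of points in the local window
    bandwidth = sorted(abs(x - x0) for x in xs)[k - 1]
    # Tricube weights: nearby points count most; points at or beyond the
    # bandwidth get zero weight.
    weights = [
        (1 - (abs(x - x0) / bandwidth) ** 3) ** 3 if abs(x - x0) < bandwidth else 0.0
        for x in xs
    ]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    slope = s_xy / s_xx
    return y_bar + slope * (x0 - x_bar)


# Synthetic series (NOT the FRED data): 60 months of a trend plus a slow wave.
months = list(range(60))
series = [140.0 + 0.05 * t + 1.5 * math.sin(t / 9.0) for t in months]

# Extrapolate one month past the end of the data with two different spans.
pred_narrow = loess_predict(months, series, 60, span=1 / 5)
pred_wide = loess_predict(months, series, 60, span=2 / 5)
print(f"span 1/5 -> {pred_narrow:.3f}")
print(f"span 2/5 -> {pred_wide:.3f}")
print(f"gap      -> {pred_narrow - pred_wide:.3f}")
```

The narrow span only sees the most recent stretch of the series, while the wide span also averages in older months, so the two fits disagree about where the next point should land even though the data are identical.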

*This data set is close but not identical to the one Wolfers uses. I was unable to find the exact data set that he did.

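**The span is also known as the smoothing parameter and is represented by alpha in regression equations. A larger alpha means each local regression draws on a larger share of the data, producing a smoother curve, while a smaller alpha uses fewer data points in each local fit, so the curve tracks short-run movements more closely. When alpha is equal to one (the upper bound), every local fit uses all of the data, though still weighted by distance, so the curve is at its smoothest and closest to a standard linear regression, though generally not identical to one.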

Who do you think will win the Nobel Prize in Economics?

On Monday, the Nobel committee will announce who will be awarded their Prize in Economic Sciences. The winner will get a medal, a personal diploma, and a cash prize (8 million Swedish kronor or ~$1.2 million).

Fun fact: Franco Modigliani used some of his $225,000 Nobel Prize money to upgrade his laser-class sailboat.

So who are the front-runners? According to Irish betting house Paddy Power, the favorite (at 4/1 odds) is Yale housing and financial economist Robert Shiller. Tied for second at 5/1 are Harvard macroeconomist Robert Barro, LSE econometrician Anthony Atkinson, and NYU development economist Paul Romer. Here’s the full set of odds:

While whoever wins this award will no doubt have a career-defining moment when they receive the phone call, it doesn’t actually mean their reputation in the field will change that much. According to research by Samuel Bjork, Avner Offer, and Gabriel Söderberg in Scientometrics, winning the Nobel provides a small boost to citations in the short run, but then citations begin to decline. As The Economist explains, “people tend to win Nobel prizes when their career has nearly reached its peak. The Swedish Academy, which makes the award, plays it safe.”

Anyways, I’m looking forward to finding out who wins!

Is there an end in sight? My $0.02 on the shutdown

Ever since the shutdown began at midnight on Monday, pundits of every shape, size, and color have pointed out that there’s a silver bullet: allowing a vote on a clean continuing resolution (CR) would fund the government again and “everything would return to normal.” After all, there have been 20 relatively moderate Republicans (listed here) who have publicly supported the passage of a clean CR. Combine these votes with 200 from the members of the House Democratic caucus and the numbers are looking really good for those, like me, supporting the end of the shutdown.

However, just because a clean CR would pass if it got to the floor does not mean it will get there. That’s because Speaker Boehner has invoked the Hastert Rule, thereby giving him the sole power to decide what gets on the agenda, and he has every incentive to hold onto that power and prevent a vote that he won’t support. Still, there is a way around Boehner’s control of the floor: the discharge petition. News that Pelosi and the House Democrats are seeking to use this instrument threw the Twittersphere into a tizzy on Friday.

While this is a promising (if unsurprising) development, it doesn’t mean that government employees should get ready to return to work. This is true for two reasons.

The first is that the discharge petition is designed to be slow. And I mean really slow. GW political scientist Sarah Binder explains:

Time lags built into the discharge rule are bound to frustrate lawmakers if they seek to open a shuttered government.   Even if an aspiring lawmaker bones up on the House rule book and today introduces a CR and a discharge motion to dislodge it, the earliest the motion to discharge would make its way onto the discharge calendar after securing 218 signatures would be November.  (I am assuming that the House’s calendar and legislative days run roughly in tandem this month).  If the motion doesn’t make it onto the calendar until after the second Monday of the month, the bill would be discharged at the earliest in late November.  Procedural details make the discharge rule ill-suited for swift enactment of a clean CR.

The second reason is that it’s costly for centrist Republicans to add their names to the petition (for more on this, read Molly Jackman’s blog post and Brookings white paper). The point here is simple: a moderate Republican signing onto the discharge petition would, to put it bluntly, really piss off Boehner. So the next time that legislator was trying to get her pet bill or amendment passed, she may not find a supportive audience in the Speaker’s office. Thus, these legislators have to decide which is more important for them: pleasing their centrist constituents who want the shutdown to end or maintaining a positive working relationship with the party leadership, a necessary component for future legislative success.

If these middle-of-the-road Republicans had a really short time horizon and were facing a general election opponent tomorrow, they’d probably side with Pelosi and the shutdown would end. Of course, we’re still pretty far away from that (and for many, the more pressing electoral concern is fending off primary challengers), so not all of these officials are likely to blow their political capital on ending the shutdown.

That leads to a final question: what would have to change for the discharge petition to succeed? Another way of putting this: at what point does continuing the shutdown become costlier for the “swing legislators” (the moderate Republicans) than defecting from Speaker Boehner (assuming his stance doesn’t evolve)? This decision-making calculus is illustrated in the figure below.

The red curve illustrates the costs of continuing the shutdown. While many individuals in these key districts are already severely adversely affected, what really matters is the pain that the median voter there is facing. Surely, the “average” person in, say, Republican Congressman Devin Nunes’ 22nd district of California is annoyed with this state of affairs, but probably isn’t overwhelmingly personally affected. However, as the shutdown continues, they’re likely to face some increasing inconvenience (e.g. canceling their annual backpacking trip to Yosemite) or direct cost (they can’t purchase federal crop insurance). Thus, the costs begin to rise exponentially, though they are bounded at some upper level. 

At the same time, the costs to the moderate Republicans of signing onto the discharge petition drop considerably over time (the blue curve). There are two reasons for this. First, it is easier to defect as the shutdown becomes even more unpopular and Speaker Boehner’s power in the halls of Congress wanes. Second, as more fellow Republicans defect, it is easier for you to defect. Think of this as a tipping point: being the first Republican to sign onto Pelosi’s discharge petition is pretty risky since you don’t always know if others will follow (you probably don’t want to be the first to stick your neck out), but being the 18th is much less so. After all, there is strength in numbers when making a politically risky move.

Ultimately, at some point, t*, in the figure, the blue and red curves intersect and that is when the shutdown will end. It will be simply too costly for the swing legislators to not sign onto the discharge petition at that point. Let’s hope that we reach that point sooner rather than later.
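For anyone who wants to play with this logic, here is a minimal sketch that finds the crossing point numerically. Everything in it is an assumption made for illustration: the functional forms (a logistic curve for the rising-but-bounded shutdown cost, an exponential decay for the cost of defecting), every parameter value, and the helper names. It says nothing about when the actual shutdown will end.

```python
import math

# Hypothetical parameters, chosen only to give the curves the shapes described.
COST_CAP = 10.0      # upper bound on the median voter's cost of the shutdown
GROWTH = 0.3         # how quickly shutdown costs ramp up per day
MIDPOINT = 20.0      # day on which shutdown costs reach half of the cap
DEFECT_START = 8.0   # cost of signing the discharge petition on day 0
DECAY = 0.05         # how quickly the cost of defecting falls per day


def shutdown_cost(t):
    """Red curve: rises quickly at first, then levels off at COST_CAP."""
    return COST_CAP / (1.0 + math.exp(-GROWTH * (t - MIDPOINT)))


def defect_cost(t):
    """Blue curve: falls steadily as the shutdown drags on."""
    return DEFECT_START * math.exp(-DECAY * t)


def crossing_day(lo=0.0, hi=100.0, tol=1e-6):
    """Bisection search for the day t* on which the two curves intersect."""
    def gap(t):
        return shutdown_cost(t) - defect_cost(t)

    if not (gap(lo) < 0 < gap(hi)):
        raise ValueError("the curves do not cross inside the search window")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid  # defecting is still the costlier option
        else:
            hi = mid  # continuing the shutdown is now the costlier option
    return (lo + hi) / 2


t_star = crossing_day()
print(f"The curves cross on day {t_star:.1f}")
```

Raising `DECAY` (defecting gets cheap faster, e.g. because colleagues are signing on) or lowering `MIDPOINT` (voters feel the pain sooner) pulls t* earlier, which is the tipping-point story in numerical form.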

When the government shutdown hits day four

via wheninwashington:


“The first duty of a man is to think for himself.” 
José Martí
