Want a better way to elect candidates? Look to Burlington, Vt.


In 2009, the city used an alternative way of voting in its mayor's race, which selected a 'better' winner than under the traditional runoff method.

In March 2009, Burlington, Vermont, used a nontraditional voting system, Instant Runoff Voting, to select its mayor. The voters returned the incumbent, Progressive Bob Kiss, to the mayor’s office and, in so doing, set off a surprisingly fierce debate among advocates for voting reform. Some tout the Burlington results as a success for Instant Runoff Voting, while others cite them as evidence of its fundamental flaws.

In this post, I will try to settle one part of this debate: whether the Burlington results display a voting pathology known as non-monotonicity. That sounds geeky—ok, it is geeky—but it boils down to a simple question: could a candidate lose an election if voters showed more enthusiasm for him or, equally perversely, win an election if voters showed less enthusiasm?

Several readers asked me to weigh in on this debate after my previous post on alternative voting systems (check out the comments on that post if you want to get a flavor of the debate). I should state from the outset that I am not an expert on voting systems, but I am a card-carrying math and economics geek and enjoy mediating interesting debates, so I gave it my best shot. I reached three main conclusions:

  • The Burlington results provide a fascinating case study in American voting for reasons that have nothing to do with non-monotonicity. In what was effectively a three-way race, Instant Runoff Voting (henceforth IRV) appears to have chosen a better winner than our usual system, plurality voting. That’s great news for IRV except for one thing: it failed to choose an even better winner. IRV thus appears to have elected the “wrong” candidate, but traditional voting would have elected an even “wronger” candidate. That weird result illustrates how challenging it can be to design a democratic voting system.
  • The debate about non-monotonicity, which pales in importance next to the larger issues posed by the Burlington results, confuses technical semantics with electoral substance. The Burlington results do illustrate the potential for non-monotonicity in real-world voting data, as IRV critics claim. But, as IRV proponents emphasize, that potential had no effect on the election outcome.
  • The debate among voting reformers would be more fruitful if they adopted some new lingo to distinguish between the potential for non-monotonicity and its actual impact. Inspired by the world of accounting, my suggestion is to distinguish between material non-monotonicity, in which it affected an election outcome, and immaterial non-monotonicity, in which it didn’t. The Burlington election results display the immaterial variety. (Suggestions welcome for better ways of saying this.)

For further details in a handy-if-lengthy Q&A format, keep on reading.

How does Instant Runoff Voting differ from traditional voting?

In most U.S. elections, each voter casts a vote for a single candidate, and the winner is the candidate with the most votes. This approach is often called plurality voting, because candidates need a plurality of the votes, not a majority, to get elected.

In IRV, each voter is asked to rank all the candidates. If no candidate wins a majority of the first-place votes, the candidate with the lowest number of votes is dropped from the race, and his or her votes are redistributed based on those voters’ second-place preferences. The votes are then tallied again—that’s the instant runoff part—and the process repeats until one candidate has a majority of the votes.
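
For readers who like to see the mechanics, here is a minimal Python sketch of the counting procedure just described. The function name `irv_winner` and the toy ballots are my own illustrations, not anything Burlington used. I also assume that "majority" means a majority of the ballots still being counted in a given round, and the sketch does not try to break ties.

```python
from collections import Counter

def irv_winner(ballots):
    """Return (winner, rounds) for a list of ranked ballots.

    Each ballot is a tuple of candidate names, most preferred first.
    A ballot may rank only some candidates. Ties are not handled.
    """
    remaining = {c for ballot in ballots for c in ballot}
    rounds = []
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        tally = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        rounds.append(dict(tally))
        leader, votes = tally.most_common(1)[0]
        # Assumption: a majority of the ballots still counting wins the round.
        if votes * 2 > sum(tally.values()) or len(remaining) == 1:
            return leader, rounds
        # Otherwise drop the candidate with the fewest votes and count again.
        remaining.remove(min(tally, key=tally.get))

# Toy example with nine hypothetical ballots.
ballots = [("A", "B")] * 4 + [("B", "C")] * 3 + [("C", "B")] * 2
winner, rounds = irv_winner(ballots)
print(winner)
for number, tally in enumerate(rounds, start=1):
    print(number, tally)
```

In the toy example, A leads the first round with four of nine votes but has no majority. C is dropped, C’s two ballots move to B, and B wins the second round five to four.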

What happened in Burlington’s mayoral election in 2009?

This election was effectively a three-way race among Progressive Bob Kiss (the incumbent), Democrat Andy Montroll, and Republican Kurt Wright. After the other, minor candidates had fallen by the wayside, the first-place votes for these three were: Progressive (34%), Democrat (29%), and Republican (37%). None of the candidates had a majority, so the lowest vote-getter, Democrat Montroll, was dropped from the race. His votes were then reallocated to whomever those voters ranked next on their ballots. The votes in this final runoff were Progressive (49%) and Republican (46%), so the Progressive, Bob Kiss, was returned to office. [In calculating those percentages, I used the total number of voters who voted for at least one of the three candidates; the missing 5% in the final round are voters who ranked the Democrat first among the candidates but did not rank the other two.]

Why do you say that IRV did better than traditional voting?

If you take the votes at face value, they suggest that a plurality race among the three candidates would have elected the Republican (his 37% would beat the 34% and the 29% of the other two candidates). But the voter rankings show that the Republican was the least popular of the three candidates. As noted above, the Progressive beat the Republican in the final head-to-head race by 49% to 46%. A similar head-to-head race shows the Democrat beating the Republican by 52% to 42%. For that reason, both IRV proponents and IRV detractors believe that IRV did better than plurality voting in this election.

That conclusion comes with a major caveat, however: it presumes that everyone would have behaved the same way in a plurality race. In reality, some voters might have voted differently, some might have stayed home, some non-voters might have turned out, and the candidates might have campaigned differently. For all those reasons, there is no way to be sure what would have happened with plurality voting. Nonetheless, the votes that were cast suggest an appreciable risk that plurality voting could have elected the least popular candidate in the race.

Why do you say that IRV nevertheless elected the “wrong” candidate?

In a head-to-head race, the votes suggest that the Democrat would have beaten the Progressive by a margin of 46% to 39%. Subject to the same caveat that voter and candidate behavior might have been different, the votes thus suggest that voters preferred the Democrat to both the Progressive and the Republican. Yet the Democrat finished third in the IRV results.

The Burlington results thus illustrate one of the most difficult challenges for a democratic voting system. How should you evaluate votes when you have three candidates, one of whom tends to be the second choice for a large number of voters? In this case, the Democrat Andy Montroll was clearly the man in the middle. He was the second choice of 85% of the voters who ranked the Progressive first (and expressed a second preference) and of 75% of voters who ranked the Republican first (and expressed a second preference). As a result, he would beat either of those candidates in a head-to-head race. His first-place support was sufficiently low, however, that he didn’t make it to the final runoff under IRV, and it appears he would have lost a plurality race.
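
The plurality and head-to-head comparisons above can be sketched in a few lines of Python. The 100-ballot profile below is hypothetical: I built it so that the rounded percentages quoted in this post come out (34/29/37 in first-place votes, and 49–46, 52–42, and 46–39 head to head). It is not the real ballot data, and its second-choice shares only approximate the 85% and 75% figures. The helper names are likewise my own.

```python
from itertools import combinations

# Hypothetical profile: P = Progressive, D = Democrat, R = Republican.
# Keys are ballots (most preferred first); values are numbers of voters.
PROFILE = {
    ("P", "D"): 23, ("P", "R"): 5, ("P",): 6,    # 34 rank P first
    ("D", "P"): 15, ("D", "R"): 9, ("D",): 5,    # 29 rank D first
    ("R", "D"): 17, ("R", "P"): 5, ("R",): 15,   # 37 rank R first
}
CANDIDATES = "PDR"

def prefers(ballot, x, y):
    """True if the ballot ranks x above y; unranked candidates count as last."""
    if x not in ballot:
        return False
    return y not in ballot or ballot.index(x) < ballot.index(y)

def head_to_head(profile, x, y):
    """Return (votes for x, votes for y) in a two-way contest."""
    x_votes = sum(n for b, n in profile.items() if prefers(b, x, y))
    y_votes = sum(n for b, n in profile.items() if prefers(b, y, x))
    return x_votes, y_votes

# Plurality: count first choices only.
first_place = {c: sum(n for b, n in PROFILE.items() if b[0] == c)
               for c in CANDIDATES}
plurality_winner = max(first_place, key=first_place.get)

# Every pairwise contest, plus any candidate who wins all of his contests.
pairwise = {(x, y): head_to_head(PROFILE, x, y)
            for x, y in combinations(CANDIDATES, 2)}
beats_everyone = [
    c for c in CANDIDATES
    if all(head_to_head(PROFILE, c, o)[0] > head_to_head(PROFILE, c, o)[1]
           for o in CANDIDATES if o != c)
]

print(first_place, plurality_winner)
print(pairwise)
print(beats_everyone)
```

On this profile the Republican wins the plurality count yet loses both of his head-to-head contests, while the Democrat wins both of his despite having the fewest first-place votes.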

That seemingly anomalous result is one of the reasons that some voting reformers prefer approaches such as Range Voting to IRV. However, Range Voting is not without its own potential flaws. (You can explore this debate over at RangeVoting [IRV critics] and FairVote [IRV proponents].)

What do monotonic and non-monotonic mean?

An election system is said to be monotonic if a candidate cannot be harmed (i.e., flip from winning to losing) when he “is raised on some ballots without changing the relative orders of the other candidates.” This idea also runs in reverse: an election system is monotonic if a candidate cannot be helped (i.e., flip from losing to winning) when he is lowered on some ballots without changing the relative orders of the other candidates.

If an election system fails this criterion, it is said to be non-monotonic. It is easy to show that IRV can be non-monotonic (see, e.g., here).

Was the Burlington mayor’s election non-monotonic?

Technically, this is a trick question. Monotonicity (and, therefore, non-monotonicity) is a characteristic of a voting system or a specific pair of elections. For that reason, one cannot describe a single election as monotonic or non-monotonic. Instead, you have to analyze whether a non-monotonic result could occur if some set of voters appropriately changed their votes.

OK. So can you find a group of voters who, if they increased their support for the winner, would actually cause him to lose?

Yes. As RangeVoting documents, if about 750 voters who ranked the Republican first had instead ranked the Progressive first, the Progressive could have lost. Why? Because the switch would change whom the Progressive faced in the final runoff: the Republican, not the Democrat, would have been eliminated first. Instead of beating the Republican, the Progressive would have lost to the Democrat. (See the RangeVoting analysis for some details I am skimming over.)

The Burlington election results thus show the potential for non-monotonicity.
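
To make the flip concrete, here is a small Python sketch under stated assumptions. It uses a hypothetical 100-ballot profile that I constructed to match the rounded percentages in this post, not the actual ballots, so the roughly 750 real voters scale down to 9. The compact `irv_winner` function is my own illustration: it treats a majority of the ballots still counting as decisive and does not handle ties.

```python
from collections import Counter

def irv_winner(profile):
    """IRV winner for a {ballot tuple: voter count} profile; ties not handled."""
    remaining = {c for ballot in profile for c in ballot}
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ballot, count in profile.items():
            # Each ballot counts for its top choice among remaining candidates.
            top = next((c for c in ballot if c in remaining), None)
            if top is not None:
                tally[top] += count
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > sum(tally.values()) or len(remaining) == 1:
            return leader
        remaining.remove(min(tally, key=tally.get))

# Hypothetical profile: P = Progressive, D = Democrat, R = Republican.
before = {
    ("P", "D"): 23, ("P", "R"): 5, ("P",): 6,    # 34 rank P first
    ("D", "P"): 15, ("D", "R"): 9, ("D",): 5,    # 29 rank D first
    ("R", "D"): 17, ("R", "P"): 5, ("R",): 15,   # 37 rank R first
}

# Nine Republican-first voters move the Progressive to the top of their
# ballots. R stays above D on every changed ballot, so only P is raised.
after = dict(before)
after[("R", "P")] -= 5   # five R > P > D voters ...
after[("P", "R")] += 5   # ... now vote P > R > D
after[("R",)] -= 4       # four R-only voters ...
after[("P", "R")] += 4   # ... now vote P > R

print(irv_winner(before))
print(irv_winner(after))
```

Before the change, the Democrat is eliminated first and the Progressive wins the runoff 49 to 46. After it, the Republican has 28 first-place votes to the Democrat’s 29, so the Republican is eliminated instead and the Democrat beats the Progressive 46 to 43. The extra support costs the Progressive the election.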

Did this affect the election outcome?

No. RangeVoting’s analysis shows that non-monotonicity could change the outcome of a hypothetical future race in which certain voters increased support for the Progressive candidate. However, as FairVote documents, it is not possible to find any group of voters who could have gotten a preferred candidate elected by ranking him lower on their ballots.

The votes thus displayed the potential for non-monotonicity, but that had no effect on the election outcome.

That tells me that much of the battle over non-monotonicity in the Burlington election is semantic rather than substantive. To clear things up, I think voting reformers need some additional lingo that would (a) allow them to apply the concept of non-monotonicity to individual elections and (b) distinguish between cases when it matters and when it doesn’t.

How would you adapt the term monotonicity to apply to a single election?

My proposal is to distinguish three possibilities for an election outcome:

  • Materially non-monotonic: The votes show that non-monotonicity affected the election outcome.
  • Immaterially non-monotonic: The votes display the potential for non-monotonicity, but it did not affect the election outcome.
  • Monotonic: Neither materially nor immaterially non-monotonic.

To judge whether non-monotonicity affected the election outcome, you would examine whether one of the non-winning candidates would have won if some group of voters had ranked him lower on their ballots. If that’s true, it means those voters would have done better to rank that candidate lower. That demonstrates that the non-monotonicity mattered.

If that’s not true, you would examine whether the winning candidate would lose if some group of voters had ranked him higher on their ballots. If so, that demonstrates that the votes allow non-monotonicity, but it didn’t matter in this case.

In a three-way race, any non-monotonic pair of elections has one member that is material and one that is immaterial.

(Note to voting theorists: I think these definitions are exhaustive and mutually exclusive for a three-candidate race, but I wouldn’t be surprised if things are more complicated with more candidates. Also, apologies in advance if you have already come up with names for these concepts. I didn’t encounter any in my brief research on this topic (other than RangeVoting’s use of Criterion I and Criterion II), but I wouldn’t be surprised if they exist. If so, you should encourage their use by the folks who are debating the Burlington results.)

How then would you describe the 2009 mayoral race in Burlington?

The election outcome displayed immaterial non-monotonicity.

What does the existence of immaterial non-monotonicity tell us about the likelihood of material non-monotonicity?

Not much. In the Burlington election, the hypothesized pattern of vote changes would have to be very specific to flip the outcome from the Progressive to the Democrat. If you take the actual votes as your starting point, the odds are very small that raising the ranking of the Progressive on a randomly selected group of ballots would cause him to lose. In other words, it’s unlikely that this particular election outcome will transform into a materially non-monotonic one.

But that, in turn, tells us little about the broader likelihood of material non-monotonicity. What we really need are complete data from more IRV elections. Examples of materially non-monotonic elections would be much more troubling for IRV than examples (like Burlington 2009) of immaterially non-monotonic elections.

Note: In analyzing the Burlington 2009 results, I used the reported vote tabulations. Burlington also used IRV in 2006 without incident. Following the 2009 election, however, it eliminated IRV. Not surprisingly, voting reformers also dispute why Burlington made that decision.

