6. Questioning and Reforming the election system seem all the rage nowadays.

Question

There are some ways distributed systems folks can help with elections. Someone

at the election office thinks MapReduce could be useful for “instant runoff

voting" in primaries. (Fun fact: several states, including Alaska, now use instant

runoff voting!) Here's how instant runoff voting works. Consider an election

with three candidates on the ballot - A, B, C. Every voter ranks the three

candidates as first preference, second preference, and last preference. Between

any two candidates X and Y, if a majority of voters ranked X above Y, then X

dominates Y (and vice versa)-note that this only takes into account X and Y's

relative rankings, not where they appear in the preference order, or where the

third candidate appears. A Condorcet winner is a candidate that dominates all

other candidates (pair wise) on the ballot. By definition an election can have at

most one Condorcet winner (however, there may be zero).

You are given a dataset of votes from N voters (N is odd and large, and so

dataset is sharded), where each vote V has three fields V.1, V.2, V.3, respectively

for the first, second, and third preference votes of that voter. Each line of input is

one such voter's vote V (input to initial Map function). Write a MapReduce

program that outputs either the unique single Condorcet winner among the three

candidates A, B, or C, or if there is no single Condorcet winner, then it outputs

the list of candidate(s) with the highest Condorcet count (those that dominate the

most number of candidates). For background -- in MapReduce, one writes a

program for Map that processes one input line at a time and outputs zero or

more (key, value) pairs; and one writes a program for Reduce that processes an

input of (key, all values for key). The iteration over input lines is done

automatically by the MapReduce framework. You can assume this data is

already sharded in HDFS and can be loaded from there. Each line is one vote V