Understanding the Question - Count number of friends
The task is to count the number of friends each person has using a MapReduce framework. The input consists of multiple lines, each containing a pair of friend names separated by a space. The goal is to output the number of friends for each person in a specific format.
The input is a number of lines with pairs of name of friends, in the form:
[Friend1] [Friend2]
The output should list the number of friends each person has, in lexicographical order, in the following format:
{"key":"[Person name]","value":"[Number of friends]"}
The Hacker Rank code template includes the MapReduce class and parts
related to input/output handling. However, the mapper and reducer
functions are incomplete. Your task is to fill in these functions so
that the program correctly counts the number of friends for each person
and outputs the result in the required format.
The input format contains a list of single space-separated pairs of friend names. The input handling code to read this data has already been provided. Sample Input:
Joe Sue
Sue Phi
Phi Joe
Phi Alice
The output handling part has also been provided. The output should be a list where each entry has a key (person name) and a value (number of friends), sorted in lexicographical order. Sample Output:
{"key":"Alice","value":"1"}
{"key":"Joe","value":"2"}
{"key":"Phi","value":"3"}
{"key":"Sue","value":"2"}
Solution
Here’s how you can implement the mapper and reducer functions:
- Mapper Function: The
mapperfunction will read each friendship pair and emit each user as a key with a value of 1. - Reducer Function: The
reducerfunction will sum up the values for each user to get the total number of friends.
import sys
from collections import OrderedDict
class MapReduce:
def __init__(self):
self.intermediate = OrderedDict()
self.result = []
def emitIntermediate(self, key, value):
self.intermediate.setdefault(key, [])
self.intermediate[key].append(value)
def emit(self, value):
self.result.append(value)
def execute(self, data, mapper, reducer):
for record in data:
mapper(record)
for key in self.intermediate:
reducer(key, self.intermediate[key])
self.result.sort()
for item in self.result:
print("{\"key\":\""+item[0]+"\",\"value\":\"" + str(item[1]) + "\"}")
mapReducer = MapReduce()
def mapper(record):
# Split the record to get the two users in the pair
user1, user2 = record.strip().split()
# Emit each user with a value of 1
mapReducer.emitIntermediate(user1, 1)
mapReducer.emitIntermediate(user2, 1)
def reducer(key, list_of_values):
# Sum up the values to get the total number of friends
total_friends = sum(list_of_values)
mapReducer.emit((key, total_friends))
if __name__ == '__main__':
inputData = []
for line in sys.stdin:
inputData.append(line)
mapReducer.execute(inputData, mapper, reducer)
Understanding the solution:
- The
MapReduceclass initializes with an ordered dictionaryintermediateto store intermediate results and a listresultto store the final results. - The
emitIntermediatemethod appends values to the list of a specific key in the intermediate dictionary. - The
emitmethod appends the final key-value pairs to theresultlist. - The
executeMethod runs themapperon each input record and then runs thereduceron the intermediate results. It sorts the final results and prints them in the specified format. Mapperfunction splits each input record into two users (assuming the input format is a pair of user IDs separated by a space). It then emits each user with a value of 1 to signify a friendship.Reducerfunction sums the list of values (which are all 1s) for each user to get the total number of friends. It then emits the user ID and the total count.
Pass the input :
1 2
2 3
1 3
4 5
When we run the code on Hacker Rank, the sample Test case is shown as successful:


