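A student grade list has the format shown below. The first line is the header, whose fields are
the student ID, gender, course name 1, course name 2, and so on. Each following line holds one student's record, and fields are separated by whitespace.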
Id gender Math English Physics
301610 male 80 64 78
301611 female 65 87 58
…
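Given any list in the format above (the number of courses may differ from list to list), use functional
programming as far as possible to compute the average, minimum, and maximum score for each course. In addition,
compute the same three statistics for each course separately for male and female students.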
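As a rough illustration of the input format, the sketch below splits the header into course names and turns each record into a (gender, scores) pair. The names `Sheet`, `ParseSketch`, and `parse` are hypothetical helpers introduced only for this example; they are not part of the assignment.

```scala
// Hypothetical container for a parsed grade list (an assumption for this sketch).
final case class Sheet(courses: List[String], rows: List[(String, List[Double])])

object ParseSketch {
  // Split every non-blank line on runs of whitespace; the first line is the header.
  def parse(text: String): Sheet = {
    val lines = text.split("\n").toList
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(_.split("\\s+").toList)
    val courses = lines.head.drop(2) // skip the "Id" and "gender" columns
    val rows = lines.tail.map { fields =>
      (fields(1), fields.drop(2).map(_.toDouble)) // (gender, scores)
    }
    Sheet(courses, rows)
  }

  def main(args: Array[String]): Unit = {
    val sample =
      """Id gender Math English Physics
        |301610 male 80 64 78
        |301611 female 65 87 58""".stripMargin
    val sheet = parse(sample)
    println(sheet.courses)   // List(Math, English, Physics)
    println(sheet.rows.head) // (male,List(80.0, 64.0, 78.0))
  }
}
```

Because the course names come from the header, the same function handles lists with any number of courses.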
Test sample 1:
Id gender Math English Physics
301610 male 80 64 78
301611 female 65 87 58
301612 female 44 71 77
301613 female 66 71 91
301614 female 70 71 100
301615 male 72 77 72
301616 female 73 81 75
301617 female 69 77 75
301618 male 73 61 65
301619 male 74 69 68
301620 male 76 62 76
301621 male 73 69 91
301622 male 55 69 61
301623 male 50 58 75
301624 female 63 83 93
301625 male 72 54 100
301626 male 76 66 73
301627 male 82 87 79
301628 female 62 80 54
301629 male 89 77 72
The expected output for sample 1 is:
course average min max
Math: 69.20 44.00 89.00
English: 71.70 54.00 87.00
Physics: 76.65 54.00 100.00
course average min max (males)
Math: 72.67 50.00 89.00
English: 67.75 54.00 87.00
Physics: 75.83 61.00 100.00
course average min max (females)
Math: 64.00 44.00 73.00
English: 77.63 71.00 87.00
Physics: 77.88 54.00 100.00
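Each row of such a table is a left-aligned course label followed by three numbers with two decimals. The sketch below shows one way to produce a row; the column widths (10, 5, 8, 8) are taken from the format string in the solution code later in this post, while `FormatSketch`, `row`, and the use of `Locale.ROOT` (to keep the decimal point independent of the machine's locale) are assumptions of this example.

```scala
import java.util.Locale

object FormatSketch {
  // Left-align the course label in 10 columns, then print three fixed-point numbers.
  def row(course: String, avg: Double, min: Double, max: Double): String =
    "%-10s%5.2f%8.2f%8.2f".formatLocal(Locale.ROOT, course + ":", avg, min, max)

  def main(args: Array[String]): Unit =
    println(row("Math", 69.2, 44.0, 89.0)) // Math:     69.20   44.00   89.00
}
```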
Test sample 2:
Id gender Math English Physics Science
301610 male 72 39 74 93
301611 male 75 85 93 26
301612 female 85 79 91 57
301613 female 63 89 61 62
301614 male 72 63 58 64
301615 male 99 82 70 31
301616 female 100 81 63 72
301617 male 74 100 81 59
301618 female 68 72 63 100
301619 male 63 39 59 87
301620 female 84 88 48 48
301621 male 71 88 92 46
301622 male 82 49 66 78
301623 male 63 80 83 88
301624 female 86 80 56 69
301625 male 76 69 86 49
301626 male 91 59 93 51
301627 female 92 76 79 100
301628 male 79 89 78 57
301629 male 85 74 78 80
The expected output for sample 2 is:
course average min max
Math: 79.00 63.00 100.00
English: 74.05 39.00 100.00
Physics: 73.60 48.00 93.00
Science: 65.85 26.00 100.00
course average min max (males)
Math: 77.08 63.00 99.00
English: 70.46 39.00 100.00
Physics: 77.77 58.00 93.00
Science: 62.23 26.00 93.00
course average min max (females)
Math: 82.57 63.00 100.00
English: 80.71 72.00 89.00
Physics: 65.86 48.00 91.00
Science: 72.57 48.00 100.00
package com

import scala.io.Source

object test {
  def main(args: Array[String]): Unit = {
    // Path to the data file; adjust it for your machine.
    // Backslashes must be escaped inside an ordinary Scala string literal.
    val inputFile = Source.fromFile("C:\\Users\\hasee\\Desktop\\spark2-3-2.txt")
    // "\\s+" is a regular expression that splits each line on whitespace (spaces or tabs).
    // The data is traversed several times, so toList turns the Iterator into a List.
    // originalData has type List[Array[String]].
    val originalData =
      try inputFile.getLines().map(_.split("\\s+")).toList
      finally inputFile.close() // release the file handle once the lines are read
    val courseNames = originalData.head.drop(2) // course names from the header line
    val allStudents = originalData.tail         // every line after the header
    val courseNum = courseNames.length

    // Statistics function; its argument is the list of rows to summarise.
    // It refers to the outer value courseNum, so it is a closure.
    def statistic(lines: List[Array[String]]) = {
      // For each course the for comprehension yields a triple: total, minimum, maximum.
      (for (i <- 2 to courseNum + 1) yield {
        val temp = lines map { elem => elem(i).toDouble }
        (temp.sum, temp.min, temp.max)
      }) map { case (total, min, max) => (total / lines.length, min, max) }
      // The final map turns each total into an average.
    }

    // Output function.
    def printResult(theresult: Seq[(Double, Double, Double)]): Unit = {
      // zip pairs each course name with its result triple before the traversal.
      (courseNames zip theresult) foreach {
        case (course, result) =>
          val label = course + ":"
          println(f"$label%-10s${result._1}%5.2f${result._2}%8.2f${result._3}%8.2f")
      }
    }

    // Compute and print the statistics for all students.
    val allResult = statistic(allStudents)
    println("course average min max")
    printResult(allResult)

    // Split the rows into two lists by gender.
    val (maleLines, femaleLines) = allStudents partition { _(1) == "male" }

    // Compute and print the statistics for male students.
    val maleResult = statistic(maleLines)
    println("course average min max (males)")
    printResult(maleResult)

    // Compute and print the statistics for female students.
    val femaleResult = statistic(femaleLines)
    println("course average min max (females)")
    printResult(femaleResult)
  }
}
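The index-based for comprehension in `statistic` can also be expressed column-wise. The sketch below is an alternative rather than the author's method: it uses `transpose` to turn rows of scores into one list per course and `groupBy` to split students by gender. The names `TransposeSketch`, `Stats`, and `byGender` are hypothetical, and the data is the first three students of test sample 1.

```scala
object TransposeSketch {
  type Stats = (Double, Double, Double) // (average, min, max)

  // One Stats triple per course; `rows` holds the score columns only.
  def statistic(rows: List[List[Double]]): List[Stats] =
    rows.transpose.map(col => (col.sum / col.length, col.min, col.max))

  def main(args: Array[String]): Unit = {
    // First three students of test sample 1 as (gender, scores).
    val students = List(
      ("male",   List(80.0, 64.0, 78.0)),
      ("female", List(65.0, 87.0, 58.0)),
      ("female", List(44.0, 71.0, 77.0)))

    // Group by the gender label, then summarise each group's score rows.
    val byGender: Map[String, List[Stats]] =
      students.groupBy(_._1).map { case (g, group) => g -> statistic(group.map(_._2)) }

    println(statistic(students.map(_._2)).head) // Math, all three: (63.0,44.0,80.0)
    println(byGender("female").head)            // Math, females: (54.5,44.0,65.0)
  }
}
```

One consequence of this design is that a gender with no students simply does not appear in the map, whereas calling `min` on an empty list would throw an exception.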
Program output: