More MapReduce

  • Counting all Indian citizens, as in the Aadhaar data, can be very difficult with just SQL or plain Python
  • For data at the scale of Aadhaar, we have little choice but to use MapReduce

  • Mapper and reducer functions for the Aadhaar generated counts
  • From the dataset, we only take the district name (column 3) and the population count of that district (column 8)
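A minimal sketch of what such a mapper and reducer could look like in Python. It is not the code from the lecture: the CSV layout, the header row, the zero-based reading of "column 3" and "column 8", and the names `mapper`, `reducer`, `run` and `sample` are all assumptions made for illustration. The sort between the two steps stands in for the shuffle phase that Hadoop performs.

```python
import csv
from itertools import groupby


def mapper(lines, district_col=3, count_col=8):
    """Emit (district, count) pairs from CSV lines.

    Assumptions: the first line is a header, and the column
    numbers are zero-based indices into each row.
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the assumed header row
    for row in reader:
        if len(row) <= max(district_col, count_col):
            continue  # skip rows that are too short
        try:
            count = int(row[count_col])
        except ValueError:
            continue  # skip rows whose count is not an integer
        yield row[district_col], count


def reducer(pairs):
    """Sum the counts per district.

    The pairs must arrive sorted by district, which is what the
    shuffle/sort phase guarantees between map and reduce.
    """
    for district, group in groupby(pairs, key=lambda kv: kv[0]):
        yield district, sum(count for _, count in group)


def run(lines):
    """Chain map -> sort -> reduce in memory (hypothetical helper)."""
    return dict(reducer(sorted(mapper(lines))))


# Made-up sample rows; only columns 3 and 8 matter here.
sample = [
    "c0,c1,c2,district,c4,c5,c6,c7,count",
    "a,b,c,Pune,x,x,x,x,5",
    "a,b,c,Agra,x,x,x,x,2",
    "a,b,c,Pune,x,x,x,x,3",
]

if __name__ == "__main__":
    print(run(sample))
```

On a real cluster the mapper and reducer would run as separate processes over chunks of the file; the in-memory `run` helper only shows how the two pieces fit together.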

  • The problems solved earlier can be handled by a simple Python script
  • But things get significantly harder once big data comes in

  • Hadoop is an open-source implementation of the MapReduce programming model
  • Many libraries are built on top of Hadoop's MapReduce
  • Two of the most popular are Hive and Pig. Hive was developed by Facebook and lets you write SQL-like queries; Pig was developed by Yahoo and can express some things that Hive can't
  • Other Hadoop-based libraries exist as well, as noted above