-
Counting all indian citizens like in adhaar data, may be very difficult with just SQL/Python
-
We have no choice unless using adhaar data
-
Mapper and reducer function for addhar generated
-
From database, we only took name of district(column 3) and the count of population of the district(column 8)
-
Imagine problems solved earlier can be handle by simple python script
-
But things could be significantly harder when big datas coming in
-
Hadoop is map reduce programming model
-
Many library use hadoop as map reduce on top of it
-
2 of the top hadoop is Hive and Pig. Hive developed by Facebook and allow hive sql-like, pig developed by Yahoo and produce some of the hive can't
-
Also others hadoop library stated above