- 
    This lesson shows how to analyze data using a Twitter dataset.
   
   
  
  
  
   - 
    The dataset is social network data in general, and in this case it comes from Twitter.
   
- 
    The lesson introduces the Aggregation Framework, MongoDB's powerful data analysis tool, to analyze the kind of data we have been working with.
   
   
  
  
  
   - 
    Here are the steps to find the user who tweeted the most, based on the structure of the Twitter data above.
   
   - 
    The Aggregation Framework in MongoDB implements these steps.
    
 
- 
    The framework uses a pipeline to solve the problem.
   
- 
    First it uses the $group operator. The _id (which is unique) means that we group all tweets by the user's screen name. Here "$user.screen_name" is not an operator but a reference to the value of the "user.screen_name" field. Then, for every tweet with the same screen name, count is incremented by one.
   
- 
    The $sort stage then sorts the results by count, in descending (-1) order.
   
- 
    These are the two stages performed by the aggregation pipeline.
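
The two stages can be sketched in Python as follows. This is a minimal sketch, assuming the tweets are reachable through a PyMongo-style collection object; the function name and the database and collection names in the usage comment are hypothetical.

```python
# Two-stage pipeline: count tweets per user, then sort by that count.
TOP_TWEETERS_PIPELINE = [
    # "$user.screen_name" (with the leading $) refers to the field's value.
    {"$group": {"_id": "$user.screen_name", "count": {"$sum": 1}}},
    # -1 sorts in descending order, so the most active user comes first.
    {"$sort": {"count": -1}},
]


def top_tweeters(collection):
    """Run the pipeline on a PyMongo-style collection and return a list."""
    return list(collection.aggregate(TOP_TWEETERS_PIPELINE))


# Usage, assuming a local MongoDB and hypothetical database/collection names:
#   from pymongo import MongoClient
#   tweets = MongoClient()["twitter"]["tweets"]
#   print(top_tweeters(tweets)[0])
```

The pipeline is just a list of stage documents, so the order of the list is the order in which the stages run.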
   
   
  
  
   
  
  
   
    
   
    
   
  
   - 
    An aggregation pipeline can consist of a single stage or a series of stages to produce a result.
   
- 
    Here we reshape the tweets into an intermediate form (based on what we want) and then perform the sorting in the $sort stage.
   
- 
    Aggregation operators:
    
     - 
      $project: reshapes the documents so that they contain just the fields we want, either for the next stage or as the final result.
     
- 
      $match: filters documents.
     
- 
      $group: combines multiple documents (those sharing the given key) into a single document, whose fields are computed by accumulator operators. The $group accumulators are as follows:
      
       - 
        $sum
       
- 
        $first
       
- 
        $last
       
- 
        $max
       
- 
        $min
       
- 
        $avg
       
- 
        $push: works with arrays; collects every value into an array, duplicates included.
       
- 
        $addToSet: works with arrays; treats the array as a set, so a value is added only if it is not already present.
       
 
- 
      $skip: skips a given number of documents from the start.
     
- 
      $limit: limits the number of documents; 3 means only the first three pass through.
     
- 
      $unwind: expands an array field of a document into multiple documents that share the same data but each hold a single element of the array. This is useful with Twitter data, where we may want to group by hashtag.
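
To make $unwind concrete, here is a rough pure-Python illustration of its effect, followed by a pipeline that uses it to count hashtags. The `unwind` helper is a hypothetical simplification (top-level fields only), not MongoDB's implementation, and the pipeline assumes the hashtags live under `entities.hashtags` with a `text` field, as in Twitter's tweet format.

```python
def unwind(documents, field):
    """Yield one copy of each document per element of its array `field`.

    Documents without the field (or with an empty array) yield nothing.
    """
    for doc in documents:
        for value in doc.get(field, []):
            out = dict(doc)      # shallow copy keeps the other fields
            out[field] = value   # replace the array with a single element
            yield out


tweet = {"user": "alice", "hashtags": ["mongodb", "python"]}
for doc in unwind([tweet], "hashtags"):
    print(doc)
# {'user': 'alice', 'hashtags': 'mongodb'}
# {'user': 'alice', 'hashtags': 'python'}

# The same idea as an aggregation pipeline: the three most used hashtags.
HASHTAG_PIPELINE = [
    {"$unwind": "$entities.hashtags"},
    {"$group": {"_id": "$entities.hashtags.text", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 3},
]
```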
     
 
   
  
  
  
   - 
    This produces a 4-stage aggregation pipeline.
   
   - 
    friends: who I follow
   
- 
    followers: who follows me
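
The four stages themselves are not reproduced in the text, so the following is only an assumed example of what a 4-stage pipeline over friends and followers might look like: it finds the user with the highest followers-to-friends ratio. The field names `user.friends_count` and `user.followers_count` follow Twitter's tweet format; the `ratio` and `screen_name` output fields and the function name are hypothetical.

```python
# Assumed example: which user has the highest followers-to-friends ratio?
RATIO_PIPELINE = [
    # 1. Keep only users with at least one friend and one follower
    #    (this also avoids dividing by zero in the next stage).
    {"$match": {"user.friends_count": {"$gt": 0},
                "user.followers_count": {"$gt": 0}}},
    # 2. Reshape each tweet into just the fields we need.
    {"$project": {
        "screen_name": "$user.screen_name",
        "ratio": {"$divide": ["$user.followers_count",
                              "$user.friends_count"]},
    }},
    # 3. Highest ratio first.
    {"$sort": {"ratio": -1}},
    # 4. Keep only the top document.
    {"$limit": 1},
]


def highest_ratio(collection):
    """Run the pipeline on a PyMongo-style collection."""
    return list(collection.aggregate(RATIO_PIPELINE))
```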
   
   - 
    This is the query that finds who included the most user mentions.
   
   - 
    This produces the unique hashtags as an array, with no repeated values.
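
A pipeline along these lines could collect the unique hashtags per user. This is a sketch under assumptions: the hashtag text is taken to live at `entities.hashtags.text`, and the output field name `unique_hashtags` is hypothetical.

```python
# Collect each user's distinct hashtags with $addToSet.
UNIQUE_HASHTAGS_PIPELINE = [
    # One document per hashtag occurrence.
    {"$unwind": "$entities.hashtags"},
    # $addToSet ignores values already in the array, unlike $push.
    {"$group": {
        "_id": "$user.screen_name",
        "unique_hashtags": {"$addToSet": "$entities.hashtags.text"},
    }},
]


def unique_hashtags_by_user(collection):
    """Run the pipeline on a PyMongo-style collection."""
    return list(collection.aggregate(UNIQUE_HASHTAGS_PIPELINE))
```

Swapping `$addToSet` for `$push` in the same stage would keep every occurrence, repeats included.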
   
   
  
  
   
  
  
  
   - 
    A pipeline can contain multiple stages that use the same operator.
   
- 
    This one finds the user with the most unique user mentions (the user who mentions the greatest number of distinct users).
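
One way to express this, with $unwind and $group each appearing twice, is sketched below. The intermediate field name `mentioned` and the limit of 10 are hypothetical choices; the mention's screen name is assumed to live at `entities.user_mentions.screen_name`.

```python
# Who mentions the largest number of distinct users?
UNIQUE_MENTIONS_PIPELINE = [
    # One document per mention.
    {"$unwind": "$entities.user_mentions"},
    # Per author, build the set of distinct mentioned screen names.
    {"$group": {
        "_id": "$user.screen_name",
        "mentioned": {"$addToSet": "$entities.user_mentions.screen_name"},
    }},
    # Unwind the set so each distinct mention is its own document again.
    {"$unwind": "$mentioned"},
    # Count the distinct mentions per author.
    {"$group": {"_id": "$_id", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 10},
]


def top_unique_mentioners(collection):
    """Run the pipeline on a PyMongo-style collection."""
    return list(collection.aggregate(UNIQUE_MENTIONS_PIPELINE))
```

The second $unwind and $group pair exists only to turn the size of each set into a number that $sort can order by.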
   
   - 
    We can index our database to speed up our queries.
   
- 
    To do this, we list the indexed fields from left to right in the order our queries use them: hashtag first, then username.
   
- 
    Keep in mind that although reads become faster, writes become slower, because the index has to be updated on every write.
    
 
   - 
    Here are the index commands from the mongo shell.
   
- 
    If we execute the second line, it takes a few seconds to run, because the dataset has 7 million documents.
   
- 
    But once we create an index on tg, the same query returns results immediately.
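
From Python, the same idea might look like the sketch below. It assumes a PyMongo-style collection and the `tg` field named above; the function names are hypothetical, and `ASCENDING` is defined locally with the same value as `pymongo.ASCENDING` so the snippet has no imports.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def index_hashtags(collection):
    """Create a single-field ascending index on `tg`; returns the index name."""
    return collection.create_index([("tg", ASCENDING)])


def tweets_with_tag(collection, tag):
    """Find documents whose `tg` field matches `tag`.

    With the index in place, MongoDB can answer this without scanning
    every document in the collection.
    """
    return list(collection.find({"tg": tag}))
```

The index only needs to be created once; after that, every write to the collection also updates it.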
   
   - 
    We can give the field any name (e.g. location), but its value must follow the [x, y] format.
    
 
 
- 
    Then we can query the indexed field with the $near operator.
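
A minimal Python sketch of the geospatial case follows, assuming a PyMongo-style collection whose documents store an [x, y] pair in a field called `location`. The function names and the default of three results are hypothetical; `GEO2D` is defined locally with the same value as `pymongo.GEO2D`.

```python
GEO2D = "2d"  # same value as pymongo.GEO2D


def index_location(collection):
    """Create a 2d geospatial index on the `location` field."""
    return collection.create_index([("location", GEO2D)])


def nearest(collection, x, y, n=3):
    """Return up to `n` documents, closest to the point [x, y] first."""
    cursor = collection.find({"location": {"$near": [x, y]}}).limit(n)
    return list(cursor)
```

$near relies on the geospatial index, so `index_location` has to run before `nearest` is used.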