Data Formats

  • These are the three most common data formats: CSV, XML, and JSON
  • Here we reuse the baseball player data from earlier; the first record we fetched happens to belong to a famous baseball player
  • In a CSV file, each row represents one example
  • Within a row, the values are separated by commas, one value per parameter of that example
  • For missing data in CSV, we leave the field blank but keep the separating commas
  • In the XML data format, the same data is represented as shown above
  • For missing data in XML, we write the element as an empty tag closed with a slash (a self-closing tag)
  • JSON is a lot like Python dictionaries
  • We can have nested structures, as shown above
  • To get a better understanding of XML structure, you may want to check Udacity's Data Wrangling with MongoDB course
  • For now, we stick to the CSV and JSON data formats
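
As a rough illustration of the CSV and JSON points above, here is a minimal sketch using Python's standard csv and json modules. The player names, field names, and values are hypothetical placeholders, not the actual baseball data, and the use of null for the missing JSON value is an assumption rather than something taken from the notes.

```python
import csv
import io
import json

# Hypothetical CSV text: one header row, then one row per example.
# Player B's weight is missing, so the field is left blank,
# but the separating comma is still there.
csv_text = (
    "name,height,weight\n"
    "Player A,72,180\n"
    "Player B,70,\n"
)

# DictReader maps each row to a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# A blank CSV field is read as an empty string, not as None,
# and every value comes back as a string.
missing_weight = rows[1]["weight"]

# Hypothetical JSON text for one player, with a nested object.
# Assumption: the missing weight is written as null.
json_text = """
{
  "name": "Player B",
  "height": 70,
  "weight": null,
  "career": {"first_year": 2001, "teams": ["X", "Y"]}
}
"""

# json.loads returns an ordinary Python dict; null becomes None.
player = json.loads(json_text)

# Nested structures are reached by chaining keys.
first_year = player["career"]["first_year"]
```

Note the difference in how the two formats come back: the csv module gives every value as a string (so a missing value is ""), while json keeps numbers as numbers and turns null into None.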