Database Schema
- A schema is a blueprint for how data is entered: every record must follow the schema provided. Where a value is missing or unknown, NULL must be specified.
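As a rough illustration of the point above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `users` and its columns are hypothetical, chosen only for this example. It shows a row that follows the schema with an unknown value stored as NULL, and a row that breaks the schema being rejected.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"   # required: every row must supply a name
    " email TEXT"            # optional: may be NULL when unknown
    ")"
)

# A row that follows the schema; the unknown email is stored as NULL (None in Python).
conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("Ada", None))

# A row that violates the schema (the required name is missing) is rejected.
try:
    conn.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)", (None, "x@example.com")
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT id, name, email FROM users").fetchall()
```

Only the first insert ends up in the table; the second fails the `NOT NULL` constraint on `name`.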
Summary
- Given a labeled dataset, there are many supervised learning algorithms to choose from.
Ceiling Analysis: What Part of the Pipeline to Work on Next
- The conclusion is that time is a valuable resource, so developers and ML practitioners should choose carefully which part of the pipeline to spend it on.
Getting Lots of Data and Artificial Data Synthesis
- It can now be deduced that to build a powerful learning algorithm, we should train a low-bias algorithm on a large amount of data.
Sliding Windows
- In previous videos, we talked about the pipeline, where the input is passed through several stages of ML processing.
Problem Description and Pipeline
The following set of videos will talk about:
- How we handle a big problem that involves very complex machine learning.
Map-Reduce and Data Parallelism
- Often the dataset is so large that we have to use more than one machine (for example, a Hadoop cluster).
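To make the idea of splitting work across machines concrete, here is a minimal sketch that simulates it in a single Python process. It assumes, purely for illustration, that the work being split is the gradient of a linear regression cost; the function names `partial_gradient` and `map_reduce_gradient` are hypothetical, and each "machine" is just a slice of a list rather than a real cluster node.

```python
def partial_gradient(chunk, theta):
    """Map step: sum of (prediction - y) * x over one chunk (one simulated machine)."""
    grad = [0.0] * len(theta)
    for x, y in chunk:
        error = sum(t * xi for t, xi in zip(theta, x)) - y
        for j, xi in enumerate(x):
            grad[j] += error * xi
    return grad


def map_reduce_gradient(data, theta, n_machines):
    """Split the data, compute partial sums separately, then combine them."""
    # Split the data into roughly equal chunks, one per simulated machine.
    chunks = [data[i::n_machines] for i in range(n_machines)]
    # Map: each machine computes the gradient sum over its own chunk.
    partials = [partial_gradient(chunk, theta) for chunk in chunks]
    # Reduce: a central step adds the partial sums and averages over all examples.
    total = [sum(p[j] for p in partials) for j in range(len(theta))]
    return [g / len(data) for g in total]


# Toy data: x = [1, v] (bias term plus one feature), y = 2*v + 1.
data = [([1.0, float(v)], 2.0 * v + 1.0) for v in range(8)]
theta = [0.0, 0.0]
single = map_reduce_gradient(data, theta, n_machines=1)
split = map_reduce_gradient(data, theta, n_machines=4)
```

Because the gradient is a sum over examples, `single` and `split` agree up to floating-point rounding; on a real cluster the map step would run in parallel on separate machines.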
Online Learning
- An online learning algorithm is used when the input is streamed: a continuous flow of data that does not stop. The goal is to have the learning algorithm keep learning as the data streams in, so that it makes better and better decisions live.
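A minimal sketch of this idea follows, assuming a logistic regression model updated with one gradient step per incoming example. The model choice, the learning rate `alpha`, and the function names are assumptions made for illustration; the notes above do not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def online_update(theta, x, y, alpha=0.1):
    """One gradient step on a single (x, y) example."""
    pred = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return [t - alpha * (pred - y) * xi for t, xi in zip(theta, x)]


def learn_from_stream(stream, n_features, alpha=0.1):
    """Update the parameters as examples arrive; each example is used once, then discarded."""
    theta = [0.0] * n_features
    for x, y in stream:
        theta = online_update(theta, x, y, alpha)
    return theta


# Toy stream: x = [1, v] (bias term plus one feature); label is 1 when v > 0, else 0.
stream = iter([([1.0, 1.0], 1), ([1.0, -1.0], 0)] * 50)
theta = learn_from_stream(stream, n_features=2)
prob_positive = sigmoid(theta[0] * 1.0 + theta[1] * 1.0)
```

Since the loop only ever holds the current example, the same code works whether `stream` is a short list or an endless generator fed by live data.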