Ceiling Analysis: What Part of the Pipeline to Work on Next
-
So we have arrived at the conclusion that the most valuable resource in an ML project is the time of the developers working on it.
-
If we spend that time on the wrong component, we may waste a lot of it and get only a small increase in performance.
-
This video talks about ceiling analysis, and how we use it to prioritise which part of our pipeline to work on next.
-
In ceiling analysis, we break the problem down into the stages of its pipeline and identify which stage would benefit the most from our effort.
-
On the bottom right, we have the accuracy of the overall system.
-
Suppose we have run our algorithm end to end and it predicts characters with 72% accuracy.
-
Then, manually set the labels for the first stage, text detection.
-
By setting the text detection labels manually (that is, supplying the correct answers instead of running the learning algorithm for that stage), we give text detection 100% accuracy, for that stage only. Then we run the rest of the pipeline to the end, and overall accuracy increases to 89%.
-
Then we move on to the second stage, character segmentation. We manually supply perfect labels for both text detection and character segmentation, then run the rest of the pipeline again. Accuracy increases to 90%.
-
Finally, supplying perfect labels for every stage of the pipeline gives us 100% accuracy.
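The procedure above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ceiling_analysis` and `make_stage` helpers, the toy stages and their failure sets are all hypothetical and do not come from the video. Each stage has a learned version and an "oracle" version that returns the ground truth, and we measure overall accuracy with the first k stages replaced by their oracles.

```python
def ceiling_analysis(stages, examples, is_correct):
    """Overall accuracy with the first k stages made perfect, for k = 0..len(stages).

    stages: list of (name, learned_fn, oracle_fn); each fn takes (example, previous_output).
    """
    results = []
    for k in range(len(stages) + 1):
        correct = 0
        for ex in examples:
            out = None
            for i, (_, learned, oracle) in enumerate(stages):
                # Stages before index k use ground truth; the rest use the learned model.
                fn = oracle if i < k else learned
                out = fn(ex, out)
            correct += is_correct(ex, out)
        label = "baseline" if k == 0 else f"perfect through {stages[k - 1][0]}"
        results.append((label, correct / len(examples)))
    return results


def make_stage(name, fail_ids):
    """Hypothetical toy stage: the learned version fails on the examples in fail_ids."""
    def learned(ex, prev):
        # An error made upstream cannot be repaired downstream.
        if prev == "bad" or ex["id"] in fail_ids:
            return "bad"
        return "ok"

    def oracle(ex, prev):
        # Manually supplied ground truth is always right.
        return "ok"

    return (name, learned, oracle)


# Toy data (made up for illustration): 10 examples, each stage fails on a few.
examples = [{"id": i} for i in range(10)]
stages = [
    make_stage("text detection", {0, 1}),
    make_stage("character segmentation", {2}),
    make_stage("character recognition", {3, 4}),
]

results = ceiling_analysis(stages, examples, lambda ex, out: out == "ok")
for label, acc in results:
    print(f"{label}: {acc:.0%}")
```

In a real system the oracle for a stage would be the hand-labelled output for that stage, which is why ceiling analysis needs ground truth for every intermediate step and not only for the final label.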
-
If we look at the increases in accuracy, there is a 17% increase, a 1% increase and a 10% increase. That means text detection deserves most of our effort, character recognition perhaps comes next, and we should not waste time on character segmentation (even a perfect version gains only 1%).
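The arithmetic can be written out as a small script. The accuracies are the ones quoted above; the `stage_gains` helper is a hypothetical name used only for this sketch.

```python
# Overall accuracy (%) after each stage, in pipeline order, is made perfect.
accuracies = [
    ("overall system (baseline)", 72.0),
    ("text detection", 89.0),
    ("character segmentation", 90.0),
    ("character recognition", 100.0),
]


def stage_gains(accuracies):
    """Gain attributable to perfecting each stage: accuracy minus the previous row."""
    gains = []
    for (_, prev_acc), (name, acc) in zip(accuracies, accuracies[1:]):
        gains.append((name, acc - prev_acc))
    return gains


gains = stage_gains(accuracies)

# Rank the stages by potential upside, largest first.
for name, gain in sorted(gains, key=lambda item: item[1], reverse=True):
    print(f"{name}: +{gain:.0f} points")
```

Sorting by gain puts text detection first (17 points), then character recognition (10 points), then character segmentation (1 point), which matches the priority order argued for above.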
-
Let's move on to the next example.
-
This is a face recognition example, a simplified process that we step through in order to better understand ceiling analysis.
-
First we remove the background. Then we do face detection.
-
The detected face is divided into three segments, of which the eyes would be the most important. All the segments are then fed to logistic regression, which produces the label: the name of the person.
-
A real system is probably more complicated, but this is enough to illustrate the process.
-
Again we apply ceiling analysis to each stage of the process.
-
This gives us three components where our effort is better spent (marked in magenta).
-
When we make the background removal algorithm perfect, our performance increases by just 0.1%.
-
There was once a team of two engineers who spent 18 months trying to perfect background removal.
-
They published papers on it and concluded that their algorithm did not increase the overall performance.
-
If only one of them had performed ceiling analysis first, they would not have wasted that effort.
-
So ceiling analysis gives us insight into where we can actually increase our performance.
-
Andrew Ng's experience over the years in ML has taught him not to trust gut feeling and to rely on ceiling analysis instead, as it gives a clear priority for what we should work on.