Ceiling Analysis: What Part of the Pipeline to Work on Next

  • We have established that a developer's time is one of the most valuable resources on an ML project.
  • If that time is spent on the wrong component, we may invest a lot of effort for only a small gain in performance.
  • This video covers ceiling analysis and how we can use it to decide which part of the pipeline to work on next.

  • In ceiling analysis, we break the problem's pipeline into its components and determine which component would benefit most from our effort.
  • On the bottom right, we have the accuracy of the overall system.
  • Suppose we run the full system and it recognises characters with 72% accuracy.
  • Next, manually supply the correct labels for the first stage, text detection.
  • That is, instead of running the learning algorithm for text detection, we hand-label its output so that this stage alone is 100% accurate. We then run the rest of the pipeline, and overall accuracy rises to 89%.
  • Then we move on to the second stage, character segmentation. With hand-labelled, perfect output for both text detection and character segmentation, we run the pipeline again, and accuracy rises to 90%.
  • Finally, supplying perfect labels for every stage brings accuracy to 100%.
  • Looking at the increases, we see gains of 17, 1, and 10 percentage points. So text detection deserves most of our effort, followed perhaps by character recognition. Character segmentation is not worth the time, since even a perfect version adds only 1 point.
  • Let's move on to the next example.
  • This is a face recognition example, a simplified pipeline that we step through to better understand ceiling analysis.
  • First we remove the background, and then we run face detection.
  • The face is divided into three segments, of which the eyes are the most important. All the segments are then fed to a logistic regression classifier, which outputs the person's label, i.e. their name.
  • A real system is probably more complicated; this is just to illustrate the process.
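The per-stage gains in the text-recognition example above can be tabulated with a short script. This is a minimal sketch: the accuracy figures are the ones quoted above, while the function name `ceiling_gains` and the stage labels are illustrative choices, not something taken from the lecture.

```python
def ceiling_gains(baseline, stages):
    """Return (stage, gain) pairs, largest gain first.

    `stages` lists (name, accuracy) in pipeline order, where each accuracy
    is the overall system accuracy once that stage and every earlier stage
    have been given perfect, hand-labelled output.
    """
    gains = []
    previous = baseline
    for name, accuracy in stages:
        # Gain attributable to making this stage perfect.
        gains.append((name, accuracy - previous))
        previous = accuracy
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Overall accuracies (in percent) from the text-recognition example.
ranked = ceiling_gains(
    72,
    [
        ("text detection", 89),
        ("character segmentation", 90),
        ("character recognition", 100),
    ],
)

for name, gain in ranked:
    print(f"{name}: +{gain} points")
```

Sorting by gain puts text detection (+17) first and character segmentation (+1) last, which matches the priority order reached above.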

  • Again, we apply ceiling analysis to each stage of the pipeline.
  • This shows three components where our effort is best spent (marked in magenta).
  • When we make background removal perfect, overall performance increases by only 0.1%.
  • There was once a team of two engineers who spent 18 months perfecting background removal.
  • They published papers and concluded that their algorithm did not improve overall performance.
  • Had either of them performed ceiling analysis first, they would not have wasted that effort.
  • So ceiling analysis gives us insight into where to work in order to improve performance.
  • Andrew Ng's experience over the years in ML has taught him not to trust gut feeling but to rely on ceiling analysis, since it gives a clear indication of what to prioritise.
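The procedure itself (replace the first k stages with ground truth, then run the remaining stages and measure end-to-end accuracy) can be sketched generically. Everything below is an assumption made for illustration: the `ceiling_analysis` helper, the two toy stages `noisy_double` and `noisy_increment`, and the integer data are stand-ins with deliberately injected errors, not the pipelines discussed in the lecture.

```python
def ceiling_analysis(stages, stage_truths, inputs, labels):
    """Return [(name, accuracy), ...] for the baseline and each ceiling step.

    stages       -- list of (name, function) in pipeline order
    stage_truths -- stage_truths[k] holds the correct output of stage k
                    for every example (the last entry equals `labels`)
    """
    def accuracy(first_live_stage, stage_inputs):
        # Run only the stages from `first_live_stage` onward.
        correct = 0
        for value, label in zip(stage_inputs, labels):
            for _, fn in stages[first_live_stage:]:
                value = fn(value)
            correct += value == label
        return correct / len(labels)

    results = [("baseline", accuracy(0, inputs))]
    for k, (name, _) in enumerate(stages):
        # Stages 0..k are treated as perfect: feed stage k's truth onward.
        results.append((name, accuracy(k + 1, stage_truths[k])))
    return results


# Hypothetical toy pipeline: double the input, then add one.
inputs = list(range(10))
double_truth = [2 * x for x in inputs]
labels = [y + 1 for y in double_truth]


def noisy_double(x):
    # Deliberately wrong whenever x is a multiple of 5.
    return 2 * x + 1 if x % 5 == 0 else 2 * x


def noisy_increment(y):
    # Deliberately wrong whenever y is a multiple of 6.
    return y if y % 6 == 0 else y + 1


results = ceiling_analysis(
    [("double", noisy_double), ("increment", noisy_increment)],
    [double_truth, labels],
    inputs,
    labels,
)

for name, acc in results:
    print(f"{name}: {acc:.0%}")
```

On this toy data, perfecting the first stage lifts accuracy only from 50% to 60%, while perfecting the second lifts it to 100%, so the second stage would be the one to work on.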