This is a step-by-step tutorial on using TensorFlow.js to run a real-time object detection web application on your webcam feed. The model used is COCO-SSD with a MobileNet v2 backbone.
It should work in real time. At least the architecture looks like an end-to-end neural network, so, like other CNN-based models, it should run in real time after quantization and similar optimizations.
Yeah, nowadays, with post-training quantization techniques, compact architectures like SqueezeNet, and quantization-aware training, models are becoming small enough to run smoothly on a phone.
I can think of a couple of use cases, though none of them is particularly compelling. Imagine you had the pose data for the image and the pose data of someone else, even an imaginary someone else, thanks to GANs.
If it was shown at Google I/O, it is probably a consumer use case? Although there are a lot of commercial use cases for this technology. For example, you could train a simple classifier on top of the pose output to record time spent on different types of activities, which is useful for measuring productivity.
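As a toy illustration of that "simple classifier on top of pose" idea (all names, keypoints, and data below are invented), a nearest-centroid classifier over flattened keypoint coordinates might look like:

```python
import math

def centroid(vectors):
    # component-wise mean of a list of equal-length pose vectors
    return [sum(c) / len(vectors) for c in zip(*vectors)]

def classify(pose, centroids):
    # label whose centroid is closest in Euclidean distance
    return min(centroids, key=lambda lbl: math.dist(pose, centroids[lbl]))

# toy 4-value "poses" (e.g. two keypoints' x/y), two labelled activities
train = {
    "typing":   [[0.1, 0.2, 0.1, 0.3], [0.2, 0.2, 0.1, 0.4]],
    "standing": [[0.9, 0.8, 0.9, 0.9], [0.8, 0.9, 0.9, 0.8]],
}
centroids = {lbl: centroid(vs) for lbl, vs in train.items()}
print(classify([0.15, 0.2, 0.1, 0.35], centroids))  # "typing"
```

In practice you would normalize the keypoints (e.g. relative to the torso) and use a proper classifier, but the pipeline shape is the same: pose vector in, activity label out.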
The speed at which China seems to be adopting and implementing the latest deep learning technology into everyday life is truly astounding. There is no doubt it is going to be one of the top countries contending to be the world leader in AI over the next decade or so.
Simple: One-sentence method summary: use keypoint detection to find the bounding-box center point, then regress all other object properties, such as bounding-box size, 3D information, and pose.
Versatile: The same framework works for object detection, 3D bounding-box estimation, and multi-person pose estimation with minor modifications.
Fast: The whole process runs in a single network feed-forward pass. No NMS post-processing is needed. Our DLA-34 model runs at 52 FPS with 37.4 COCO AP.
Strong: Our best single model achieves 45.1 AP on COCO test-dev.
Easy to use: We provide a user-friendly testing API and webcam demos.
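A rough sketch of the center-point decoding idea in plain Python (my own simplification, not the repository's code): find local maxima on a per-class heatmap and read the regressed width/height at each peak. The real model also regresses a sub-pixel center offset and uses max-pooling for peak extraction, which is why no NMS is needed.

```python
def is_peak(heat, y, x):
    # 3x3 local maximum stands in for the max-pooling peak pick
    h, w = len(heat), len(heat[0])
    v = heat[y][x]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and heat[ny][nx] > v:
                return False
    return True

def decode(heat, wh, thresh=0.5):
    # turn heatmap peaks + regressed (w, h) maps into (x1, y1, x2, y2, score)
    boxes = []
    for y in range(len(heat)):
        for x in range(len(heat[0])):
            if heat[y][x] >= thresh and is_peak(heat, y, x):
                bw, bh = wh[y][x]
                boxes.append((x - bw / 2, y - bh / 2,
                              x + bw / 2, y + bh / 2, heat[y][x]))
    return boxes

# toy single-class heatmap with one confident center at (x=4, y=3)
heat = [[0.0] * 8 for _ in range(8)]
heat[3][4] = 0.9
wh = [[(2.0, 2.0)] * 8 for _ in range(8)]
print(decode(heat, wh))  # [(3.0, 2.0, 5.0, 4.0, 0.9)]
```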
Manual searching is time-consuming since you need to wait for the results of each experiment. It becomes impractical once the number of hyperparameters exceeds 8-10, and you will probably end up tuning only the few you think are relevant. You also need a lot of experience tuning hyperparameters; otherwise your tuning is as good as random.
Given these disadvantages of manual tuning, Bayesian optimization seems like the most promising technique: it needs far fewer "choose -> train -> evaluate" loops because it uses information from previous runs to select the next set of hyperparameters (similar to what a human would do).
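To make that loop concrete, here is a toy, self-contained sketch of the idea (not a production library): a tiny 1-D Gaussian-process surrogate with an RBF kernel plus a lower-confidence-bound acquisition, tuning a single hypothetical hyperparameter (log10 of the learning rate) against an invented quadratic "validation loss".

```python
import math, random

def rbf(a, b, ls=0.3):
    # squared-exponential kernel over a 1-D hyperparameter
    return math.exp(-((a - b) ** 2) / (2 * ls * ls))

def solve(A, b):
    # Gaussian elimination with partial pivoting (A is tiny, so this is fine)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, xq, noise=1e-4):
    # GP posterior mean/variance at query point xq given observations (xs, ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(xq, a) for a in xs]
    alpha = solve(K, ys)   # K^-1 y
    v = solve(K, k_star)   # K^-1 k*
    mu = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(xq, xq) - sum(k * w for k, w in zip(k_star, v)), 1e-12)
    return mu, var

def loss(lr_log10):
    # invented "validation loss"; the best learning rate here is 10**-2.5
    return (lr_log10 + 2.5) ** 2 + 0.1

random.seed(0)
xs = [random.uniform(-5.0, -1.0) for _ in range(3)]  # 3 random warm-up trials
ys = [loss(x) for x in xs]
grid = [-5.0 + 4.0 * i / 200 for i in range(201)]
for _ in range(10):
    def lcb(xq):  # lower confidence bound: prefer low mean or high uncertainty
        mu, var = gp_posterior(xs, ys, xq)
        return mu - 2.0 * math.sqrt(var)
    x_next = min(grid, key=lcb)      # "choose" using all previous runs
    xs.append(x_next)
    ys.append(loss(x_next))          # "train -> evaluate"
best_x = xs[min(range(len(xs)), key=lambda i: ys[i])]
print("best log10(lr):", round(best_x, 2))
```

After only a dozen or so evaluations the loop homes in on the good region, which is exactly the advantage over blind manual or random search: each new trial is placed where the surrogate says it is most promising or most uncertain.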
It depends on how well the problem is understood. If the problem is your standard MNIST dataset, then sure, it could very well be a waste of time to sit around and serialize your manual hyperparameter search. But for any new dataset, which may or may not be clean, there's much to be learned from iterating on a very small subset of the data; at that small scale it's much easier to get a handle on the major failures, such as encoding the wrong things or weight explosion.
Sure, it does, though it's not trivial and tedious to implement yourself. You could use a Python library such as scikit-optimize, which has an implementation of parallel Bayesian optimization (based on Gaussian processes); have a look at this: https://scikit-optimize.github.io/notebooks/bayesian-optimiz...
Training deep learning models can be tough; they don't work without the right hyperparameters. This interactive blog explains the algorithms that can automate the hyperparameter search and includes all the code you need to try it out for yourself.
If you've been conducting manual quality checks at your manufacturing company, you've probably been overpaying for low productivity and poor-quality output. The link explains why AI-powered visual inspection is the future of manufacturing.