Hacker News | manneshiva's comments

This is a step-by-step tutorial on using tensorflow.js to run real-time object detection on your webcam feed in the browser. The model used is SSD with a MobileNet v2 backbone, trained on the COCO dataset.


How fast are the algorithms? Do any of them work in real time?


Certainly. For example, https://storage.googleapis.com/tfjs-models/demos/posenet/cam... will run in your browser. https://github.com/CMU-Perceptual-Computing-Lab/openpose is also a decent choice for real-time pose estimation, although I'm not sure if they have a web demo anywhere.


They should work in real time. At least the architecture looks like an end-to-end neural network, so just like other CNN-based models, this should work in real time after quantization, etc.


They showed an app like this running on a phone at I/O yesterday; it was laggy but real-time.


Yeah, nowadays with post-training quantization, quantization-aware training, and compact architectures like SqueezeNet, models are becoming small enough to run smoothly on phones.
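The storage win from quantization is easy to see with a toy sketch (my own illustration, not code from any of the projects mentioned): store int8 weights plus one float scale instead of float32. Real frameworks such as TFLite do this per-tensor or per-channel, often with calibration data.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of a float32 array to int8."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)
q, scale = quantize_int8(w)

print(q.nbytes, w.nbytes)                      # int8 takes 1/4 the bytes of float32
print(np.abs(dequantize(q, scale) - w).max())  # round-off error is at most scale/2
```

The 4x size reduction comes purely from the dtype change; the accuracy question is whether the round-off error above is small relative to the weights.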


Interesting, wonder what their use case was?


I can think of a couple of use cases, none of them particularly useful though. Imagine you had the pose data from the image and the pose data of someone else, even an imaginary someone else, thanks to GANs.


If it was at Google I/O, it was probably a consumer use case, although there are plenty of commercial uses for this technology. For example, you can train a simple classifier on top of the pose keypoints to record time spent on different types of activities, which is useful for measuring productivity.
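A minimal sketch of that classifier idea, with synthetic data and hypothetical activity labels (nothing here comes from an actual pose model; in practice the 17 keypoints would come from something like PoseNet):

```python
import numpy as np

rng = np.random.default_rng(42)
N_KEYPOINTS = 17

def fake_pose(center):
    """Synthetic 'pose': 17 (x, y) keypoints scattered around a center."""
    return (center + 0.1 * rng.standard_normal((N_KEYPOINTS, 2))).ravel()

# Two hypothetical activities, each with a characteristic mean pose.
X = np.array([fake_pose([0.3, 0.5]) for _ in range(50)] +
             [fake_pose([0.7, 0.5]) for _ in range(50)])
y = np.array([0] * 50 + [1] * 50)   # 0 = "sitting", 1 = "standing" (made-up labels)

# Nearest-centroid classifier: one mean pose vector per activity.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(pose):
    return int(np.argmin(np.linalg.norm(centroids - pose, axis=1)))

print(predict(fake_pose([0.3, 0.5])))  # → 0
```

Running the predictor over timestamped frames and summing the time per predicted label gives the activity log described above.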


Dance Dance Revolution, but without the sensor mat?

https://en.wikipedia.org/wiki/Dance_Dance_Revolution


Just a fun dance recognition app to show off ML.


A news agency in China uses an AI news anchor.

The speed at which China is adopting and deploying the latest deep learning technology in everyday life is truly astounding. There is no doubt it will be one of the top contenders for world leadership in AI over the next decade or so.


I have seen similar results using Gunnar Farnebäck's algorithm.


Highlights of this technique:

Simple: One-sentence method summary: use keypoint detection to find the bounding-box center point, then regress all other object properties (bounding-box size, 3D information, pose) from it.

Versatile: The same framework works for object detection, 3D bounding-box estimation, and multi-person pose estimation with minor modifications.

Fast: The whole process is a single feed-forward pass through the network; no NMS post-processing is needed. Our DLA-34 model runs at 52 FPS with 37.4 COCO AP.

Strong: Our best single model achieves 45.1 AP on COCO test-dev.

Easy to use: We provide a user-friendly testing API and webcam demos.
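The "no NMS" decoding step can be sketched in a few lines: any cell of the center-point heatmap that is a 3x3 local maximum counts as a detection, which replaces box NMS. This is a toy reconstruction of the idea, not the project's actual code:

```python
import numpy as np

def local_maxima(heat, threshold=0.3):
    """Return (row, col, score) for heatmap cells that are 3x3 local maxima."""
    H, W = heat.shape
    padded = np.pad(heat, 1, constant_values=-np.inf)
    # Max over the 3x3 neighborhood of every cell, via 9 shifted views.
    neigh = np.stack([padded[dr:dr + H, dc:dc + W]
                      for dr in range(3) for dc in range(3)]).max(axis=0)
    keep = (heat == neigh) & (heat >= threshold)
    rows, cols = np.nonzero(keep)
    return [(int(r), int(c), float(heat[r, c])) for r, c in zip(rows, cols)]

heat = np.zeros((8, 8))
heat[2, 3] = 0.9   # one object center
heat[2, 4] = 0.5   # suppressed: not a local max next to the 0.9 peak
heat[6, 6] = 0.7   # a second object center
print(local_maxima(heat))  # → [(2, 3, 0.9), (6, 6, 0.7)]
```

In the full method the surviving peaks index the regression heads (box size, offsets, etc.) at the same spatial location.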


Even Facebook uses Bayesian Optimization for tuning the parameters of some of its online systems (like its internal ranking system): https://research.fb.com/efficient-tuning-of-online-systems-u...


Manual search is time-consuming, since you need to wait for the results of each experiment. It becomes impractical once you have more than 8-10 hyperparameters, and you will probably end up tuning only the few you think are relevant. You also need a lot of experience tuning hyperparameters; otherwise your tuning is as good as random.

Given these disadvantages of manual tuning, Bayesian optimization seems like the most promising technique: it needs far fewer "choose->train->eval" loops because it uses information from previous runs to select the next set of hyperparameters (similar to what a human would do).
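A toy version of that loop, assuming a single 1-D hyperparameter and a cheap stand-in objective: a hand-rolled Gaussian-process surrogate picks the next point via a lower-confidence-bound rule. Libraries like scikit-optimize wrap exactly this loop for you.

```python
import numpy as np

def objective(x):
    # Stand-in for "train the model and return validation loss".
    return (x - 0.3) ** 2

def gp_posterior(X, y, Xs, length=0.15, noise=1e-6):
    """GP posterior mean/std on points Xs, with an RBF kernel."""
    k = lambda a, b: np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 0.0, None))

grid = np.linspace(0, 1, 200)
X = np.array([0.1, 0.9])            # two initial evaluations
y = objective(X)
for _ in range(10):                 # the "choose -> train -> eval" loop
    mu, sd = gp_posterior(X, y, grid)
    nxt = grid[np.argmin(mu - 1.5 * sd)]   # lower confidence bound
    X = np.append(X, nxt)
    y = np.append(y, objective(nxt))

print(float(X[np.argmin(y)]))       # best hyperparameter found, near 0.3
```

Each iteration uses all previous (x, loss) pairs to balance exploring uncertain regions against exploiting the current best, which is exactly why it needs fewer runs than blind search.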


It depends on how well the problem is understood. If the problem is your standard MNIST dataset, then sure, it could very well be a waste of time to sit around and serialize your manual hyperparameter search. For any new dataset, which may or may not be cleaned, there's much to be learned from iterating on a very small subset of the data; at that small scale it's much easier to get a handle on the major failings, such as encoding the wrong things or weight explosion.


Does it work in parallel though?


Sure it does, though it's not trivial and is tedious to implement yourself. You could use Python libraries such as scikit-optimize, which has an implementation of parallel Bayesian optimization (based on a Gaussian process); have a look at this: https://scikit-optimize.github.io/notebooks/bayesian-optimiz...
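The parallel pattern is an ask/tell loop: ask the optimizer for a batch of points, evaluate the batch concurrently, then tell it the results. The surrogate below is a deliberately dumb stand-in (random proposals plus sampling near the best point seen); scikit-optimize's Optimizer class follows the same ask/tell shape with a real Gaussian-process surrogate.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def objective(x):
    return (x - 0.3) ** 2   # stand-in for an expensive training run

class ToyOptimizer:
    """Toy ask/tell optimizer over one hyperparameter in [low, high]."""
    def __init__(self, low, high, seed=0):
        self.low, self.high = low, high
        self.rng = random.Random(seed)
        self.history = []            # (x, y) pairs already told

    def ask(self, n_points):
        """Propose a batch: half random exploration, half near the best x."""
        best = min(self.history, default=(None, None), key=lambda p: p[1])[0]
        pts = []
        for i in range(n_points):
            if best is None or i % 2 == 0:
                pts.append(self.rng.uniform(self.low, self.high))
            else:
                pts.append(min(self.high, max(self.low,
                               best + self.rng.gauss(0, 0.05))))
        return pts

    def tell(self, xs, ys):
        self.history.extend(zip(xs, ys))

opt = ToyOptimizer(0.0, 1.0)
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(5):                        # 5 rounds of 4 parallel evals
        xs = opt.ask(n_points=4)
        ys = list(pool.map(objective, xs))    # batch evaluated concurrently
        opt.tell(xs, ys)

best_x, best_y = min(opt.history, key=lambda p: p[1])
print(best_x, best_y)
```

The subtlety a real library handles for you is proposing a batch of points that are informative *jointly* (e.g. via "constant liar" tricks), rather than n copies of the same most-promising point.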


It just amazes me how the attention to detail in a video game could help reconstruct a medieval Catholic cathedral!


Training deep learning models can be tough: they don't work without the right hyperparameters. This interactive blog explains the algorithms that can automate the hyperparameter search and has all the code you need to try it out for yourself.


If you've been conducting manual quality checks at your manufacturing company, you've probably been overpaying for low productivity and poor-quality output. The link explains why AI-powered visual inspection is the future of manufacturing.

