rampantraccoon's comments

rampantraccoon · on Feb 22, 2024

A new lightweight network architecture, Generalized Efficient Layer Aggregation Network (GELAN), based on gradient path planning, is designed. GELAN's architecture confirms that PGI has gained superior results on lightweight models. We verified the proposed GELAN and PGI on MS COCO dataset based object detection. The results show that GELAN only uses conventional convolution operators to achieve better parameter utilization than the state-of-the-art methods developed based on depth-wise convolution.

rampantraccoon · on Feb 15, 2024

Paper: https://arxiv.org/abs/2401.17270

Code: https://github.com/AILAB-CVC/YOLO-World

rampantraccoon · on Jan 18, 2024

"Interestingly, even at this scale, we observe no sign of saturation in performance, suggesting that AIM potentially represents a new frontier for training large-scale vision models."

rampantraccoon · on April 13, 2023

The problem being solved is getting AI to distinguish unique objects within visual data. Before SAM, people had to label data and train a model to recognize specific objects. This becomes problematic given the variety of objects in the world, the settings they can be in, and their orientation in an image. SAM can identify objects it has never seen before, that is, objects that might not be part of the training data.

Once you can determine which pixels belong to which object automatically, you can start to utilize that knowledge for other applications.

If you have SAM showing you all objects, you can use other models to identify what each object is, understand its shape/size, understand depth/distance, etc. It's a foundational model to build on for any application that wants to use visual data as an input.
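To make the "use that knowledge downstream" part concrete, here is a minimal sketch of what you can do once every pixel is assigned to an object. It does not call SAM at all. It assumes a hypothetical per-pixel label grid (0 = background, any other integer = an object id) standing in for whatever a segmentation model gives you, and the function name `summarize_objects` is made up for illustration. It just computes each object's pixel area and bounding box, which is roughly the "shape/size" information mentioned above.

```python
def summarize_objects(label_grid):
    """Summarize each object in a per-pixel label grid.

    label_grid is an assumed input format: a list of rows, where each value
    is an object id and 0 means background. Returns a dict mapping
    object id -> {"area": pixel count, "bbox": (x_min, y_min, x_max, y_max)}.
    """
    stats = {}
    for y, row in enumerate(label_grid):
        for x, obj_id in enumerate(row):
            if obj_id == 0:
                continue  # skip background pixels
            if obj_id not in stats:
                # First pixel seen for this object: bbox is a single point.
                stats[obj_id] = {"area": 1, "bbox": (x, y, x, y)}
            else:
                entry = stats[obj_id]
                x0, y0, x1, y1 = entry["bbox"]
                entry["area"] += 1
                # Grow the bounding box to include this pixel.
                entry["bbox"] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return stats


# Toy 4x3 "image" containing two objects (ids 1 and 2).
grid = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 0, 2],
]
result = summarize_objects(grid)
for obj_id, info in sorted(result.items()):
    print(obj_id, info)
```

In a real pipeline you would pass each object's crop or mask on to a classifier, depth model, etc.; the point is only that per-pixel ownership is what makes those steps straightforward.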

DaiPlusPlus · on April 13, 2023

> SAM can identify objects it has never seen before

I'd love to see what SAM does when you send it a photo of rolling fog though, e.g. https://www.google.com/search?q=rolling+fog+scotland&tbm=isc... - what happens then? (And how can it meaningfully segment out fog?)

yeldarb · on April 13, 2023

Not sure if this is what you mean, but I grabbed some of those images & dropped them in to see what it predicted: https://imgur.com/a/CXLmYXo

idopmstuff · on April 13, 2023

It groups the fog as a single object (except where it's separated by things like hills).

You can see what it does - it's available to test at https://segment-anything.com/.

endisneigh · on April 13, 2023

Yes, what I am interested in are the other applications.

rampantraccoon · on March 10, 2023

Repo: https://github.com/roboflow/notebooks/blob/main/notebooks/sa...
