Hacker News | turmeric_root's comments

Hell, what about Skynet?


and what about my shredder?


exactly! I gave a heartfelt letter to my shredder the other day and it simply destroyed it. issues like these are why AI alignment research is so critical.


The model weights were only shared by FB to people who applied for research access. Github repos containing links to the model weights have been taken down by FB.


I like using them for memeing


More VRAM => larger models. IME it is absolutely worth maxing out VRAM for the significant improvement in quality, especially with LLaMA (though even with a 4090 you won't be able to fit the largest 65-billion-parameter model, even with 4-bit quantization).

That said, I recommend renting a cloud GPU for a few hours and trying the larger models on them before buying a GPU of your own, just to see if the models meet your requirements.
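As a back-of-envelope sketch of why the 65B model won't fit on a 24GB 4090 (weights only; activations, KV cache, and framework overhead add more on top, so real usage is higher):

```python
# Rough VRAM needed just to hold quantized model weights.
def weight_vram_gib(params_billion: float, bits_per_param: int) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 2**30  # GiB

for size in (7, 13, 30, 65):
    print(f"{size}B @ 4-bit: ~{weight_vram_gib(size, 4):.1f} GiB")
```

At 4 bits per parameter, the 65B weights alone come to roughly 30 GiB, which is why a 24GB card can't hold them while the 13B and 30B variants are workable.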


But it should fit easily on an Apple MBP or Studio with 96GB or 128GB of unified memory.


they're just microdosing it's ok


A lot of the 'look what I made with AI' images that get shared around also don't include the creator's workflow. There's usually lots of trial-and-error, manual painting/inpainting, multiple models involved etc. and explaining all that is a lot harder than just saying 'I used stable diffusion'.


ugh, that's so shitty. so many people in this space seem to be absurdly demanding and angry at devs, but one thing I've noticed is that every text AI project discord I've hung out in has this sleazy, obsessive 4chan /g/ vibe hiding somewhere in it.


> the "number B" stands for "number of billions" of parameters... trained on?

No, it's just the size of the network (i.e. the number of learnable parameters). Training data is measured separately, in tokens: per the LLaMA paper, the 33B and 65B models were each trained on ~1.4 trillion tokens, the 7B and 13B on ~1 trillion (a token is roughly three-quarters of a word).
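To make the parameters-vs-tokens distinction concrete, here's a quick arithmetic sketch (figures assumed from the LLaMA paper; the tokens-per-word ratio is a rough rule of thumb, not an exact constant):

```python
params = 65e9          # network size: learnable parameters (the "65B")
train_tokens = 1.4e12  # dataset size: tokens seen during training
tokens_per_word = 4/3  # rough rule of thumb (~0.75 words per token)

# Tokens seen per parameter -- two different quantities entirely.
ratio = train_tokens / params
print(f"training tokens per parameter: {ratio:.1f}")
print(f"approx. words of training text: {train_tokens / tokens_per_word:.2e}")
```

So the model saw ~20x more tokens than it has parameters; the "B" in the name only describes the network, not the data.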


'accuracy' and 'truth' are legacy 0.1X concepts, move fast and break things


yeah, when getting DL up and running on AMD requires a datacentre card, it's no wonder CUDA is more popular. AMD is enabling ROCm on consumer GPUs now, but it's still a pain to set up because of the inertia CUDA has.


