cocktailpeanut's comments

cocktailpeanut · on March 18, 2023

Hi guys, I really didn't think that it would move THIS fast, but it did, and we now have Alpaca.cpp.

As soon as I saw Alpaca.cpp on the front page, I started working on Dalai's JavaScript integration, and today I'm happy to release Dalai Alpaca.

Dalai is now kind of like a package manager: one command installs LLaMA or Alpaca, all integrated with the web UI and the ultra-hackable API. All you need to do is run one command: "npx dalai install alpaca 7b"

Also, been fixing a lot of bugs since last time so if you had trouble installing, I recommend you try again.

Officially supports Windows, Mac and Linux. Still some quirks here and there but mostly works and happy to figure out if you are having trouble. This time it's much simpler to use because it only takes up like 4.2GB. Appreciate any feedback!

cocktailpeanut · on March 13, 2023

Hi HN!

I posted Dalai on HN yesterday (https://news.ycombinator.com/item?id=35127020) and I have been spending the last 24 hours trying to listen to all the feedback and incorporate all the incoming pull requests and make a new release.

  1. If you had trouble installing and the models were just not installing for some reason, make sure you try this version.
  2. If you had trouble working with models other than the 7B one, yes, it was my fault: only the 7B one was working. This has now been fixed, and ALL models should work with this new version.
  3. Use existing workspace: If you already have an existing llama.cpp workspace folder, now Dalai can connect to it programmatically instead of creating its own repository under ~/dalai. Remember, while the easy installation is important, the most important aspect of Dalai is that it lets you interact with llama.cpp with JavaScript. So even if you already have an existing llama.cpp and just want to play with it using JS, you can use Dalai.
  4. New Web UI

I don't want to ramble on and on here so will link a thread where I discussed this in more detail, for those who are interested in learning more: https://twitter.com/cocktailpeanut/status/163539451761565286...

Finally, like 90% of the code for this release came from people other than myself and it's amazing how this happened. Thank you everyone who contributed.

There are some further issues I think need to be addressed. If you have any feedback that hasn't been addressed with this release, please let me know!

and if you want to follow along with the development, you can find me on Twitter https://twitter.com/cocktailpeanut

codpiece · on March 14, 2023

I tried it yesterday and it was great! Really simple to set up. Looking forward to updating and trying out the new changes. Thank you for this work!

jeffhwang · on March 13, 2023

Thank you for your work on this and appreciate the quick fixes!

cocktailpeanut · on March 12, 2023

Hey guys, I was so inspired by the llama.cpp project that I spent all day today to build a weekend side project.

Basically it lets you one-click install LLaMA on your machine with no bullshit. All you need to do is run "npx dalai llama".

I see that the #1 post today is a whole long blog post about how to walk through and compile cpp and download files and all that to finally run LLaMA on your machine, but basically I have 100% automated this with a simple NPM package/application.

On top of that, the whole thing is a single NPM package and was built with hackability in mind. With just one line of JS function call you can call LLaMA from YOUR app.

Lastly, EVEN IF you don't use JavaScript, Dalai exposes a socket.io API, so you can use whatever language you want to interact with Dalai programmatically.

I discussed a bit more about this on a Twitter thread. Check it out: https://twitter.com/cocktailpeanut/status/163504032247148953...

It should "just work". Have fun!

cocktailpeanut · on March 13, 2023

UPDATE:

Thanks for all the feedback! I went outside to take a walk after posting this and just came back, and went through them to summarize what needs to be improved.

Basically looks like it comes down to the following:

  - *customize features:* Should not be difficult (will add flag features)
    - *path:* customize the home directory (instead of automatically storing to $HOME)
    - *python:* some people are having issues with the python binary (since the package is essentially calling these shell commands). Maybe add a flag to specify the exact name of the python binary (such as "--python python3")
    - *avoid downloading files:* I have this issue too when I just want to install the code instead of downloading the full model which takes a long time. Might add a flag to avoid downloading models in case you already have them (EDIT: actually upon thinking about it, it's better to just set the source model folder, something like --model)
    - *other flags:* The rest of the flags natively supported by the llama.cpp project, such as top_k, top_p, temp, batch_size, threads, seed, n_predict, etc. (They are already in the code but were not exposed in the CLI and not documented)
    
  - *documentation*
    - document the machine spec
    - document the storage spec: how much space is used?
    - node version: which version of node.js is required?
    - python version: which version of python doesn't work?

Am I missing anything? Feel free to leave comments, will try to roll out some updates as soon as I can. To stay updated, feel free to follow me on twitter https://twitter.com/cocktailpeanut (or you could create issues on GitHub too!)

devmor · on March 13, 2023

I tried to run your NPX commands from the examples on a fresh WSL install of Ubuntu 20.04, but if you don't have build tools installed, they both just silently fail.

I only realized what was happening after trying to go the other route and use it in a package, where I then noticed the NPM install will give a node-gyp error about make missing.

pmarreck · on March 13, 2023

I'm on NixOS, where you have to explicitly state dependencies (which is a good thing, except when... this happens)

Besides make (which I can quickly set up a project environment to make available for), what other deps do you think it uses but doesn't declare or state? ;)

devmor · on March 13, 2023

The other one I noticed is pip! A lot of the script fails without pip, and it takes until after the fairly long downloads finish to let you know it was needed.

pmarreck · on March 13, 2023

so it needs make/gcc, python AND node available... what versions, I wonder?

devmor · on March 14, 2023

I successfully used the latest version of node LTS (via NVM) and the latest versions of python-pip3 and build essentials from the Canonical apt repo, if that helps.
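A pre-flight check along these lines would surface missing build tools before any long download starts. This is only a minimal sketch: the tool list is an assumption collected from the comments in this thread (git, make, gcc, python, pip, node), not an official requirements list, and `find_missing_tools` is a hypothetical helper name.

```python
import shutil

# Tools that commenters in this thread reported needing. This list is an
# assumption drawn from the discussion, not an official requirements list.
REQUIRED_TOOLS = ["git", "make", "gcc", "python3", "pip3", "node", "npx"]


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which`, which searches PATH the same way a shell would.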

yieldcrv · on March 13, 2023

I don’t understand why it’s downloading at all, that shouldn’t be default behavior.

It should have default instructions to load a file from a default place, and then arguments/flags to load from a specific path, and then MAYBE a prompt to download the models after it can’t find them on the paths, plural
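The lookup order suggested here could be sketched roughly as follows. The default directory and the function name are hypothetical, and nothing below reflects how Dalai actually resolves models; it only illustrates "check known paths first, consider downloading last".

```python
from pathlib import Path

# Hypothetical default location; the real install path may differ.
DEFAULT_MODEL_DIR = Path.home() / "llama.cpp" / "models"


def find_model(filename, extra_dirs=(), default_dir=DEFAULT_MODEL_DIR):
    """Return the first existing path to `filename`, or None if not found.

    Directories passed explicitly (for example via a flag) are checked
    first, then the default directory.
    """
    for directory in [*extra_dirs, default_dir]:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None
```

A caller would only offer to download the weights when `find_model` returns `None`. Checking explicit paths before the default one is a design choice in this sketch, so that a user-supplied location always wins.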

cocktailpeanut · on March 13, 2023

UPDATE 2:

Thanks to all the pull requests, we've managed to solve most of these issues in the most optimal manner.

Version 0.1.0 released: https://news.ycombinator.com/item?id=35143171

icosahedron · on March 13, 2023

I followed the initial instructions and the 7B model worked just fine.

I tried the supplementary instructions to download some of the models (7B, 13B, and 30B), and it didn't seem to work. The prompt returned nothing after waiting for several minutes.

Is there a way to run just one of the larger models?

cocktailpeanut · on March 13, 2023

I am going to test this out today and roll this out as soon as I can, hopefully tomorrow. stay tuned.

Datagenerator · on March 13, 2023

What's the minimum spec GPU required? NVIDIA only? Any differences between Debian and Fedora Linuxes? RAM required?

MacsHeadroom · on March 13, 2023

This app is CPU only and gets good speeds on even mobile phone CPUs. Minimum RAM required is 5GB.

sucram1 · on March 13, 2023

Oh wow, any way to do this on Android yet? That would be fun to tinker with, even if it's just the smaller model. Even my older Note 9 has 6GB.

MacsHeadroom · on March 15, 2023

Yes. Starting with the Facebook versions of LLaMA-7B you just quantize the model to 4bit on your desktop (since it takes 14GB of RAM) and then move it to your phone and follow the Android instructions in the repo. https://github.com/ggerganov/llama.cpp/#android

I've seen dozens of screenshots of it running in termux on androids by now at completely usable speeds.

sucram1 · on March 15, 2023

Thank you for the link! Insane that this can run on a phone.

As my current potato computer has 8GB of RAM, I'll ask a friend to do it :-)

mrfreed · on March 13, 2023

What distro and PC specs do you have success with?

garyfirestorm · on March 13, 2023

I ran this on my intel i7-7700k with 32 gig ram. It ran very slow. Almost 1 word per second slow. Not sure if I did something wrong. Distro Ubuntu 22.04

la64710 · on March 13, 2023

It would be great to also understand how one can finetune this model. Thanks for the awesome work!

khimaros · on March 13, 2023

you may be able to use pyenv to increase compatibility across Linux distributions

upghost · on March 13, 2023

My biggest concern about these LLMs was the corporate sequestration and the potential socioeconomic imbalances it would create. The work you are doing here is part of some amazing work to check that back. In summary: Bruhhhhhh. THANK YOU!

sebastianconcpt · on March 13, 2023

This is something to keep an eye on, really. The solution for making that sequestration impossible is twofold:

1. to know how to architect and create LLMs (including training data readiness)

2. have them produced in hardware that is acquirable at reasonable cost for a normal citizen

yawnxyz · on March 12, 2023

Wow that's so incredible. Thanks for putting this together!

Do you have any machine specs associated with this? Can an old-ish Macbook Pro run this service?

I'm also curious, since I'm new to all this — is it possible to run something like this on Fly.io or does it take up way too much space?

sp332 · on March 12, 2023

7B is the default. If it's quantized to 4 bits, that's a 3.9 GB file.
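As a rough sanity check on that figure, the raw weight storage can be estimated from parameter count and bit width. This is a lower-bound sketch only: it treats "7B" as exactly 7 billion parameters and ignores any per-block scale factors or metadata a real quantized file carries, which is presumably why the actual file is somewhat larger.

```python
def quantized_size_gb(n_params, bits_per_weight):
    """Lower-bound size in GB (10**9 bytes) for weights at a given bit width."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 4):
        size = quantized_size_gb(7e9, bits)
        print(f"7B at {bits} bits per weight: at least ~{size:.1f} GB")
```

Under these assumptions the 4-bit estimate comes out at about 3.5 GB, and the 16-bit estimate at about 14 GB.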

skykooler · on March 13, 2023

How powerful of a computer does this need? It would be useful to see, for one thing, minimum RAM requirements for these models.

spion · on March 13, 2023

llama.cpp needs 40GB for the 65B model (due to int4 quantization)

RamNeeded(other_size) ~= 40GB * other_size/65B
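The rule of thumb above, written out as a small function. It simply scales the quoted 40GB / 65B figure linearly, so treat the results as estimates; the function and constant names are made up for this sketch.

```python
REFERENCE_RAM_GB = 40.0    # RAM figure quoted above for the 65B model
REFERENCE_PARAMS_B = 65.0  # size of that model, in billions of parameters


def estimate_ram_gb(params_billions):
    """Linearly scale the 65B reference figure to another model size."""
    return REFERENCE_RAM_GB * params_billions / REFERENCE_PARAMS_B


if __name__ == "__main__":
    for size in (7, 13, 30, 65):
        print(f"{size}B: ~{estimate_ram_gb(size):.1f} GB")
```

With these numbers the 7B model comes out at roughly 4.3 GB and the 13B model at 8 GB.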

EZ-Cheeze · on March 12, 2023

Add something like this to your instructions: "Make sure you have Node.js installed on your computer."

AlecSchueler · on March 13, 2023

One step install after the steps that lead up to it.

turbocon · on March 13, 2023

Yea not a nodejs/javascript dev at all but this is failing to install on Fedora. I don't have time to dig into it at the moment but if anybody has any well known gotchas that could be the issue that would be helpful :)

Edit: I do have nodejs and npx installed

vorticalbox · on March 13, 2023

Maybe make, python and pip. From what I gather this is a node wrapper; it's actually python that runs the model.

sieste · on March 12, 2023

Does anyone know how to avoid downloading the model weights when doing `npx dalai llama`, and instead telling the install process where they are on my drive?

gregsadetsky · on March 13, 2023

you could clone the repo and comment out https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... i.e. the specific synchronous download call..?

jacooper · on March 13, 2023

Does this use the GPU? If not why? Aren't GPUs much faster than CPUs at AI?

londons_explore · on March 13, 2023

It is usable without a GPU... it'll output data a bit faster than most people type.

boredemployee · on March 13, 2023

I think that's exactly the point so everyone can run it on their PCs with no GPU.

lolinder · on March 13, 2023

Or without a beefy GPU. I've got 8GB VRAM, which is great for Stable Diffusion but not useful for any of the language models released so far.

I think the 4-bit 7B LLaMA would work, but the 7B is pretty fast anyway without GPU.

boredemployee · on March 13, 2023

I'm installing it here. How's the 7B model going so far?

lolinder · on March 13, 2023

Haha, I just finished ordering 32GB of additional memory for my PC so I can run the 65B model, if that tells you anything. I'm upgrading from 32GB -> 64GB.

7B is fine, 13B is better. Both are fun toys and almost make sense most of the time, but even with a lot of parameter tuning they're often incoherent. You can tell that they have encoded fewer relationships between concepts than the higher-parameter models we've gotten used to--it's much closer to GPT-2 than GPT-3.

They're good enough to whet my appetite and give me a lot of ideas of what I want to do, they're just not quite good enough to make those applications reliably useful. Based on the reports I'm hearing here of just how much better the 65B model is than the 7B, I decided it was worth $80 for a few new sticks of RAM to be able to use the full model. Still way cheaper than buying a graphics card capable of handling it.

Semaphor · on March 13, 2023

Heh, you just made me upgrade as well. After originally paying 130 € for 32 GB, it’s nice that I only had to pay 70 € to double it ;) Not sure if I want to run LLMs (or if my Ryzen 5 3600 is even powerful enough), but I’ve wanted some more RAM for a while.

iambateman · on March 13, 2023

If I was running in a server context, would the 50gb of ram be required to respond to one request, or can it be used to respond to multiple requests simultaneously?

lolinder · on March 17, 2023

I'm very late to this question, but I believe that that amount is only required once, but the context tensor will need to be created per request. I haven't confirmed that, though.

boredemployee · on March 13, 2023

I'd assume that all the calculations used for 1 request would already eat up that amount of memory, but I could be wrong!

radicalbyte · on March 13, 2023

I'm still holding on to a small bit of hope that the GPU market will normalise this year. Don't think that I'm the only one looking to get something highly capable but for a fair price.

dragonwriter · on March 13, 2023

> I’m still holding on to a small bit of hope that the GPU market will normalize this year.

I suspect all the people hoping it will (b/c of Stable Diffusion, etc.) are exactly the reason it won’t.

boredemployee · on March 13, 2023

Me too. But for 3rd world countries it's mad priced.

radicalbyte · on March 15, 2023

It's expensive for first-world countries too. Just look at the 4090 - it's insane that it costs 2k EUR... it's literally double the fair price (which itself is high).

teruakohatu · on March 12, 2023

Very nice. Any way to add an option to install elsewhere other than ~/ ?

pmarreck · on March 13, 2023

I ran "npx dalai llama" and it's just... sitting there (after I hit "y" to confirm). I checked btop++ and there's barely any downloading or CPU activity occurring, so not sure what it's doing... but does "pip3 install torch torchvision torchaudio sentencepiece numpy" take a while?

If it's actually downloading the 3.9GB of model weights or whatever, it would be pretty cool if it showed a progress bar of some sort. Stretch goal, for sure, but a very nice nicety for users.

anyway, I'll leave it be and check on it to see when it's complete. Super cool if this works!!

m3kw9 · on March 13, 2023

Made a comment on the other thread: why can’t we have a one click install thing and here it is. Nice!

anigbrowl · on March 13, 2023

Well that's pretty wild. I was wondering whether I wanted to build LLaMA tomorrow but you upended my plans in the space of 2 minutes. 10/10 well done.

Tepix · on March 13, 2023

There's an elephant in the room, or is it just me?

Is your script making users violate the original license agreement(§)?

For the record, i don't think Meta will go after you or anyone else. But they may decide not to make their future models available after what is happening with the Llama weights.

I realize that some people are of the opinion that AI models (weights) cannot be copyrighted at all.

--

§ the license agreement is at https://forms.gle/jk851eBVbX1m5TAv5

Tiberium · on March 13, 2023

Yes, you are right, every project that distributes LLaMA right now is violating Meta's agreement.

pksebben · on March 14, 2023

I've got a weird, probably untrue conspiracy theory about this.

Hugging face releases stable diffusion. It goes viral and vastly outpaces the competition in the blink of an eye. Then they get sued.

Meta sees both of these things go down. Meta needs a leg up on ChatGPT, but worries about legal repercussions similar to stable diffusion.

Whoops, it leaked! Hey, we didn't say those dastardly devs could use it.

sebzim4500 · on March 13, 2023

>But they may decide not to make their future models available after what is happening with the Llama weights.

I think that ship has probably sailed, in that no one is going to release weights in this way again. Either they will publish them outright (like Whisper) or they will keep them (almost) completely closed.

holtkam2 · on March 12, 2023

This is awesome! I've wanted to try llama.cpp and you just reduced my to-do list significantly on my Sunday :) Thanks!

GordonS · on March 12, 2023

Looks great! Does it work on Windows please?

buzzier · on March 12, 2023

For Windows:

1. Binary build https://github.com/jaykrell/llama.cpp/releases/tag/1

2. Quantized model (7B/13B/30B) https://mega.nz/folder/UjAUES6Z#bGhKkyiZX3eRrn9HcxVVfA

3. main.exe -m ggml-model-q4_0.bin -t 8 -n 128

placebo · on March 13, 2023

Thanks. Initial test:

main.exe -m ggml-model-q4_0.bin -t 8 -n 128 -p "The Drake equation is nonsense because"

The Drake equation is nonsense because it takes parameters that can only be known AFTER the conclusion is reached. It would be like saying "I'm going to prove a theorem by starting from the conclusion, then making up the proof. The Drake equation uses the existence of extraterrestrial intelligence as the conclusion and then making up the parameters. It is nonsense.

GordonS · on March 13, 2023

Nice, main.exe seems to work just fine with the 7B quantized model - generates a token every 400ms on an AMD Ryzen 5 2600!

But, quantize.exe doesn't seem to work - any valid command (such as below) pauses for a split second, then returns with no output?

$ quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2

GordonS · on March 14, 2023

In case this helps anyone else: I built it myself on Windows with CMake, and then everything just works.

starik36 · on March 15, 2023

Do you mind sharing the binaries?

GordonS · on March 16, 2023

Sure! https://filetransfer.io/data-package/8hxKAiaH#link

I wasn't sure where to upload them, and that link is only good for 50 downloads. Can put them somewhere else if you know a better location that doesn't require signup.

starik36 · on March 16, 2023

Thank you.

llama.exe is basically main.exe?

I actually learned how to compile this code via CMake/VS2019. It's sure a whole lot more complicated than it was 25 years ago when I was writing C.

GordonS · on March 17, 2023

Yes, llama.exe is actually the name the project produces - the other poster must have renamed it to main.exe.

I just did `scoop install cmake`, then built from the command line, was a doddle!

tough · on March 13, 2023

I actually am installing in windows via WSL/Ubuntu fwiw

bsenftner · on March 13, 2023

My attempt does not work, and now I'm trying to figure out where the 35+ GB of data and files that were added to my hard drive are located so I can clean it all off.

tough · on March 13, 2023

I got it to work with WSL/Ubuntu in case you want to try it that way.

dragonwriter · on March 13, 2023

If it makes common unix-ish assumptions like “Python 3 executables have a ‘3’ appended to their name”, which other comments here seem to suggest it does, it won’t, even if you have the required version of python installed.
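One way around a hard-coded interpreter name is to try a few candidates in order, similar in spirit to the "--python" flag idea mentioned earlier in the thread. This is a sketch only: the candidate names are assumptions, and `resolve_python` is a hypothetical helper, not something Dalai provides.

```python
import shutil

# Candidate names are assumptions: "python3" is common on Unix-like systems,
# while Windows installs often expose "python" or the "py" launcher.
CANDIDATES = ("python3", "python", "py")


def resolve_python(candidates=CANDIDATES, which=shutil.which):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(resolve_python() or "No Python interpreter found on PATH")
```

Note that this only checks that an executable with that name exists; it does not verify which Python version it is.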

GordonS · on March 14, 2023

So, I actually got it working on Windows, pretty easily!

The provided `main.exe` binary worked as-is, but `quantize.exe` did not - I built myself with CMake, and `quantize.exe` started working too.

volaski · on March 12, 2023

Curious too. Let me know if you try it out. Technically I think it should work.

starik36 · on March 13, 2023

I tried it, doesn't work. Trying the sibling post from @buzzier.

vid · on March 13, 2023

You, sir or madam, are a hero.

evo_9 · on March 14, 2023

When I run this command: npx dalai llama

I get the following output / errors?

What exactly do I need to install prior to running that command?

---------------------------- >> npx dalai llama

exec: git clone https://github.com/ggerganov/llama.cpp.git /Users/rickg/llama.cpp in undefined
git clone https://github.com/ggerganov/llama.cpp.git /Users/rickg/llama.cpp
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ git clone https://github.com/ggerganov/llama.cpp.git /Users/rickg/llama.cpp
fatal: destination path '/Users/rickg/llama.cpp' already exists and is not an empty directory.
bash-3.2$ exit
exit
exec: git pull in /Users/rickg/llama.cpp
git pull
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ git pull
Already up to date.
bash-3.2$ exit
exit
exec: python3 -m venv /Users/rickg/llama.cpp/venv in undefined
python3 -m venv /Users/rickg/llama.cpp/venv
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ python3 -m venv /Users/rickg/llama.cpp/venv
bash-3.2$ exit
exit
exec: /Users/rickg/llama.cpp/venv/bin/pip install torch torchvision torchaudio sentencepiece numpy in undefined
/Users/rickg/llama.cpp/venv/bin/pip install torch torchvision torchaudio sentencepiece numpy
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ /Users/rickg/llama.cpp/venv/bin/pip install torch torchvision torchaudio sentencepiece numpy
Requirement already satisfied: torch in ./llama.cpp/venv/lib/python3.10/site-packages (1.13.1)
Requirement already satisfied: torchvision in ./llama.cpp/venv/lib/python3.10/site-packages (0.14.1)
Requirement already satisfied: torchaudio in ./llama.cpp/venv/lib/python3.10/site-packages (0.13.1)
Requirement already satisfied: sentencepiece in ./llama.cpp/venv/lib/python3.10/site-packages (0.1.97)
Requirement already satisfied: numpy in ./llama.cpp/venv/lib/python3.10/site-packages (1.24.2)
Requirement already satisfied: typing-extensions in ./llama.cpp/venv/lib/python3.10/site-packages (from torch) (4.5.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in ./llama.cpp/venv/lib/python3.10/site-packages (from torchvision) (9.4.0)
Requirement already satisfied: requests in ./llama.cpp/venv/lib/python3.10/site-packages (from torchvision) (2.28.2)
Requirement already satisfied: charset-normalizer<4,>=2 in ./llama.cpp/venv/lib/python3.10/site-packages (from requests->torchvision) (3.1.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./llama.cpp/venv/lib/python3.10/site-packages (from requests->torchvision) (1.26.15)
Requirement already satisfied: idna<4,>=2.5 in ./llama.cpp/venv/lib/python3.10/site-packages (from requests->torchvision) (3.4)
Requirement already satisfied: certifi>=2017.4.17 in ./llama.cpp/venv/lib/python3.10/site-packages (from requests->torchvision) (2022.12.7)

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: python3 -m pip install --upgrade pip
bash-3.2$ exit
exit
exec: make in /Users/rickg/llama.cpp
make
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ make
I llama.cpp build info:
I UNAME_S: Darwin
I UNAME_P: arm
I UNAME_M: arm64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS: -framework Accelerate
I CC: Apple clang version 12.0.5 (clang-1205.0.22.9)
I CXX: Apple clang version 12.0.5 (clang-1205.0.22.9)

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o
ggml.c:1364:25: error: implicit declaration of function 'vdotq_s32' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
^
ggml.c:1364:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1365:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1ls, v1_1ls);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1367:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
p_0 = vdotq_s32(p_0, v0_0hs, v1_0hs);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1368:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
p_1 = vdotq_s32(p_1, v0_1hs, v1_1hs);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 errors generated.
make: *** [ggml.o] Error 1
bash-3.2$ exit
exit
/Users/rickg/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153
throw new Error("running 'make' failed")
^

Error: running 'make' failed
at Dalai.install (/Users/rickg/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153:13)

jamesjyu · on March 14, 2023

seeing this too. did you find a solution?

jamesjyu · on March 14, 2023

updating xcode did the trick

evo_9 · on March 14, 2023

Where does it say I need Xcode installed?

Is there a list of prerequisites?

Hey thanks, after installing Xcode, that did resolve the issue.

cocktailpeanut · on March 12, 2023

It's free. There's extremely cheap, and there's free. No matter how extremely cheap something is, "free" is on a completely different level and gives us a new assumption that enables a lot of things that are not possible when each request is paid (no matter how cheap it is).

Tepix · on March 13, 2023

You do have to pay for electricity which can be significant when you have multiple GPUs
