idiliv · on June 3, 2026

Uber is likely on an enterprise plan - these charge tokens at API cost, which can be much more expensive than the $20 flat rate.

idiliv · on Jan 19, 2026

Sometimes model developers coordinate with inference platforms to time releases in sync.

idiliv · on Feb 26, 2025

Wait, but we're doing that already, and it works well (Qwen 2.5 VL)? If need be, you can always resort to structured generation to enforce schema conformity?

idiliv · on Oct 19, 2024

Duplicate, posted on October 9: https://news.ycombinator.com/item?id=41784591

idiliv · on Sept 25, 2024

Where do you see the MMLU-Pro evaluation for Llama 3.2 90B? On the link I only see Llama 3.2 90B evaluated against multimodal benchmarks.

wesleyyue · on Sept 25, 2024

Ah you're right I totally misread that!

idiliv · on Sept 23, 2024

Is the "Ultra Deep" analysis worth it over the standard "Deep" analysis?

mfld · on Sept 24, 2024

If you are interested in health issues, probably yes. For hobbyists the standard coverage will be enough.

idiliv · on Sept 12, 2024

In the demo, O1 implements an incorrect version of the "squirrel finder" game?

The instructions state that the squirrel icon should spawn after three seconds, yet it spawns immediately in the first game (also noted by the guy doing the demo).

Edit: I'm referring to the demo video here: https://openai.com/index/introducing-openai-o1-preview/

Bjorkbat · on Sept 12, 2024

Yeah, now that you mention it I also see that. It was clearly meant to spawn after 3 seconds. Seems on successive attempts it also doesn't quite wait 3 seconds.

I'm kind of curious if they did a little bit of editing on that one. Almost seems like the time it takes for the squirrel to spawn is random.

idiliv · on June 14, 2024

How are flexible working hours equivalent to more money?

dkdbejwi383 · on June 14, 2024

If you don't have to pay for child care because you can just take off the time to pick up your kids from school, you are saving money your job otherwise forced you to spend.

Being forced to commute at peak time costs more - on a retail salary the difference between a peak and off-peak ticket can mean that the first hour or two of working is essentially pointless, as you're just paying back the cost of getting there in the first place.

Having to pay an extra surcharge to visit the dentist, because you can only go on a Saturday because you need to be at work other days. Flexible working would allow you to just take the Tuesday off no problem and go when it's cheaper.

I'm sure there are lots of other examples that apply to different lifestyles.

fragmede · on June 14, 2024

hours with your child aren't fungible. you can't pay the babysitter to go see the dance recital for you if you want to be the parent instead of the babysitter being the parent. all the money in the world isn't going to make up for missing the soccer game where your kid makes the winning goal.

jpc0 · on June 14, 2024

Well to be pedantic, with all the money in the world you wouldn't be working for Ikea and the problem wouldn't exist so really that's a problem also solved by money...

Generally, though, higher-paid employees tend to have more sway within a company structure and likely don't need to miss these important events. The win here is that something that was generally true from middle management up at most companies now extends down through all the ranks.

akira2501 · on June 14, 2024

You can manage things in your life when they occur instead of spending money to displace them or risk losing your job because of them.

Put another way, if you present a worker with the option between two jobs with the same hourly rate, one having flexible working hours and the other not, which would you expect to be the more likely choice? You can then measure the value of this choice by changing the hourly rates between the two until you see changes in outcome, and you would be able to estimate how much "more money" it appears to be "worth."

idiliv · on June 13, 2024

You can rent them online for ~$4-5 per hour per GPU. Not cheap, but definitely feasible as a weekend project.

_zoltan_ · on June 13, 2024

where can I rent an H100 for 4-5 dollars an hour?

AWS doesn't let you use p5 instances (not getting a quota as a private person), lambda cloud is sold out.

lhl · on June 13, 2024

It looks like Runpod currently (checked right now) has "Low" availability of 8x MI300 SXM (8x$4.89/h), H100 NVL (8x$4.39/h), and H100 (8x$4.69/h) nodes for anyone w/ some time to kill that wants to give the shootout a try.
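For a rough sense of what a weekend on one of these nodes would cost, here is a minimal sketch using the per-GPU prices quoted above. The 48-hour duration and the `node_cost` helper are assumptions for illustration only; actual billing, availability, and storage or egress fees are not modeled.

```python
def node_cost(price_per_gpu_hour: float, gpus: int = 8, hours: float = 48.0) -> float:
    """Total rental cost in dollars for one node over the given duration."""
    return price_per_gpu_hour * gpus * hours


# Per-GPU hourly prices as quoted in the comment above.
quoted_prices = {
    "MI300 SXM": 4.89,
    "H100 NVL": 4.39,
    "H100": 4.69,
}

# Assumed scenario: one 8-GPU node rented for a 48-hour weekend.
weekend_costs = {name: node_cost(price) for name, price in quoted_prices.items()}

for name, cost in weekend_costs.items():
    print(f"{name}: ${cost:,.2f}")
```

Under those assumptions each node lands somewhere between roughly $1,700 and $1,900 for the weekend.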

darrick_horton · on June 13, 2024

We'd be happy to provide access to MI300X at TensorWave so you can validate our results! Just shoot us an email or fill out the form on our website

Jlagreen · on June 17, 2024

If you're able to advertise available GPU compute in some public forums then it's enough to tell us about the demand of MI300X in cloud ...

lhl · on June 18, 2024

You're joking/trolling, right? There are literally tens of thousands of H100s available on gpulist right now, does that mean there's no cloud demand for Nvidia GPUs? (I notice from your comment history that you seem to be some sort of bizarre NVDA stan account, but come on, be serious)

idiliv · on April 10, 2024

In Mixtral 8x7B, the 8 means that the model uses Mixture-of-Experts (MoE) layers with 8 experts. The 7B means that if you were to remove 7 of the 8 experts in each layer, then you would end up with a 7B model (which would have exactly the same architecture as Mistral 7B). Therefore, a 1x7B model has 7B params. An 8x7B model has 1 * 7B + (8-1) * sz_expert params, where sz_expert is some constant value that the MoE layers increase by when adding one expert. In the case of Mixtral 8x7B the model size is 46.3GB, so, sz_expert ≈ 5.6B.

If these assumptions port over to 8x22B, then 8x22B has, at 281GB, sz_expert ≈ 13.8B.

KTibow · on April 10, 2024

I tried to check this for myself.

I agreed for the first one, (46.3 - 7) / 7 = 5.61b.

The second one doesn't match up, (281 - 22) / 7 = 37b or (140.5 - 22) / 7 = 16.92b. Am I doing something wrong?

idiliv · on April 10, 2024

Just tried this again and I also arrive at 16.92B. Not sure what I did wrong the first time, thanks for double-checking this!

idiliv · on April 10, 2024

Oh, and to answer your actual question: Assuming that the model is released with 16 bits per parameter, it has 281GB / 16 bits = 140.5B parameters.
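The arithmetic in this subthread can be re-checked with a small sketch. It follows the thread's own simplification (total = base + (experts - 1) * sz_expert) and its figures (46.3, 281GB, 16 bits per parameter); the function names are hypothetical, and the numbers are the commenters' estimates, not official parameter counts.

```python
def params_from_checkpoint_gb(size_gb: float, bits_per_param: int = 16) -> float:
    """Parameter count in billions, given checkpoint size in GB."""
    return size_gb * 8 / bits_per_param


def expert_size(total_params_b: float, base_params_b: float, num_experts: int = 8) -> float:
    """Solve total = base + (num_experts - 1) * sz_expert for sz_expert (in billions)."""
    return (total_params_b - base_params_b) / (num_experts - 1)


# Mixtral 8x7B: the thread treats 46.3 as the total in billions of parameters.
sz_expert_8x7b = expert_size(46.3, 7.0)

# 8x22B: convert the 281GB checkpoint to parameters first, assuming 16-bit weights.
total_8x22b = params_from_checkpoint_gb(281.0)
sz_expert_8x22b = expert_size(total_8x22b, 22.0)

print(f"8x7B  sz_expert ~ {sz_expert_8x7b:.2f}B")
print(f"8x22B total     ~ {total_8x22b:.1f}B")
print(f"8x22B sz_expert ~ {sz_expert_8x22b:.2f}B")
```

The key step is converting gigabytes to parameters before subtracting the base model; skipping that conversion is what produces the 37B figure.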
