Hacker News | ur-whale's comments

Why is it that no one ever talks about the one thing no one can get their hands on except the big labs?

I'm talking about the training set.

Sure there are some open sets out there.

But my guess is they are nowhere near what OpenAI, Google and Anthropic are actually using.

Happy to be proven wrong.


I think OpenAI and Anthropic just downloaded the same torrents from Anna's Archive that anyone else can. But it's only OK when they do it. The rest of us get nastygrams from law offices. Anthropic actually had to cough up some bucks, for that matter.

At that point, a lot depends on the quality of the preprocessing applied to the raw text dumps. It is reportedly not that trivial to go from DumpOfSketchyRussianPirateSite.zip to a data set suitable for ingestion during pretraining. A few bad chunks of data can apparently do more harm than one would expect.

AFAIK Google scans almost everything in print as part of the Google Books initiative, so they may have been able to skip the torrenting step.


> I think OpenAI and Anthropic just downloaded the same torrents from Anna's Archive

I think you're underestimating how much chat conversation data they've gathered at this point, and how much of it is part of the training set.

None of that is available to anyone who wants to train a frontier model.

And when it comes to Google ... the hoard of data they're sitting on goes back to what? 1998? They've basically got a digital record of what happened since the birth of the internet.


> Oh yes that's what I mean. I don't like calling it the Netherlands.

it was tongue-in-cheek dude.

literal people are a hoot.


True everyone loves hooters :3

> Such an institution would never buy and sell to trade the market

This is not what they're doing.

They're just re-asserting their sovereignty over their property, a smart move in the current geopolitical climate.

I'm actually surprised the utter dumbass they have at the helm over there managed to cook up such a smart move.


> Germany also needs to pull all gold. We have 1236t there.

They had better act fast, before an executive order prevents that from ever happening.


US also has gold reserves and investments in Germany. They can be seized.

The real question is who has more invested in the opposing country

> This is not gain at all. At least in theory: You own some tons of gold at the start of the process, you have the same tons of gold at the end of the process.

Correct. A better way to put it is you shorted the USD. Which is a smart move at any rate. So a gain indeed.


> De Gaulle initiated a systematic, aggressive policy where they converted USD into physical gold

The dude was a visionary for many things, but I didn't know about this. Borderline prescient. What a guy.


Just like the majority of classical economists and policymakers, you would have called him a blithering idiot and an overzealous nationalist two decades ago. It was thought that this kind of behavior caused world wars. I mean, it did cause them. It's just that we're speedrunning the next one, and that changed the narrative.

This behavior is economically inefficient; that's why it's criticized. And it's undeniable.

But the point is that "economic efficiency" is not the only metric that matters: stability and power do not come cheap.


I think many academics are often specialized in one area of their expertise and overfit in that dimension. Journalists pick this up and promote those views a bit too much. This results in non-optimal decisions due to skewed public perceptions.

We need to promote holistic thinking that considers multiple dimensions, not just the one in which a given academic is proficient.


> many academics are often specialized in one area of their expertise and overfit in that dimension

An economist saying a national-security measure costs this much is fine. Where it goes off the rails is in turning costs into damnation without accounting for what one gets in return. In an attention-driven media environment, that sells.


The problem is that there isn't simply an efficient solution for everything. At some point every problem has solutions with pros and cons.

France could do it as it is a rich and big country but smaller countries do not have a viable choice. This reasoning could have been applied to France too in another universe.

It's a balance impossible to totally tilt one way or another.

So no amount of extra information could help when it's a matter of opinion at the end of the day.


De Gaulle was not an overzealous nationalist. He actually ended French colonialism and started an alliance with Germany.

He was a patriot and very pragmatic. He knew France had been diminished. He had no time for delusional ideas.


A carry permit to operate a compiler is in our near future.

Richard Stallman's "Right to Read" is worth reading again, because it portrays a very similar scenario.

Never forget what they did to encryption

> FreeCAD is amazing these days.

FreeCAD has become much better, no denying it.

"Amazing" is, however, not the word I would use: the UI is still very convoluted and very hard to learn.

The worst part of FreeCAD, which remains true to this day, is the load of minutiae you need to know to handle or avoid the weird corner cases that you inevitably run into when you start building complex models, and where FreeCAD stubbornly refuses to let you carry on with your work.

When you paint yourself into one of these corners, the software is hugely unhelpful when it comes to understanding what you did wrong and how to correct it.

In short, the word "Amazing" only works if you compare it to the absolute abomination the UI was a few years back.

But compare FreeCAD today to, for example, how slick Fusion is, and there is still a very, very wide gap.

Finally, the geometry engine is a somewhat old and creaky thing that sometimes downright fails to compute fillets or surface/surface intersections correctly, so yeah, YMMV.

FreeCAD is, however, free software, and not controlled by one of the worst corp. in the world of software: Autodesk. So huge thumbs up there.


This is really accurate to my experience learning FreeCAD earlier this year. I am a former professional CAD user (of a lesser software than AutoCAD) and I don't think I would have gotten far without being able to ask ChatGPT for help understanding some of the quirks of FreeCAD.

For free and open it's truly impressive though. Actually I think my time building iOS UIs in Storyboard was at least as useful as previous CAD experience, since constraints are the foundation of (at least one approach to) designing parts.


The last Autodesk software I've used was AutoCAD 2000 (released in 1999). And I've not followed them since.

Perhaps they have indeed become "one of the worst corp. in the world of software", but in the early years they were very interesting. The founder of Autodesk, John Walker (he died in 2024), wrote/edited an interesting book on the early years: "The Autodesk File" https://fourmilab.ch/autofile/


Yeah, and then ran away to Switzerland rather than work to preserve the democracy which made his fortune possible.

By the way this creation of his, from July 1990: https://www.fourmilab.ch/evilempire/ is very relevant here, but we are getting off-topic :-)

SpeakFreely was his as well: a very early encrypted VoIP app.

And this: https://www.fourmilab.ch/hackdiet/ ... As I said - an interesting person :-)


That is more of an indictment of the US than it is of Mr Walker. Maybe I should run away to Switzerland, too.

Is this critique or praise of his character? ;-)

Statement of fact with my interpretation --- folks should verify the fact and read what he has written and come to their own conclusions.

While I'm grateful Autodesk stepped in and kept TinkerCAD afloat, I'm relieved Sketchbook escaped their clutches, and am glad I never got involved in Fusion 360 so as to suffer from their on-going "rug pulls" --- which of these are a result of his influence, I've not found a need to discern.


The word "amazing" fits perfectly if you compare FreeCAD to viable alternatives, of which there are none.

I nominate Adobe for the worst corp. in the world of software.

Fusion360 at least works on Linux

Photoshop/Lightroom don't.


> I think OpenSCAD is currently the best and most feature complete choice

As much as I love OpenSCAD, I would strongly disagree with your conclusion.

All the OpenSCAD language can do is boolean operations, and moreover, the engine can only implement those on polygonal (triangle, actually) meshes.

That's a very far cry from what a modern commercial CAD engine can do.

For example, the following things are very, very hard to do, or even to specify, using OpenSCAD:

   - Smooth surfaces, especially spline-based

   - Fillets / Chamfers between two arbitrary surfaces

   - Trimming surfaces

   - Querying partly built models and using the outcome in the subsequent construction (e.g. find the shortest segment between two smooth surfaces, build a cylinder around it and fillet it with the two surfaces; this is an effing nightmare to do within the confines of OpenSCAD)

   - Last but not least: there is no native constraint solver in OpenSCAD, neither in the language nor in the engine (unlike, say, SolveSpace)

I might have misunderstood what you're looking to do, but, yeah, digging deeper feels very much like the right thing to do.

(my) fncad doesn't have the querying, but it does have smooth csg! https://fncad.github.io/
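
For anyone unfamiliar with the term: "smooth CSG" usually refers to blending shapes described by signed distance functions instead of taking a hard boolean union. The Python sketch below shows one common way to do that, a polynomial smooth minimum. It is only an illustration of the general idea, not fncad's actual code, and every name in it (`sphere_sdf`, `hard_union`, `smooth_union`, the blend width `k`) is made up for the example.

```python
import math

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center))) - radius

def hard_union(d1, d2):
    """Classic boolean union of two distance fields: keep the nearer surface."""
    return min(d1, d2)

def smooth_union(d1, d2, k):
    """Polynomial smooth minimum; k (> 0) is the width of the blend region."""
    h = max(k - abs(d1 - d2), 0.0) / k
    return min(d1, d2) - h * h * k * 0.25

# Two unit spheres touching at the origin (hypothetical test scene).
p = (0.0, 0.0, 0.0)
d1 = sphere_sdf(p, (-1.0, 0.0, 0.0), 1.0)
d2 = sphere_sdf(p, (1.0, 0.0, 0.0), 1.0)

# The hard union puts the origin exactly on the surface; the smooth union
# reports it as inside, because the blend fills in the crease between the spheres.
print(hard_union(d1, d2), smooth_union(d1, d2, 1.0))
```

Far from the blend region (when the two distances differ by more than `k`) the smooth version falls back to the plain minimum, so only the seam between shapes changes.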

using BOSL2 alleviates most issues I've run into with OpenSCAD for chamfers and the like, but sadly it is an extra set of functions you need to remember

https://github.com/BelfrySCAD/BOSL2


> BOSL2 ... but it is an extra set of functions you need to remember sadly

It's also extremely slow: it implements chamfers and fillets using morpho, and if you have a large number of fillets, the morpho algorithms (minkowski / hull) are very much nonlinear in time on polygonal meshes, which leads to an explosion in compute time if you want a visually smooth result.


you can get around this somewhat by having less visually smooth previews when editing and higher quality when you want an STL:

  $fn = $preview ? 32 : 256;
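
To get a rough feel for how much work that switch saves, here is a back-of-the-envelope Python sketch. It assumes a sphere tessellated UV-style with `fn` segments around and `fn // 2` rings, which only approximates what OpenSCAD actually generates; the function name `uv_sphere_quads` is made up for the example.

```python
def uv_sphere_quads(fn: int) -> int:
    """Approximate facet count of a UV sphere: fn segments around, fn // 2 rings."""
    return fn * (fn // 2)

preview_fn, final_fn = 32, 256  # the two $fn values from the snippet above

preview_facets = uv_sphere_quads(preview_fn)
final_facets = uv_sphere_quads(final_fn)
ratio = final_facets / preview_facets

# Facet count grows with the square of $fn, before any minkowski/hull cost is added.
print(preview_facets, final_facets, ratio)  # 512 32768 64.0
```

So even before the nonlinear cost of the mesh operations themselves, every curved surface carries roughly 64 times fewer facets in preview than in the final render.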

Amazing work, and a fantastic idea, congratulations to the author. I am definitely trying this next weekend!
