surround's comments

I think you're right. Try asking GPT-5 this:

> Are the parentheses in ((((()))))) balanced?

There was a thread about this the other day [1]. It's the same issue as "count the r's in strawberry." Tokenization makes it hard to count characters. If you put that string into OpenAI's tokenizer [2], this is how the characters are grouped:

Token 1: ((((

Token 2: ()))

Token 3: )))

Which of course isn't at all how our minds would group them together in order to keep track of them.

[1] https://news.ycombinator.com/item?id=47615876 [2] https://platform.openai.com/tokenizer


This is mostly because people wrongly assume that LLMs can count things. Just because it looks like it can doesn't mean it can.

Try to get your favourite LLM to read the time from a clock face. It'll fail ridiculously most of the time, and come up with all kinds of wonky reasons for the failures.

It can code things that it's seen the logic for before. That's not the same as counting. That's outputting what it's previously seen as proper code (and even then it often fails, probably 'cos there's a lot of crap code out there).


Don’t ask the LLM to do that directly: ask it to write a program to answer the question, then have it run the program. It works much better that way.
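
A minimal sketch of the kind of program an LLM could be asked to write for the parentheses question above. The name `is_balanced` is just an illustrative choice, and it only handles round parentheses:

```python
def is_balanced(text: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opener.
                return False
    return depth == 0


# Per the token grouping quoted earlier, the string has 5 opens and 6 closes.
sample = "(" * 5 + ")" * 6
print(is_balanced(sample))
```

Counting with a depth variable sidesteps tokenization entirely, because the program sees one character at a time.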

But for Lisp, a more complex solution is needed. It's easy for a human Lisp programmer to keep track of which closing parenthesis corresponds to which opening parenthesis because the editor highlights parenthesis pairs as they are typed. How can we give an LLM that kind of feedback as it generates code?

That's a different question than the one you asked. Are you saying LLMs are generating invalid LISP due to paren mismatching?

That's what the comment I was originally replying to was saying.

If the LLM is intelligent, why can’t it figure out on its own that it needs to write a program?

The answer is self-evident.

Does the AI's performance drop if it uses single letters as tokens rather than multi-character tokens?

Try asking an LLM a question like "H o w T o P r o g r a m I n R u s t ?" - each letter, separated by spaces, will be its own token, and the model will understand just fine. The issue is that the computational cost of attention scales quadratically with the number of tokens, so processing "h e l l o" is much more expensive than "hello". "hello" has meaning, "h" has no meaning by itself. The model has to waste a lot of computation forming words from the letters.
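
A rough back-of-the-envelope sketch of that scaling. It assumes self-attention compares every token with every other token, that "hello" is a single token, and that "h e l l o" is five tokens. Real tokenizers and models will differ, and `attention_pairs` is just an illustrative name:

```python
def attention_pairs(num_tokens: int) -> int:
    """Number of token-to-token comparisons in one full self-attention pass."""
    return num_tokens * num_tokens


word_tokens = 1    # assumed: "hello" encoded as a single token
letter_tokens = 5  # assumed: "h e l l o" encoded as one token per letter

ratio = attention_pairs(letter_tokens) / attention_pairs(word_tokens)
print(ratio)  # 25.0
```

Under those assumptions, five times as many tokens means twenty-five times as many comparisons.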

Our brains also process text an entire word at a time, not letter-by-letter. The difference is that our brains are much more flexible than a tokenizer, and we can easily switch to letter-by-letter reading when needed, such as when we encounter an unfamiliar word.


The graphic that shows that a hijacker can route traffic to their malicious website is a little misleading. Since the SSL certificate would be invalid, browsers would block the connection and show a warning.

I guess the attack could still be used for denial of service.


Once you have control of the destination, you could get a valid SSL certificate with Let's Encrypt or whatever.

Wow I'm surprised, you're right, and it has happened before:

> the attacker issued and registered a free temporary 3-month certificate for the developers[.]kakao.com domain through SSL certificate issuer called ZeroSSL. Because the routing policy was already manipulated by the BGP Hijacking, the attacker was able to register the certificate.

https://medium.com/s2wblog/post-mortem-of-klayswap-incident-...


You could mitigate this by monitoring certificate transparency logs for unwanted certificates issued for your domain.
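
A minimal sketch of the checking step, assuming you have already pulled the certificate entries for your domain from some CT log search service. The field names `common_name` and `issuer`, the function name, and the sample data are all hypothetical; real CT tooling uses its own schema:

```python
def unexpected_certificates(entries, expected_issuers):
    """Return entries whose issuer is not on the allowlist.

    `entries` is a list of dicts with hypothetical keys "common_name" and
    "issuer"; this is not the schema of any particular CT service.
    """
    allowed = {issuer.lower() for issuer in expected_issuers}
    return [e for e in entries if e["issuer"].lower() not in allowed]


# Made-up sample data for illustration only.
sample_entries = [
    {"common_name": "example.com", "issuer": "Let's Encrypt"},
    {"common_name": "example.com", "issuer": "ZeroSSL"},
]
suspicious = unexpected_certificates(sample_entries, ["Let's Encrypt"])
print([e["issuer"] for e in suspicious])  # ['ZeroSSL']
```

Anything this returns would still need a human to decide whether the certificate was actually unwanted.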

Currently there are no good monitors, though, so the system is a bit broken.



It sounds like that one may have been the result of a "lawful intercept", so perhaps not necessarily BGP hijacking. If you have legitimate control of the ASN/network, it's not a hijack.

Your posts: https://twiiit.com/hac

2020 - "Ping"

2021 - "Pong"

2023 - "Boop."

2023 - "Bleep"

2023 - "will inventing new technology be the solution to our problems?"


People can use Twitter actively and not post. That’s not really a reason to take someone’s handle away.


The obvious reason is, of course, money.

Since rare handles can generate high prices and are returned to auction once the buyer fails to meet their obligations, Twitter has a strong incentive to increase the number of handles in its auction pool.

The relevant product manager has probably ranked existing attractive handles according to their expected mobilisation/outrage potential and started confiscating handles from the bottom of that list.

This is probably also why you won't be notified about their auction of your handle, even though you'll receive email alerts for irrelevant stuff all the time. The process looks designed to be stealthy.

Money really is the trivial Occam's razor explanation here.


I can't believe X would take back the account of such an active and valued member of the community who is clearly not squatting on the name or anything.


Squatting is something you do to someone else's property. It implies that there is someone else out there with a more legitimate claim to the @hac handle, which there isn't. It's not as if we're talking about @google or something.

If I stole your house and sold it because I didn't think you were using it properly, that would clearly be illegitimate. I don't see why the rules change when we talk about someone's twitter handle. Nobody needs @hac. X merely wants it and has the power to take it.


But you don't own it. X does. It's their service, they are free to apportion handles as they see fit. It is nothing like a house where you have an actual ownership claim through the deed.


It's less like having the house taken away, and more like having your house's street address reassigned to someone else's house. Sure, no one's taken your land. Your deed gives you ownership of parcel #530453080, not of the identifier "123 Vine Street", so nothing you legally own has been taken from you.

But it's your identity. It's the way you've been putting yourself into the world and telling people they can reach you there. It used to be that if someone sent a message to that address, or tried to navigate to that address, they would reach you; but now, they'll be taken to somewhere else, and they perhaps won't even realize what's happened.

And for the ownership issue, sheesh. Yes X, in a literal sense, owns all the usernames. We're talking about whether it's morally right for them to do, not about whether it's illegal. If they had held back these short "valuable" usernames from the beginning, no one would care; it's the act of taking away someone's established identity that is problematic.


This "ownership" or rather "identification" is a significant part of the service though.

It wouldn't have been so successful if everybody were called "Anonymous", meaning that they wouldn't have been able to make money with it.

They've started to take this away now. Today it's some account with obviously few words. Tomorrow it might be one with wrong words. What you counted as value is nothing. It might be lost tomorrow, so why bother?


God, how I hate all those "well ackchyually" idiots who think TOS are the only contract there ever was ignoring social norms that were there for literally decades.


[flagged]


> can we please not play stupid.

Hmmm who is playing stupid?

Monolithic internet social services are run by private companies with TOS that no one reads and that keep changing, services that barely anyone pays for (except through their data).

We should definitely normalize this so that people see what the internet actually is for the vast majority of people.


> but there's something of a grand social contract that keeps the concept of accounts on websites working

no there's not. this is complete and utter fiction. the things that keep it working are ads and normal users putting their eye in front of them, and the tos to make any silly claims of "social contracts" legally and absolutely moot.


It’s playing stupid to pretend that the theft of a hardly used handle has anything to do with an actual user account. I’m sure if @hac had a presence online, their handle wouldn’t have been sold from under them.


Since when do you "own" social media handles? Maybe you should, but that's not reflected in the laws of our countries or the policies of these platforms. They own your presence, your content, and your reach. This is our "solution" to self-publishing. Do you want change? Advocate for it.

Of course, if you advocate for a system with no equivalent to eminent domain you'll quickly discover why the rule exists.


X already owned it.


Yeah well Google owns my Gmail address, but they'd sure ruin my life if they gave it to someone else. It's not acceptable.


People have accounts and never post. Since X makes it mandatory to be signed in to read anything on the site meaningfully, there would be millions of such accounts with limited post history. And that doesn’t even include the fact that people sometimes go away from a platform for months for a variety of reasons.


So if you sign up just to be able to read Twitter's gate-kept content you should assume they can pull the rug out from under you?


I think that account is a work of art and should have been kept as digital heritage.

I mean: ping and then a year later pong? Priceless.


This is unironically deeper than 90% of what's expressed on this platform


Trust your own style, even if you aren't a native English speaker. Here's an example where a non-native speaker used an LLM to polish his post. The general consensus was that his own writing was preferable to the LLM's edited version.

https://news.ycombinator.com/item?id=45591707

For dyslexia, use a spell-checker. For grammar, use a basic grammar checker, like the kind of grammar checker that has come with MS word since the 1990s. But don't let a style-checker or an LLM rob you of your own voice.


> The general consensus was that his own writing was preferable to the LLM's edited version.

I don't believe a single one of those people.

> For grammar, use a basic grammar checker, like the kind of grammar checker that has come with MS word since the 1990s.

Those are notorious for false-positives, false-negatives, and generally nonsensical advice. Not that the LLM-based alternatives are much better (looking at you, Grammarly), but still.


How do you know if it's real?



> What You Can Do To Protect Yourself

> 1. Disable your mobile advertising ID

> 2. Review apps you’ve granted location permissions to.

I'm surprised they missed the most important step, which is blocking the advertisers from collecting your data in the first place. This is easily done in the browser with uBlock Origin and system-wide with DNS filtering.


How do you do the system wide DNS filtering?


pihole is one way, though it's tricky to do it right
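
The basic idea behind DNS filtering is that the resolver refuses to answer for names on a blocklist. A minimal sketch of the matching step, assuming a blocked domain also blocks its subdomains. This is a hypothetical illustration, not how Pi-hole itself is implemented, and the blocklist entries are made up:

```python
def is_blocked(hostname: str, blocklist: set[str]) -> bool:
    """Return True if hostname or any parent domain is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example", then "b.example", then "example".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


blocklist = {"ads.example", "tracker.example"}  # made-up entries
print(is_blocked("cdn.ads.example", blocklist))  # True
print(is_blocked("news.example", blocklist))     # False
```

A filtering resolver would run a check like this on every query and return an empty or null answer for blocked names.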


Anna's Archive announced they intended to infringe on the label's copyrights by distributing their music without a license. The law allows the court "to prevent or restrain infringement of a copyright" (emphasis mine).

https://www.law.cornell.edu/uscode/text/17/502#:~:text=Any%2...


Spotify does not own that copyright, only a distribution license. How can they get away with it?


The plaintiffs are actually record companies, Spotify is tacked on at the end for some reason, and the article decided to confuse matters :)


Rights can be extended through contracts. A lawyer at Spotify might think to put in: "we distribute the music for you, your right to enforce copyright or otherwise litigate on behalf of that music is also extended to us as if we also own it".

The legal language would be different, that's a dumbed down version.


I do understand what can happen (I'm an IP lawyer), but this basically requires enabling Spotify to act as your attorney, since they still do not in fact own the rights, even with this. You can't manufacture standing here - only folks who are exclusive rightsholders can sue. Period. So it would require giving them power of attorney enabling them to sue on your behalf, since you (or whoever) still own the exclusive rights.

I strongly doubt their contract terms have this in there, it would be fairly shocking.

I say this having seen tons of these kinds of contracts, even with Spotify, and never seeing something like this.


What I have seen in practice (not with Spotify) is that a law firm that is cozy with both entities will be delegated standing: the "powers" in power of attorney, but with clauses defining a limited scope, plus "escape hatch" and "kill switch" clauses.


With the amount of content that has been described, it's not unlikely that Spotify actually owns some tiny fraction of it. They probably have some half-assed record label that owns two songs by a nobody.


Apparently you can win anything you want in a default judgement, no matter how ridiculous. When you know the other side won't show up because they'd be handcuffed, this is a useful way to achieve your goals.


Nah - the plaintiffs include record companies, who do have rights here.


We need to abolish copyright laws entirely. This is just the ten millionth instance of them being abused by the 1% to harm the majority.


In times of AI this doesn't sound like the ideal solution either


Wikipedia says

> Ek's initial pitch to Lorentzon was not initially related to music, but rather a way for streaming content such as video, digital films, images or music to drive advertising revenue.

So yes, they were always intending to get revenue from ads. And yes, the initial pitch included other types of media too. But I don't think we can call Spotify "an ad platform" that "never actually cared about music" any more than we could call Ars Technica "an ad platform that never actually cared about tech news."


Did you know that Ars in Ars Technica stands for Ass, showing how badly they really thought of technology?

/s


Sarcasm doesn't mean bad jokes.


And this is exactly why I had to use /s. Because some people would not understand that it was written tongue-in-cheek, while some others would fail to see the larger context and confuse my sarcasm with a simple joke (sure, as a joke it is bad; and that's precisely because it was optimized to be sarcastic, not joke-funny).


Where does it talk about his use of English or his lawyers?

