Hacker News | rhuber's comments

That's totally reasonable, and I agree that using something hosted entirely by a 3rd party makes sense for some use cases. Our reason goes a bit beyond security concerns, in this case. We built Nebula for large scale deployments, and because of that, we have made decisions that lean into that model for hosting.

Our decision to leave lighthouse hosting in the hands of users has one primary rationale: We want users to have complete control over their network availability. Any downtime of our service should not impact their network availability. You can even host some of your lighthouses inside of network boundaries to ensure that an internal network functions properly if its connection to the internet is interrupted. Other overlay options may continue to work for some time, but new connections are often not possible, and the network can degrade rapidly.
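As a hedged illustration of that model, a host could point at two self-hosted lighthouses, one public and one inside the boundary. The field names follow Nebula's documented YAML config; all addresses and hostnames here are made up:

```shell
# Sketch of a Nebula host config fragment with two self-hosted
# lighthouses: one reachable over the internet, one internal.
cat > /tmp/nebula-host.yml <<'EOF'
static_host_map:
  "192.168.100.1": ["lighthouse-ext.example.com:4242"]  # public lighthouse
  "192.168.100.2": ["10.0.5.2:4242"]                    # internal lighthouse
lighthouse:
  am_lighthouse: false
  hosts:
    - "192.168.100.1"
    - "192.168.100.2"
EOF
```

If the internet link drops, hosts inside the boundary can still reach the internal lighthouse, so lookups between internal hosts keep working.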

Relays are a similar story, but with an additional reason: We don't have to limit our customers' relay bandwidth due to cost. When hosting relays on behalf of others, we would be transiting a lot of traffic, which has an associated (sometimes unpredictable) cost. By letting our customers host relays, they can ensure relay traffic is just as fast as direct connections.
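As a sketch of the self-hosted relay setup (field names follow Nebula's documented YAML config; the relay address is made up): one host volunteers to relay, and clients opt in to using it.

```shell
# On the relay host itself: agree to forward traffic for peers.
cat > /tmp/relay-host.yml <<'EOF'
relay:
  am_relay: true       # this host will forward traffic for peers
EOF

# On other hosts: opt in and name the customer-hosted relay.
cat > /tmp/client-host.yml <<'EOF'
relay:
  use_relays: true
  relays:
    - "192.168.100.9"  # Nebula IP of the self-hosted relay (hypothetical)
EOF
```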


(*blog post author here)

Thanks for sharing this on HN! I'll keep an eye on the comments and try to answer questions that come up.


Love Nebula, keep up the good work!

Do you think the tech landscape today would have allowed for Nebula to be born? Lots of companies now have strict IP agreements they have team members sign.


That's a great question! One of the things I enjoyed during my time at Slack was their willingness to contribute to open source projects. We had similar IP clauses, but asking permission to open source things was straightforward.

The most important concern (IMO) was whether we could commit to properly maintaining a project. Before open sourcing anything, you need to discuss how you'll manage an issue and pull request backlog, so that people don't come across "dead" projects under your stewardship.

In a high growth startup, I do think something like this could happen again, but as a company grows, there are certainly more layers that can make it difficult to share things openly.


Seeing as Slack was born as a tool inside Glitch which existed only because of a side project called Flickr…

I don’t really think it’s the size or layers of a company that prevent this; it’s the culture. This culture of creation permeates everything I’ve seen Stewart Butterfield do. At least from the outside. Admirable and extremely profitable.


Just wanted to chime in along with the others that I also love Nebula, and I'm really grateful to have a mesh option that is modern, truly decentralized, and self-hostable. Nebula is also just plain elegant IMO, one of those pieces of software that clicks right away, top to bottom. Now I just hope it gains momentum to make it into a wider variety of tools so it becomes ever more accessible. So again, thank you and everyone else who had a hand in it!


Hey Ryan! Love Nebula, miss you at the day job


Thanks Harrison, hope you're well!


We need to do a better job of this and I'm really sorry you had a not-great experience with expiration. Totally agree with your take.


I hope I don't come across as too negative! Sure, I'd love to see some improvements here, and they would help adoption amongst hobbyists / home users, but I totally understand focusing on the features needed to make the business work first.

The existing open source functionality for the overlay network itself is (for me) what's really exciting, and it's all there. The management limitations just keep me from evangelizing more broadly (outside of places like HN).


(Nebula coauthor here)

People sometimes ask me to describe the differences between Nebula and Tailscale. One of the most important relates to performance and scale. Nebula can handle the amount of internal network traffic and scalability of nodes (100k+ nodes, constant churn) required on a large network like Slack's, but Tailscale cannot. Tailscale's performance is fine for many situations, but not suitable for infrastructure. It is just a fundamentally different set of goals.

Nebula was created and open sourced before Tailscale was offering their product, but their architecture is similar to older offerings in the market, and is something we purposely avoided when creating Nebula.

Fwiw, I even recommend Tailscale to friends who want to do things like connect to their Plex server or Synology or [other thing] at home remotely. It simplifies this kind of thing greatly and doesn't require you to set up any infrastructure you control directly, which can be a headache for folks who just want to reach a handful of computers/devices.


> Fwiw, I even recommend Tailscale to friends who want to do things like connect to their Plex server or Synology or [other thing] at home remotely. It simplifies this kind of thing greatly and doesn't require you to set up any infrastructure you control directly, which can be a headache for folks who just want to reach a handful of computers/devices.

First thanks for working on Nebula! It's great.

Nebula seems to be about 95% there. The functionality it actually does provide once set up is really great. It's just missing the 5% that is arguably the most important for a huge number of people: a simple way to do the configuration management bits such as device enrollment, revocations, key rotations, that sort of thing.

If you are a home user, with a small network, the overhead of doing things manually is low, but you need to be patient and technical enough to read the docs and do it right initially. If you're a big enough organization I guess you can write your own tooling. But for any small shop or any non-technical home user this is not going to fly and you will bounce off it.

I don't know if the plan is to create a commercial offering for this side of the house (it would make sense...) but as far as I'm concerned, this is the only reason that Tailscale is so successful and Nebula is lesser known (despite Nebula's advantages in other ways that may be more relevant to technical users).


The Nebula CA we built at Slack was very specific to Slack's internal devops, and just wasn't generalizable. It is highly automated there, and is custom tooling, just as you describe. The open source version is somewhat bare bones (a command line tool for CA vs something like vault).

I will say that the OSS tooling of Nebula is everything someone needs to stand up an entire working network on every common platform (linux/mac/windows/ios/android), but there is a definite gap in simplification that we need to address to make it easier for smaller scale use cases.

We actually have a managed enterprise Nebula offering at my current gig, but that's a rather different market than Tailscale's, so I'm speaking here as a Nebula OSS project lead rather than for that company. The commercial offering is targeted at large enterprises, because that's the market where Nebula has unique advantages. It also means we don't currently have a freemium or SMB-type offering, and we are not prioritizing creating one at all. I don't want to give people false hope that we will, and would prefer to see the OSS project improve to address the small-medium use cases.


Tailscalar here. Tailscale can handle 100k+ nodes with lots of churn just fine.


Fair enough. I am sure the key distribution is fast and all that, but not needing peer key distribution at all was a goal and the overhead associated is less scalable than just not doing it at all. Regardless, very cool that you can handle that many nodes, which is a hard problem. I assume you do just-in-time key distribution or something, because (n-1) distribution of peer keys would be ... less than ideal.

Anywho, the more important bit is my point about performance. Nebula is significantly faster than userspace Wireguard, and plain userspace Wireguard is (last I checked) a bit faster than Tailscale, due to the additional code needed for things like your ACLs. At gigabit type scale it is probably fine and not noticeable, but at Slack, we needed to scale to 10G+ on links, while ensuring we didn't take a significant hit on CPU resources.

Again, I think Tailscale is very good for its target use case as a VPN replacement, and congrats on raising these funds!


> the overhead associated is less scalable than just not doing it at all

That's only true if you can actually articulate a reason why it won't scale to some magnitude that some user might actually need today or at some point in the future.

For example, Go may be "not as scalable as C" (or vice versa! Or both!), but what matters is the scale to which it is actually desired to be deployed.


I mean... the title of the Tailscale blog post is "Tailscale raises $100M… to fix the Internet", and that's pretty massive scale. /s

I don't have 100k hosts on a large network to test deploying Tailscale, but if I did, I'd be benchmarking the cpu/network/storage overhead of telling 99,999 hosts about a new one that comes online, every time that happens, or every time its pubkey changes. You can optimize this away _if_ your "fan out" is not as large, but there are plenty of cases where every host on your network needs to talk to a particular host, so all of them need to know about its keys as soon as possible.

Again these aren't unsolvable problems, to a point, but we didn't want to solve a problem when we could avoid it entirely, so that's the path we chose. It removes complexity and is a good part of the reason the system we built has been resilient.

A complaint some people express about Tailscale is the battery life on mobile (or at least iOS). This exists because there is coordination overhead even on idle Tailscale nodes. Back when we ported Nebula to iOS, we sweated details like "how often it wakes the radios" and did a lot of profiling. I never turn Nebula "off" on my iPhone, and it just sits there in the background not using any resources most of the time.

We worked hard to optimize this out of our architecture, so that Nebula avoids generating traffic that is unrelated to the actual communication between hosts or lookups to lighthouses. An idle nebula tunnel can truly be idle indefinitely, and that also matters as the set of hosts becomes larger.

I do not think the Nebula project and Tailscale are direct replacements for each other in any fashion, and afaik neither is trying to be. I'm just pointing out that different design goals led to unique advantages and disadvantages to each architecture.


Does Nebula have anything like Tailscale's rules engine? I am absolutely in love with being able to configure all my connections by just specifying a JSON file somewhere. No need to have firewalls, the configuration specifies which service or user can talk to which.

That having been said, I also am wary of using Tailscale for the same reasons as above, I have to trust Tailscale and Github? I can maybe justify trusting Tailscale, but trusting GH/Microsoft/other SSO provider is a bridge too far.


It does! In fact replacing AWS security groups and making them cross region and cross platform was probably the first goal of the project. My coauthor, Nate, wrote Nebula's internal firewall code before we wrote a single line of the actual protocol, because he wanted to ensure it was performant enough for massive scale.
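As a sketch of what that looks like in practice (field names follow Nebula's documented per-host firewall config; the group name is made up): rules match on port, protocol, and certificate-embedded groups, playing the role of a cross-provider security group.

```shell
# Hypothetical Nebula firewall section: allow all outbound, and only
# let hosts whose certificates carry the "webserver" group reach 443.
cat > /tmp/nebula-fw.yml <<'EOF'
firewall:
  outbound:
    - port: any
      proto: any
      host: any          # allow all outbound traffic
  inbound:
    - port: 443
      proto: tcp
      groups:
        - webserver      # only certs carrying this group may connect
EOF
```

Because the groups are baked into each host's signed certificate, a peer can't claim membership it wasn't issued.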


Well that is great, thank you! I will play with it today.


Ah, it looks like the firewall rules need to be copied to each host separately. That's not a dealbreaker, but not as easy to deploy as having them managed centrally (by the lighthouse, I guess?).


> People sometimes ask me to describe the differences between Nebula and Tailscale. One of the most important relates to performance and scale. Nebula can handle the amount of internal network traffic and scalability of nodes (100k+ nodes, constant churn) required on a large network like Slack's, but Tailscale cannot. Tailscale's performance is fine for many situations, but not suitable for infrastructure. It is just a fundamentally different set of goals.

Making broad claims like this without a source or links to benchmarks feels like FUD to me. For example Tailscale's comparison page on performance (https://tailscale.com/kb/1148/tailscale-vs-nebula/#performan...) doesn't mention a meaningful performance difference, so if you're claiming they're not telling the truth (by omission), I'd hope to see more to that than just a straight assertion, even just "We tried Tailscale in Slack's network and it wasn't able to keep up with our usage patterns".


Another fair criticism. We will publish the benchmarks and make them repeatable (which most existing ones I've found don't bother to do). We hadn't done so because Tailscale isn't really seen as a direct competitor to what the Nebula project is doing, but if people want numbers, that's a thing we are happy to provide.


That's fair, if you've been benchmarking but haven't made the benchmarks public / repeatable yet. Too used to software where the authors claim it's fast with no proof or based on heuristics like what language it's written in :-)


So "People sometimes ask me to describe the differences between Nebula and Tailscale" and the answer is "performance and scale", but you don't have clear comparisons for those numbers?


We have an automated set of ansible scripts that spin up large groups of hosts for Nebula performance regression testing, and a while back I added zerotier, tailscale, wireguard-userspace, wireguard, tinc, ipsec, and openvpn to that automation so I could get a sense of where things stand. I spent a lot of time optimizing each of the above options to make fair comparisons, but it was mostly for my own and the team's curiosity, and we weren't interested in playing benchmark-fight with the similar software of the world.

Publishing repeatable benchmarks is hard, and when doing open source work, it just hasn't been a priority. As I replied above, if I'm going to say it I should prove it, and I promised to do just that.

And a counterpoint: tailscale does mention in the "Tailscale vs Nebula" article on their website that performance is just about the same but similarly provides no proof. This is motivation enough for me to show proof of the opposite, I guess.


See, I have seen promotions of Tailscale and Zerotier before, but this is the first I have heard of Nebula. If with Nebula I am not beholden to some internet behemoth who may cancel my authentication without notice, I am motivated to try it.


Nebula rocks!


(coauthor of Nebula)

We briefly considered building something atop Wireguard in the early days of Nebula, but decided not to do so because of scaling. Wireguard's protocol necessitates that all nodes have existing keypairs for each other ahead of time. At Slack's scale, that means every time a fresh node is launched, you would have to tell 50,000 other nodes it exists.

Obviously you can smarten this up and tell only hosts it might talk to. But this adds complexity. Using PKI eliminates this key distribution problem and means that you don't have the same scaling limitations as something built on WG.
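The back-of-envelope arithmetic behind that concern can be sketched as follows. The numbers are illustrative assumptions, not a measurement of any particular product:

```shell
# Fan-out cost of full-mesh key distribution: every key change must
# reach every other host. Both numbers below are assumed for illustration.
hosts=100000
churn_per_day=1000        # hosts launched or rekeyed per day (assumed)
echo $(( churn_per_day * (hosts - 1) ))   # key updates pushed per day -> 99999000
```

With certificate-based PKI, that number is zero: a new host presents a cert signed by the CA, and peers verify it on first contact without any prior key exchange.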

Wireguard is a very very good VPN, but I cannot imagine trying to run something on the scale of tens of thousands of nodes when you need such complex coordination systems to exchange keys/trust, especially in a dynamic environment where nodes are coming and going all the time.

I totally get that it's solvable overall, but Slack has had 4 years of nearly perfect uptime on Nebula, whilst using it to pass >95% of all backend traffic. These considerations may seem simple to address, but there are fundamentals that mattered and led us to writing Nebula. We didn't want to create something new, but to do what Slack needed, we had to.


Actually it does do that, you can trust multiple CAs in a single instance and even write firewalls scoped to CAs.


"Hey guys, remember when freenode was taken over by one guy?" (It was, like, yesterday.)

It's hard to overstate the dangers of over-centralisation like this, and I say it as a person who used freenode professionally.

https://www.kline.sh/


...and the moderators who resigned immediately set up their own network under their own terms (libera.chat), so I'm not really sure what your point is? The difference is that you can jump to another server with an IRC client, you can't jump to another Slack with the Slack client.


The author of this article should consider following their own advice, since they have a woefully outdated RSA-1024 ssh key securing their GitHub account.

    $ curl -s https://github.com/apenwarr.keys > blah
    $ ssh-keygen -l -f blah
    1024 SHA256:1IWAUSXOcCKLcmOdAec8JbDt3T75udA4KSpRosEWUaU no comment (RSA)

(update: they have now replaced it with an RSA 2048 bit key. progress.)


It would still take a long-ass time to brute force a 1024-bit key unless there is no brute force detection. Alternatively, capturing the traffic can allow brute forcing the applied algorithm itself.


I wasn't commenting on the strength of RSA-1024, per se, but on the assumed age of that key. OpenSSH's ssh-keygen hasn't defaulted to 1024 bit RSA keys since before version 4.2, in 2005. (I had to look it up: https://www.openssh.com/releasenotes.html)

You can still generate a 1024 bit RSA key, but someone would have to go out of their way to do so, and I can't imagine why they would have done that in the past .. decade?


> I can't imagine why they would have done that in the past .. decade?

Maybe they aren't using software keys, but rather a low quality/older/small-kb hardware token or following the default guide for one? The vast majority supported 2048 in 2010 though..


It's a public key, you can perform the "brute force" (factorisation) entirely offline, to derive the private key. Hypothetically. For now, RSA-1024 is too expensive to crack, for mere mortals.


Alternatively, there's no reason to use 1024? I've been using 4096 for maybe a decade now.


It is, admittedly, impossible for me to be unbiased in this discussion (coauthor of Nebula, hi), but I strongly disagree that your client code is your most important component, from a trust perspective.

Your coordination server tells every node about every other node and distributes the keys for the entire network. Everything on a tailscale network implicitly trusts your coordination service.

If an individual client is compromised, code or otherwise, the effect is more limited than your coordination service being compromised, in which case the entire system's trust is broken.


Every aspiring programmer should learn C. They should also learn multiple dialects of ASM. Learning these things helps you better understand how computers fundamentally work, and knowing how computers work pushes you to write better code in any language. Ignoring how computers organize data can lead to inefficient code in any language.

That said, if you are writing something new, you should carefully consider whether C is the best choice before using it. If you are working within a nearly-impossible-to-replace + enormous codebase (such as the Linux kernel), it is the only option. If your project's #1 goal is performance above all else, and you are a seasoned or aspiring expert, perhaps C is the best choice. The majority of people writing software do not fall into either of these categories.


I feel like the length of the list of things "every programmer should know" is approaching infinity.


From my experience working with people, one can be an excellent programmer knowing just one language, whether it is C#, Ruby, or Java. On the other hand, I have met (too many) people who "knew" a lot but who were quite bad at putting that knowledge into practice.


I'm not sure I agree. Maybe the list of things presented by bloggers as things every programmer should know, but C would not be a new addition to such a list in any case.


I feel like the list of things "every programmer should know" tends to include "all the things I know," in which case you may be right.


Neither C++ nor Rust has any characteristic that would make it slower than C. It all comes down to what your existing codebase is and what kind of experts you have. Also, both languages can mix with C. For example, the Linux kernel could easily accept C++ code if they wanted to, but it's a conscious decision not to do so. So, yeah, C is not the only option.


Even in an existing codebase written in C it may be possible to avoid it, if you are able to choose whether to introduce a new language for some components.


> ...if you are able to choose whether to introduce a new language for some components...

Well, there are complexity, readability and maintenance costs associated with introducing any new dependencies into a project. And (in my opinion) these costs go in the following order, from small to high:

1. header-only library
2. compiled library
3. compiled library with type framework
4. framework
5. meta-programming framework
6. custom meta-programming framework
7. domain specific language
8. custom domain specific language
9. generic programming language
10. custom generic programming language

As a side note, an increase in complexity might also increase the job security of a developer. If you are after it, design your own programming language and try to deliver every project implemented in it :) Should your projects be in high demand, your skills will be in high demand as well!


Yeah, there's obviously a cost involved. It's not something I'd want to do more than once within a project.

