I'm one of the authors of this. Happy to answer any questions.
One of the fun technical details is that, when enabled on a machine (tailscale up --ssh), the userspace tailscaled process takes over all TCP port 22 packets after the WireGuard decryption and doesn't even feed them into the kernel over TUN. We use gVisor's netstack to handle the TCP connections in-process.
So it doesn't matter whether you have other processes (or iptables rules, etc) that would prevent the Tailscale SSH server from binding to port 22. This lets people gradually use Tailscale SSH over time without messing with their system one.
The Tailscale SSH server currently runs only on Linux. There's support in git main for macOS too, but it's not super well tested yet and not included in the sandboxed GUI builds currently.
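As a rough mental model of the port 22 interception described in this comment, here is a small Python sketch. It is not Tailscale's code (tailscaled is written in Go and uses gVisor's netstack); the `Packet` class, the `route_decrypted_packet` function, and the "netstack"/"tun" labels are hypothetical names made up for illustration. The only point it tries to capture is that the decision happens after WireGuard decryption and before anything is handed to the kernel.

```python
from dataclasses import dataclass

SSH_PORT = 22


@dataclass(frozen=True)
class Packet:
    """A packet that arrived over the tailnet and was already decrypted (hypothetical shape)."""
    proto: str      # "tcp" or "udp"
    dst_port: int


def route_decrypted_packet(pkt: Packet, ssh_enabled: bool) -> str:
    """Decide where a decrypted tailnet packet goes.

    Returns "netstack" when the in-process TCP stack should handle it,
    or "tun" when it is handed to the kernel as usual.
    """
    if ssh_enabled and pkt.proto == "tcp" and pkt.dst_port == SSH_PORT:
        # Handled in userspace: the kernel, iptables and any local sshd never see it.
        return "netstack"
    return "tun"


print(route_decrypted_packet(Packet("tcp", 22), ssh_enabled=True))    # netstack
print(route_decrypted_packet(Packet("tcp", 22), ssh_enabled=False))   # tun
print(route_decrypted_packet(Packet("tcp", 443), ssh_enabled=True))   # tun
```

Note that the function only ever sees packets that came in over the tailnet. Traffic arriving on other interfaces never reaches this code path, which is why a system sshd on port 22 can keep running alongside it.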
Just a quick thank you to the team working on Tailscale. It’s hands down the most seamless dev experience I’ve ever seen. Every time I think “such and such would be nice”, I search the docs, and it’s already implemented better than I could have expected (eg the stateless mode for ephemeral servers).
I tinkered with cloudflare before that but just couldn’t get on with the interface of the admin tooling.
With Tailscale I have a lot more confidence that I’ve set up the access rules as I need them. It’s all just a lot more obvious.
> This lets people gradually use Tailscale SSH over time without messing with their system one.
That is something I have really appreciated about Tailscale. It seems to consistently not mess with the existing environment. Considering it does networking witchcraft and it works on a variety of architectures and OSs this is quite an accomplishment.
I suspect Tailscale's customers have found the same.
Not really. It messes with DNS big time. Try enabling the "MagicDNS" or "Exit Nodes" features, and watch as /etc/resolv.conf is edited with each change. I can easily reproduce scenarios where it's left empty and there's no working DNS resolution.
This is one of the major things I _don't_ like about Tailscale. I wish they'd just stick to enabling Wireguard and making the authentication easier (i.e., where they started). I'm not a fan of most of the features they've added since. I don't want service discovery, magic DNS, SSH key management and/or the kitchen sink bolted on.
But, yeah, without systemd-resolved Linux DNS is a fight to the death between uncooperating processes. NetworkManager is okay but there are a dozen buggy variants in the wild we have to work around.
Linux is by far the worst platform for DNS config.
I totally recommend systemd-resolved. It's the only thing that does DNS well on Linux.
Consistently I’m unable to use Tailscale on a GCP instance and also use GCP services cleanly, because it messes with the DNS route to the metadata server. Otherwise, it’s a great product.
The firewall is the system. It's just like Apple bypassing its own firewall and sending packets back home. Or the Chinese way.
Of course, as one of the authors said, the key is to control port 22 or the rule for SSH. So it's not a total loss. Still, even if that one is OK … you are breaking the system by promoting a way to bypass it. Or it's just one more rule. It is so hard to remember.
No, it's not. Network access control is the whole point of Tailscale; it is the network filtering layer. It serves literally the same function that a Checkpoint Firewall-1 installation would have in 1997, and that's why people buy it. This is basic stuff from the Tailscale website; it doesn't even qualify as analysis. You really ought to understand how these things work before you describe things as "big holes".
How was the decision made to roll this functionality out before announcing it to customers (we found it during a previous security audit)?
While it might seem logical in your mind to bolt on extra features and add value, your customers evaluate risk based on functionality of the software they are approving. Customer buys a VPN solution, magically gets remote access that bypasses firewalls. Can we trust Tailscale to not roll out a remote file backup feature and start silently exfiltrating data (as an extreme example)?
There are two things that have to be enabled to turn it on:
(1) a target server needs to run "tailscale up --ssh" to enable the SSH server
(2) your Tailscale ACLs have to permit it. Our default, if you've never set your ACLs (as is usually the case for personal users), is that you're allowed to SSH to your own untagged devices only.
For an org that's already using ACLs, you won't have any SSH rules defined and thus nobody in your org can enable the SSH server. (Or rather, they can enable it but nobody can connect to it.)
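To make the two conditions concrete, here is a toy Python model. It is only a sketch: the `can_ssh` function and the `src`/`dst`/`users` rule fields are illustrative assumptions loosely modeled on how SSH rules are described here, not Tailscale's actual policy evaluation, and it ignores groups, tags and wildcards entirely.

```python
def can_ssh(host_ssh_enabled: bool, ssh_rules: list, src_user: str,
            dst_host: str, login_as: str) -> bool:
    """Return True only if BOTH conditions hold:

    (1) the target host enabled the SSH server ("tailscale up --ssh"), and
    (2) some SSH rule in the ACLs permits this user, host and local account.
    """
    if not host_ssh_enabled:
        return False
    for rule in ssh_rules:
        if (src_user in rule["src"]
                and dst_host in rule["dst"]
                and login_as in rule["users"]):
            return True
    return False


# Hypothetical rule set: alice may log in to build-server as alice or root.
SSH_RULES = [
    {"src": ["alice@example.com"], "dst": ["build-server"], "users": ["alice", "root"]},
]

print(can_ssh(True, SSH_RULES, "alice@example.com", "build-server", "root"))   # True
print(can_ssh(False, SSH_RULES, "alice@example.com", "build-server", "root"))  # False
print(can_ssh(True, [], "alice@example.com", "build-server", "root"))          # False
```

The last call is the "org that already uses ACLs but has no SSH rules" case: someone can enable the server on a host, but no connection is ever permitted.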
Is your concern an org that's using the default "all packets are allowed" ACLs?
I can't speak for mike_d specifically, but there is a concern with having (potentially significant) modifications made to the codebase that aren't surfaced in the release notes. I imagine closed-source projects do this on a regular basis whether customers know (or care) or not.
The expectations for opensource projects are different though, particularly when it comes to system-level or near system-level components. So not being able to access the functionality is a great default but it doesn't address side effects of the changes or the desire to know about changes being made in our environments.
Of course it doesn't. Only the actual auditing of the code could do that. Nobody in the world wants to audit code they are relying on every update to make sure that the developers have not added potential new security concerns that they would otherwise not be aware of.
Concern more around what looks like an ssh backdoor showing up unannounced. How would they know the subtleties of what it takes to enable it when it wasn’t announced yet?
Try to look at it without your inside knowledge of how it works. Think about a customer discovering this with no documentation.
Until you decide to ship a completely on-prem Tailscale server, ACLs mean nothing. They can be modified by the same rogue employee that added an SSH server that bypasses local firewalls to our environment without telling anyone.
If you're unwilling to trust Tailscale and their processes, you can't run Tailscale right now. That's obvious. It's part of the premise. The idea that ACLs "mean nothing" is risibly reductive; the ACLs protect our team members from each other and mistakes they might make with their environments.
(We don't use Tailscale SSH, and are unlikely ever to; we have a separate source of authentication truth for SSH, and a separate certificate-based access control system.)
They built a footgun into a toaster, and victim blame when people complain that they thought it was just supposed to make toast. Users should not be put into a situation where they need to configure ACLs in anticipation of undocumented features.
My hope was that with a little public prodding they would do better in the future. It is a product I want to like, maybe not for what you or I do, but lots of folks out there are slinging cat pictures where it will be a net benefit.
If you're not already using Tailscale, with your security or IT teams controlling it, it would be malpractice to allow it on a controlled network. No competent security team allows people to introduce their own VPNs.
The way it works in enterprise is that principal engineers like me are generally given some freedom to explore new technologies responsibly. In my mind, that includes visiting the Tailscale website (which started being blocked by our IT yesterday) to gather information about whether this would be a good alternative technology for our research teams.
Now what I have to do is file a bunch of tickets and take a bunch of meetings to get a block removed from the overall site. Really, what I was trying to do is provide information to the Tailscale developers that enterprise already considers their website/product scary enough to do a whole block, and if they want to expand into enterprise, they may want to understand the reasons for that.
> Now what I have to do is file a bunch of tickets and take a bunch of meetings to get a block removed from the overall site. Really, what I was trying to do is provide information to the Tailscale developers that enterprise already considers their website/product scary enough to do a whole block, and if they want to expand into enterprise, they may want to understand the reasons for that.
Not all large enterprises are this dysfunctional. I'm sure Tailscale are doing just fine.
You're right. I guess my brain wouldn't let me process something as dumb as a corporate security control based on blocking a website to keep people from installing a binary.
Anyways, I'm just here to say, corporate security teams are definitely not OK with you doing a rogue Tailscale install, and that's as it should be.
> Anyways, I'm just here to say, corporate security teams are definitely not OK with you doing a rogue Tailscale install, and that's as it should be.
You might be shocked at how often I get "can you deploy a tarsnap server on port 443? My company's security team won't let me connect to your server on port 9279" requested.
I mean, it's trivial to bounce the TCP connection... but I'm not going to help subvert security policies.
I work for a Fortune 100 company and this is precisely what our corporate overlords do with non-approved VPN software. There are device management tools on every machine in the corporation to detect and block the software, but that isn't turned on. Just blocking the website. ¯\_(ツ)_/¯
Any good network security monitoring system should allow it to be fingerprinted in some manner, and if deep packet inspection is in use then it should be blocked outright.
It's likely just because it's a VPN not under control of the corporation. Corporations have this magical wand which they swing to make it hard for people to do their work :)
I've been using Tailscale for years but will likely not use this feature, even though I would like to.
The fundamental problem with the approach really is that connections are different over the tailnet and over the local network. Here is a specific use case that is painful:
1. There exists a cluster of machines, each with large amounts of locally attached storage. They are all on the same local network and connected with 10Gb (and likely soon 40Gb) Ethernet interfaces.
2. Each machine is individually on the same tailnet so they can be accessed remotely.
3. Remote users frequently need to move large amounts of data between machines. A user copying a few hundred gigabytes of data with "scp" is normal.
4. For performance reasons, it's preferred to avoid the Tailscale/wireguard overhead when copying data between adjacent machines in a rack.
At this point, if I enable tailscale ssh for remote login, it appears that the problem of key management for connections between local machines (using ssh over the normal interface, not the tailnet) still remains, and in fact, the overall authentication configuration is more complex than it was before.
What I would love to exist, and would make me instantly use this feature, is if the tailnet issued SSH certificates (probably injected into its own ssh-agent?), the existing tailscale SSH implementation worked just like it currently does (it's great!), AND I could manually configure servers to accept certificates issued by the tailnet. Then SSH paths like "laptop --> (over tailnet) --> server 1 --> (over local network) --> server 2" could be made to work transparently, for those machines that need it, and for regular users, it still "just works".
Oh that sounds exciting, would it also solve the current performance issues when moving large amounts of data? It's currently the only reason I still have to use public IPs for some applications.
If those machines are in the same rack, why don't you put them on the same subnet and use a different interface when moving files around instead of the tailnet?
That's exactly what we do, which is why adding Tailscale SSH to our current workflow isn't helpful, since we would still have to manage SSH keys for access via the local subnet.
> (...) the userspace tailscaled process takes over all TCP port 22 packets after the WireGuard decryption and doesn't even feed them into the kernel over TUN. We use gVisor's netstack to handle the TCP connections in-process.
> So it doesn't matter whether you have other processes (or iptables rules, etc) that would prevent the Tailscale SSH server from binding to port 22.
This sounds like a great feature when exploiting buggy WordPress/php apps! /s
I realize this is a feature, but it's a bit sad that the standard packet handling isn't up to the task, leaving (I expect) the tailscale daemon as a "magic" netcitizen that features in neither "ss" nor "iptables" output (why can't I login to opensshd?).
How do you figure? The idea is that Tailscale is bypassing the kernel, which it can only do for requests coming in over the tailnet --- it gets those packets raw, directly from WireGuard, unlike the normal IP packets your kernel routes to/from localhost or an egress interface.
You're right, it's not at all the same. The Tailscale bypass exists (1) only for traffic traversing Tailscale interfaces (by design, that's the only traffic it can impact, because Tailscale can't run a userland TCP/IP stack for non-Tailscale traffic), and (2) only for this one feature, and (3) only if you've explicitly allowed it for particular users in your Tailscale ACLs. It's not clear to me how you could screw it up to, e.g., amplify an SSRF attack.
You have to explicitly enable Tailscale SSH, both on the host and in the ACLs that allow users to use the feature. Tailscale's ACLs are much, much better than iptable rules (for instance: they have built-in unit testing).
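For readers who haven't seen what "unit testing" an access policy means, here is a minimal Python sketch of the idea: the policy carries its own assertions about who must and must not be able to reach what, and a policy change that violates an assertion is rejected. The rule and test shapes (`src`, `dst`, `accept`, `deny`) are simplified assumptions for illustration, and `allowed` and `run_tests` are hypothetical helpers, not Tailscale's implementation.

```python
# Hypothetical policy: each rule lists who may reach which "host:port".
ACLS = [
    {"src": ["alice@example.com"], "dst": ["db:5432"]},
    {"src": ["bob@example.com"], "dst": ["web:443"]},
]

# Assertions that travel with the policy.
TESTS = [
    {"src": "alice@example.com", "accept": ["db:5432"], "deny": ["web:443"]},
    {"src": "bob@example.com", "accept": ["web:443"], "deny": ["db:5432"]},
]


def allowed(acls, src, dst):
    """True if any rule lets src reach dst (exact matches only in this sketch)."""
    return any(src in rule["src"] and dst in rule["dst"] for rule in acls)


def run_tests(acls, tests):
    """Return a list of human-readable failures; an empty list means the policy passes."""
    failures = []
    for test in tests:
        for dst in test.get("accept", []):
            if not allowed(acls, test["src"], dst):
                failures.append(f"{test['src']} should reach {dst}")
        for dst in test.get("deny", []):
            if allowed(acls, test["src"], dst):
                failures.append(f"{test['src']} should not reach {dst}")
    return failures


print(run_tests(ACLS, TESTS))  # []

# An accidental over-broad rule is caught before it takes effect.
broken = ACLS + [{"src": ["alice@example.com"], "dst": ["web:443"]}]
print(run_tests(broken, TESTS))  # ['alice@example.com should not reach web:443']
```

The contrast with a pile of iptables rules is that nothing comparable travels with the ruleset by default; you would have to build the checking harness yourself.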
> Tailscale's ACLs are much, much better than iptable rules (for instance: they have built-in unit testing).
Humility helps a lot on the internet. The important thing about iptables is that it runs on millions, possibly billions of machines. Production systems that don't have unit tests but run at scale aren't worse than systems which are newly introduced but have fairly unknown implications.
I'm sorry, I really don't know what you're trying to say here. I'm evaluating a set of engineering tradeoffs and reaching a conclusion about them; I'm not trying to psychoanalyze them.
I'm trying to say that claiming you're better than iptables because your code has unit tests makes you look arrogant, because iptables is a production system that operates successfully at such a large scale that it shows unit tests aren't an accurate measure of quality. I'm saying that when people talk like you did and criticize prod systems, you look arrogant, and that humility (using terms like "we believe" rather than "is") helps a lot in building user confidence.
Again: these are engineering details, not people; they aren't "arrogant". There is lesser engineering and there is better engineering. As someone who does quite a bit of work with iptables and who has used ACL systems like Tailscale's, I can tell you right off the bat that Tailscale's system is better, and if you have the option of using one or the other --- there are good reasons you might not be able to --- you should use something like Tailscale's, which is identity-aware, testable, dynamic, and simple. Obviously, if you're not using Tailscale at all, this is a moot point, for many reasons, including the fact that if you're not using Tailscale, you don't have to think about how it interacts with your iptables rules.
I'm not making a value judgement about people who need to keep using iptables. I might be making a value judgement about people who demand that everyone else keep using iptables.
OK, you're free to completely ignore my advice that you look arrogant, and that it might affect the uptake of your product from the very people who could lead the way to increased adoption.
But, my point still stands. You can't simply assert your system is better, it has to be proven at a scale similar to iptables before you can say that.
> [only for..] requests coming in over the tailnet
Well, that's certainly different from "all TCP port 22 packets" - I suppose some emphasis should be on "after the WireGuard decryption" (ie: over the wireguard interface). It's not entirely clear from the comment (but probably clear to engineers working on the tailscale code).
I read it as if tailscale snapped up packets before the kernel from (all) network interfaces...
The fact that all the ways are horrible is indeed why I was so surprised to (mis)read tailscale as having such capabilities... I'm happy I misunderstood.
Hey bradfitz, guy who previously had 32150 here. :-) This looks insanely cool, a couple questions:
I know it says it's linux-only right now, but is that client side or server only? Can my Windows users TailSSH into linux boxes?
Would be cool if somehow it could wedge into sudo auth so you could login as a user and sudo without password if allowed by ACLs, especially if I could add "check" to the ssh. agent pam module?
One thing that has prevented me from trying Tailscale, despite the great word on the street, is I can't figure out pricing, despite contacting sales. I'd like to run it on ~120 dev+stg+prod VMs, with 10 people (devs, testers, ops). I'd like every box to talk over tailscale directly, as an overlay network, but servers I hope aren't users, that'd get expensive fast. But I need more devices than 10/user. I presume "custom" would help with that but I got no reply from sales. We are probably too small fry. Now that I'm typing this, I realize I guess we could just buy ~15-20 users despite needing only 10.
I think I've resolved myself to setting up Nebula for the server overlay network, and using Tailscale for physical users, with a traditional firewall bridging them.
Again, Tailscale SSH looks very nice, job well done!
Just to add to the above, pricing was a little obscure for me too, though I committed to Tailscale and then worked it out after the fact.
Minor suggestion for future and new users: is it possible to get a calculator where you could input the number of users you expect, the number of servers you want to include, and the expected unique ACLs, and have it provide an estimate of what your license cost would be?
> I know it says it's linux-only right now, but is that client side or server only? Can my Windows users TailSSH into linux boxes?
Linux-only on the server right now. macOS support is kinda there (in git) but not entirely done and not included in the GUI builds. Windows server support is tracked in https://github.com/tailscale/tailscale/issues/4697.
You can use any SSH client from any OS.
> Would be cool if somehow it could wedge into sudo auth so you could login as a user and sudo without password if allowed by ACLs
> One thing that has prevented me from trying Tailscale, despite the great word on the street, is I can't figure out pricing, despite contacting sales. I'd like to run it on ~120 dev+stg+prod VMs, with 10 people (devs, testers, ops). I'd like every box to talk over tailscale directly, as an overlay network, but servers I hope aren't users, that'd get expensive fast. But I need more devices than 10/user. I presume "custom" would help with that but I got no reply from sales. We are probably too small fry. Now that I'm typing this, I realize I guess we could just buy ~15-20 users despite needing only 10.
You only pay for unique humans, not tagged role account devices. I wonder if your email got eaten as spam or something. Email me (username at tailscale) and copy sales@ and I'll make sure somebody replies. But I don't think you need a custom plan.
> I think I've resolved myself to setting up Nebula for the server overlay network, and using Tailscale for physical users, with a traditional firewall bridging them.
Hey, if you've got something that works, stick with it. :)
Because if you are an IoT service with one human and 100,000 devices, the amount of support you may need is more dependent on the 100,000 than on the 1. Very large numbers of devices per human need somewhat different pricing.
Very promising the start of the pam. Good news about the SSH client, I figured that was the case but wanted to make sure. That would be a huge benefit for my developers who are all on Windows.
Thanks for the info about pricing, I set up the 1 user free account and started that to get some hands on experience, and I'll copy you on pricing if I can't get it figured out. Thanks!
I tried this earlier and was unsuccessful sshing from my iPad, using the Termius and Blink apps. Not sure if there are specific client requirements on the iPad?
I had a notification asking me to verify, but because of Focus, that notification didn't show up anywhere that I could see...... So in theory, this should work, will try again.
Regarding pricing, in my experience the Tailscale crew have been very forgiving when it comes to user/device limits. I'm sure if you have 10 users and 100,000 devices you would get some attention, but keep it reasonable and you should be OK.
This looks great, and I'd love to replace AWS SSM (at least for the purposes of instance access) with this! One question I have is around device limits.
With SSM, I can easily run an agent on every instance. Tailscale has pretty tight device limits on the Team and Business plans. I have no idea what the custom pricing looks like, but I'm guessing it would exceed my budget. What's the intended way to use this with a large number of servers? A small team can easily have more devices than 5x or 10x the number of users. Should we just set up some "gateway"/"bastion" instances to access via Tailscale SSH and then use regular ssh from there? Some sort of more limited device mode that doesn't count against the device limit (for ssh only, perhaps?) would be great.
You could do a Tailscale SSH bastion thing, yeah. But before you build a funky setup to avoid pricing concerns, at least reach out to the sales folk to see what it is. We're usually pretty flexible on exact quotas and realize that different orgs have different user/device shapes.
This is neat. I've used Cloudflare's Zero-Trust SSH, but I've been frustrated that it interacts poorly with sftp and scp because of the client-side changes that they make to ~/.ssh/config
Tailscale employee here. Tailscale SSH works at the target side by listening on the SSH port on that machine. Client changes aren't needed for this to work, all that is required is to use your SSH client as normal. This should allow you to use sftp and scp without issues.
I have Tailscale on all my Macs. I use MacOS default SSH between my machines, but only via the Tailscale interface.
Nevertheless, I had to open SSH on each machine, and it's a nightmare to close up the firewall so only Tailscale gets through. You'd think this was the whole point of Tailscale; there should be a one click lock to restrict to Tailscale. But the Tailscale documentation is wanting. I actually paid for a candidate for the best firewall front end, it came with "Let us know if we can help!" and radio silence once I explained the problem. Likely, restricting to Tailscale requires a granularity one can only hand-code.
I can write a firewall, I've written plenty in the past, I just couldn't find the several hours to do this as a one-off for me when it should be easy, but I was missing needed information.
Tailscale is justly proud of how it connects machines through uncooperative routers and such. Tailscale SSH should do the same. The idiot's guide to securing a machine so only Tailscale SSH gets through should be to find SSH in the preferences and turn the fucker off.
You don't really need a firewall to do that. Putting
ListenAddress 100.x.x.x
where 100.x.x.x is the address on the tailnet, into your sshd_config would do what you want. Unfortunately you can't specify an interface, but if you have any sort of automation in place this is easy enough to template in.
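If you do want to template it in, a small Python helper along these lines might do. It is a sketch under assumptions: `sshd_listen_line` is a made-up name, the caller supplies the machine's tailnet IPv4 address (for example as printed by `tailscale ip -4`), and the helper only sanity-checks that the address falls in 100.64.0.0/10, the CGNAT range that tailnet IPv4 addresses come from.

```python
import ipaddress

# Tailnet IPv4 addresses are allocated from the CGNAT range.
TAILNET_V4 = ipaddress.ip_network("100.64.0.0/10")


def sshd_listen_line(tailnet_ip: str) -> str:
    """Render the sshd_config line that binds sshd to the tailnet address only.

    Raises ValueError if the input is not an IP address or is outside
    100.64.0.0/10, to avoid accidentally binding to a LAN or public address.
    """
    addr = ipaddress.ip_address(tailnet_ip)
    if addr not in TAILNET_V4:
        raise ValueError(f"{tailnet_ip} is not in {TAILNET_V4}")
    return f"ListenAddress {addr}\n"


print(sshd_listen_line("100.101.102.103"), end="")  # ListenAddress 100.101.102.103
```

One caveat with this approach: sshd has to start after the tailnet address exists on the machine, otherwise the bind fails, so service ordering is something to check in your own setup.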
It does. You can "turn the fucker off" (as you say) at the OS level and Tailscale SSH will still work. We don't send the Tailscale SSH packets through the OS for it to block them.
Well, Tailscale SSH server support for macOS is still not entirely done. You can build it from source if you're brave (and set an env var to turn it on), but it's not in the product yet by default.
Just want to say thanks: This is insanely cool/easy. Combined with the VSCode Remote SSH extension and MagicDNS, it's now insanely easy to work on a project in a remote environment. I was recently reading through a relatively long post on setting up SSH through Tailscale to access a WSL2 environment, and now it's literally as easy as popping open VSCode in any environment that I have Tailscale installed on and accessing `user@my-magic-dns-machine`. Great work!
Could you share some details about the embedded SSH server? I'm curious if this would work to add SSH capabilities to devices that run Tailscale but don't include a built-in SSH server. Previously I've used dropbear, so it'd be really nice to be able to drop that requirement!
If you're already running recent-ish Tailscale on them, they're already running an SSH server that's just disabled. Run "tailscale up --ssh" to turn it on.
I attempted this on a VM inside a Linux host and got a lower privileged user from inside the guest VM to ssh to a root-privileged user outside on the host.
Both were authenticated to Tailscale with the same gmail account, so from an OAuth perspective, this is valid.
From the OS perspective though, the host SSH port is blocked, and a guest should never get full root access to the host or see the host's resources.
I am not sure if I am confused about something, or maybe there are prod use-cases where the same IDP identity should have different roles/privileges depending on the machine, and Tailscale SSH breaks that?
A nice next step would be tailscale managing an ssh key that's allowed to interact with a git(hub) repository.
So that I wouldn't have to create multiple keys or setup the same key on different machines and still be able to interact with a repo from all of them.
It'd be really nice just using git transparently and having tailscale take over the git ssh connection and authenticate using Tailscale access controls.
At least for personal projects or small teams that'd be quite convenient.
Depending on which part of those things you find painful, you might want to look into ssh certificates? They're pretty easy to work with, much easier than most kinds of certificate systems.
Hi Brad: Thanks for helping out with this feature! I've been one of the early users of tailscale. My network is around 50 machines. I've recently started having issues with ssh on some of my machines, especially from mac m1 -> some ubuntu boxes. Could this be related to this new feature? Any suggestions/pointers on how to debug these issues?
Tailscale is a userland process built in a memory-safe language, which leaves you only the SSH protocol cryptography-type vulnerabilities, which are themselves mooted by Tailscale (see downthread for a discussion about why they didn't simply expose rsh instead of ssh).
This is safer than OpenSSH.
(OpenSSH, as a piece of software, is extraordinarily safe, and has one of the best records of any memory-unsafe codebases. But OpenSSH as configurable infrastructure is much less safe; people screw it up all the time.)
Security groups where? On the Tailscale ACL side, you need to allow tcp/22 in.
On your host where you're running Tailscale, usually nothing. You can keep everything locked down for ingress. Outbound UDP only, but usually cloud VMs allow outbound traffic already. (This is covered more in https://tailscale.com/kb/1082/firewall-ports/)
Quick question. Does this do user (de)provisioning like Jumpcloud? I.e. if the target machine doesn't have a /home/someuser but someuser is in my tailnet ACL, will it create the account?