How HTTPS Works (2018)

gmac · on Sept 17, 2023

In a similar (but more binary) vein, there’s this: https://news.ycombinator.com/item?id=35884437

hsdropout · on Sept 17, 2023

I like this one

https://tls12.xargs.org/

&

https://tls13.xargs.org/

aniforprez · on Sept 17, 2023

>The green padlock on the URL bar of your browser tells you that there are no crabs watching over your shoulder

The comic probably needs to be updated. At this point, all browsers have stopped showing the green lock for HTTPS in favour of just a grey nondescript lock and a red strike when it's HTTP.

But this is definitely an excellent resource to make learning about HTTPS a little more accessible in a digestible format for new developers

december456 · on Sept 17, 2023

I wonder what the new green lock will be. WEI?

larsonnn · on Sept 18, 2023

Interesting read. Just to add, even HTTPS isn't completely safe from Man-in-the-Middle attacks, especially in corporate environments. A few points to consider:

- Some apps bypass or poorly implement certificate pinning.

- Companies using self-signed certs can be spoofed.

- There's always the risk of compromised Certificate Authorities.

- Corporate tools that inspect HTTPS traffic, while useful for security, can inadvertently become a vulnerability if misconfigured or hacked.

Always good to remember: HTTPS is a protective layer, but not an unbreakable barrier.

dang · on Sept 17, 2023

How HTTPS works - https://news.ycombinator.com/item?id=17932728 - Sept 2018 (72 comments)

csydas · on Sept 17, 2023

I do enjoy the comic and the breakdown of HTTPS/Certs, but I do take issue with the "HTTPS/Certs offer identification" in any meaningful way.

This has been bugging me for awhile, but I don't think it's correct to associate identification with modern IT security, regardless of what it is. We don't have identity management, we have authorization management, and they're very different.

Identity management would mean that the system knows that the authorization request and secret it just received is coming from someone who is authorized to receive that authorization, which is obviously not the case. Even with MFA solutions (Yubikeys, geo-based fencing, etc), it's not really identity management unless someone actually sees you making the request, and even then, we can't know if the authorization request is done under duress or not or with malicious intent.

I'm not trying to suggest that this _must_ be a part of security, but I would say let's call it what it is; authorization management at scale. There is no identity, even with all the data that the companies vacuum up from users just trying to do their work, it's just deciding whether this user trying to access this resource has the correct secret to get access: or more succinctly, HTTPS and SSL Certs (and naturally any cert) doesn't validate identity, it just determines if the secret passed to the server should get authorized or not.

Similarly, Certs, as useful as they are, are still a mess; it's too easy to break the cert process with other common security practices, cert creation is too costly from the suppliers, unless you already understand how certs work, it's very easy to ask for a cert that doesn't meet your requirements which often means paying for another cert; SSL/HTTPS certainly are quite great and I don't want them to change or leave, but the cert process would be benefitted from a UX improvement. But it is important to understand that certs don't necessarily guarantee identity either, even with the names. All you can tell is that the chain is valid to the CA, so at some point, the cert was given specifically to a company. But breaches are so common, it's a bit too nervous to trust them blindly, and certs compromising stories also happen a bit too much. Unlike with Identity Management, I do think certs can solve this with some minor improvements to the process, and LetsEncrypt was a great step in this direction; just still, I have concerns about the idea that certs prove identity in a meaningful way.

Excuse the rant, but identity management has been on my mind a lot recently, and while the article was good, the 3rd panel mentions identification, which none of the items discussed in the comic truly addresses.

dclowd9901 · on Sept 17, 2023

I feel stupid but I still don’t understand how a third party snooper is unable to use the public key themselves to decrypt the pre-master key.

ferfumarma · on Sept 17, 2023

The public key is like a math answer that is easy to compute if you have the whole equation, but really hard if you only have the answer.

Kind of like "what are the divisors of 93036637?"

You can figure it out, but it requires testing lots of numbers (a brute force apporoach) to figure out that it's only 9391 and 9907.

Say Adam sends the public key of 93036637 to Bob. (Note that the public key was chosen to be the product of two primes).

Bob chooses a new prime number - say 8867 - and multiplies it in there, too, and the sends it back to Adam.

Now we have 93036637 x 8867 => 824955860279

When Adam gets that public key back, he can divide out one of the original two prime numbers (824955860279 ÷ 9907 => 83,269,997) and then send it back to Bob.

Bob KNOWS that his secret key was 8867, so he divides that back out to get the shared secret: 9391 (the other half of the original public key).

Even though Eve listened to every single number being transmitted:

  Adam: "93,036,637"

  Bob: "copy. 824,955,860,279"

  Adam: "copy. 83,269,997"

Only Adam and Bob can know the three prime factors of 8867, 9391, and 9907 without doing brute force division.

Brute force division is easy for prime numbers less than 10k, but a sender can choose a prime number that's long enough to satisfy their privacy needs, based on their estimate of Eve's resources.

So the bottom line is that the public key is easy to compute if you have the inputs, but incredibly time consuming (brute force) if you don't.

ta8645 · on Sept 17, 2023

If Eve is listening in to the data stream you've outlined, it's trivial for her to decode.

   Adam: (A)93,036,637
    Bob: (B)824,955,860,279
       Eve:  (B)824955860279 / (A)93036637 = Bob's secret of (S3)8867
   Adam: (C)83,269,997
       Eve:  (C)83269997 / (S3)8867 = (S1)9391
       Eve:  (A)93036637 / (S1)9391 = (S2)9907

Eve doesn't need anything but A, B, and C to easily calculate the rest.

a1369209993 · on Sept 17, 2023

Yeah, I'm not sure why they used a blatantly broken example, but here:

>>>

The public key is like a math answer that is easy to compute if you have the whole equation, but really hard if you only have the answer.

Kind of like "if 2^x mod 32749 = 29640, what is x?"

You can figure it out, but it requires testing lots of numbers (a brute force apporoach) to figure out that it's 12345.

Say Adam sends the public key of 29640 to Bob. (Note that the public key was chosen to be two to some known power, modulo a shared and public prime number).

Bob chooses a new number - say 22222 - and computes 2^22222 mod 32749, too, and the sends it back to Adam.

Now we have 2^22222 mod 32749 => 12883.

When Adam gets that public key back, he can raise it to his own private key, 12345, and get (2^b)^a = 2^(b*a) (mod 32749) = 31458.

Meanwhile, Bob can do the computation (2^a)^b = 2^(a*b = b*a) (mod 32749) = 31458.

Even though Eve listened to both numbers being transmitted:

  Adam: "29640"

  Bob: "copy. 12883"

Only Adam and Bob can know the result 31458 without doing brute force exponentiation.

Brute force exponentiation is easy for prime numbers less than 33k, but a sender can choose a prime number that's long enough to satisfy their privacy needs, based on their estimate of Eve's resources.

So the bottom line is that the secret is easy to compute if you have one of the private keys, but incredibly time consuming (brute force) if you don't.

<<<

There are still some problems with this (really, Finite-Field Diffie-Hellman just generally kind of sucks, even without quantum attacks), but it's basically the right idea.

dclowd9901 · on Sept 19, 2023

Having now understood how this all works from the Khan Academy explanation, the real magic of all of this are these two things:

1) math that’s hard to do backwards

2) the transitive property of mathematics.

Given #1 is basically arbitrary, I think it can be hand-waved in service of explaining the process, which only works because of that transitive property, which I think the person you replied to did a decent job of relating.

In these explanations, people are getting hung up on the literal nuts and bolts steps, but simply understanding that we can send parts of a complete equation to each other and let the rules of math sort it out really clarifies the point.

mnahkies · on Sept 17, 2023

This is a good explanation but thought it might additionally be useful to link to https://en.m.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_e... as an example of creating a secure channel from an insecure one

liquidpele · on Sept 17, 2023

This is, BY FAR, the best explanation I’ve found. Goes over some cool history too.

https://www.khanacademy.org/computing/computer-science/crypt...

dclowd9901 · on Sept 19, 2023

Wow! This actually completely nailed it home for me. Thanks so much for sharing.

kazinator · on Sept 17, 2023

Public key cryptography breaks a key into two parts: the public key and the private key. A message is encrypted with the public key, but can only be decrypted with the private key, not with the public key.

You publish the public key to the world: "hey everyone, here is my public key; write me!". People use the public key to write you secret messages, which only your private key can decode, and that key is your guarded secret.

That message someone sends to you can be a symmetric key, which you then use after that to communicate with that someone (for a while, until you again agree on a different key).

dreamcompiler · on Sept 17, 2023

One FAQ is "Why do we even need symmetric cryptography? Why don't we just do everything with public key cryptography now?"

It's because public key cryptography is extremely CPU resource-intensive, while symmetric cryptography is not. So we typically use public key crypto only to exchange symmetric keys, and that way we get the best of both worlds without chewing up too much CPU time.

d-z-m · on Sept 17, 2023

The pre-master key is encrypted to the server public key, and can only be decrypted with the server private key.

nirav72 · on Sept 17, 2023

These are great. Loved reading their similarly created DNS explanation couple of years ago.

pizzafeelsright · on Sept 17, 2023

Does anyone else find the comic-style very distracting?

justsomehnguy · on Sept 17, 2023

The problem is not in the comic style. The problem is what this particular story is made to look like a comic for a 5 y/o, placed on LTR blocks like a proper text, with giant pictures yet with a small text and what nor the author nor the artist don't know shit about comics and web design or care about them.

But hey! It has a fancy FQDN and TLD! Multiple translations! Emoticons everywhere! Grab your pumpkin smoothie latte and get ready for a fun! story!

It's just an ad for the company behind the site. Somebody had spare money to throw in the PR dept.

philistine · on Sept 17, 2023

I take umbrage at using national flags to identify language. It’s always difficult to figure out intent and then fish for the tricolor flag that is not the other tricolor flag.

Language names are clear and precise, and much better. If you are pressed for space, language codes are fine too. Language speakers are used to fishing for their codes.

arrowsmith · on Sept 17, 2023

There's a whole website dedicated to this complaint:

https://www.flagsarenotlanguages.com/

bullen · on Sept 17, 2023

1. privacy: this does not solve the server problem. You have no privacy because the server has your data unencrypted. to protect passwords use this: https://datatracker.ietf.org/doc/html/rfc2289

2. integrity: if you need integrity, like for a bank transaction HTTPS is great, but you don't need HTTPS for all sites. for simple tampering control hashing is way more efficient.

3. identity: "This SSL certificate is valid and has been issued by a legitimate Certificate Authority." Who is the authority. If they have the power to pre-install root certificates on all hardware on the planet does that mean they are to be trusted?

None of these arguments make any sense if you know how cryptography works, HTTPS is a waste of time, electricity, "money" and freedom for 99% of uses.

Everyday I go to URLs with expired certificates. Just use HTTP!

From the folks at "DNS"imple, great another centralized protocol.

lolinder · on Sept 17, 2023

> privacy: this does not solve the server problem. You have no privacy because the server has your data unencrypted.

You seem to be saying that because servers can still see data that I send to them, we should just let every ISP along the route snoop on it too. If I'm sending data to a server, it's because I have some reasonable level of trust in that server. I don't want to have to think about whether I trust every single network my packet will pass through on the way, and HTTPS frees me from having to think about that aspect of privacy.

> integrity: if you need integrity, like for a bank transaction HTTPS is great, but you don't need HTTPS for all sites. for simple tampering control hashing is way more efficient.

Given what I said above—I want HTTPS all the time in order to keep middlemen from inspecting every request—it doesn't matter if hashing is more efficient, because I'll already be using encryption. Integrity is a nice side effect of privacy.

> identity: "This SSL certificate is valid and has been issued by a legitimate Certificate Authority." Who is the authority. If they have the power to pre-install root certificates on all hardware on the planet does that mean they are to be trusted?

The CA system is a legitimate weakness in HTTPS that needs to be fixed, but rejecting HTTPS wholesale as "a waste of time, electricity..." on that weakness alone is throwing the baby out with bath water. Let's fix the CA system, not throw out encryption.