The internal tools at Meta are incredible tbh. There’s an ecosystem of well-desi...

Random_BSD_Geek · on Sept 18, 2024

Polar opposite of my experience. To achieve the technical equivalent of changing a lightbulb, spend the entire day wrangling a dozen tools which are broken in different ways, maintained by teams that no longer exist or have completely rolled over, only to arrive at the finish line and discover we don't use those lightbulbs anymore. Move things and break fast.

loeg · on Sept 18, 2024

IMO there's a mix of a few really good, widely used, well-supported tools as well as a long tail of random tiny tools where the original team is gone that are cruftier.

extr · on Sept 18, 2024

Yeah 100%. I found it immensely frustrating to be using tools with no community (except internally), so-so documentation, and features that were clearly broken in a way that would be unacceptable for a regular consumer product. If you have a question or error not covered by an internal search or documentation, good luck, you'll need it. Literally part of the reason I left the company.

landedgentry · on Sept 18, 2024

Well, you're supposed to read the code and figure it out. And if you can't, you're not good enough an engineer. According to people at Meta.

extr · on Sept 18, 2024

People probably think you’re exaggerating but it’s true. Sometimes when I would get blocked the suggestion was to “read the source code” or “submit a fix” on some far flung internal project. Huge fucking waste of time and effort, completely unserious.

tru3_power · on Sept 19, 2024

No matter what, tools will be broken. Having access to the source and being able to land a diff to fix the issue is awesome imo.

extr · on Sept 19, 2024

That’s how open source already works by default. The difference is if an OSS tool is broken my boss doesn’t imply landing a fix is my responsibility on top of my regular job duties.

majormajor · on Sept 19, 2024

Working around it somehow is. A huge part of my work has been plumbing and hacking around limitations in mediocre-at-best OSS tools.

Lots of nonserious companies that take those issues as enough of a reason to move slowly.

Many fewer serious ones where bad tooling is expected to be fixed, smoothed over, or replaced entirely in the interest of future dev time.

KaiserPro · on Sept 19, 2024

> Having access to the source

Yes, thats great.

> being able to land a diff to fix the issue is awesome imo.

yes, if its a one off. but for my last project that would involve spinning up many "XFNs" (multi-team chat fests) to argue that actually they don't want to have that change because of reason x,y and z.

At which point you just give up and make a stupid fucking hack.

So much is not about engineering excellence, its about trying to get people to accept change.

hnav · on Sept 18, 2024

Doesn't sound like your type of company tbh, the flipside is that a "serious" company will often have broken bs too except now nobody is going to look at your contribution/fix.

KaiserPro · on Sept 19, 2024

Pfft.

"your type of company" sod off. Meta is only like this because its got a massive advertising revenue stream.

the sheer amount of engineering time wasted because we don't document stuff is astounding.

For example, how many message queue systems do we have?

how many half arsed message queues have been created because they didn't know about FOQS?

lclarkmichalek · on Sept 20, 2024

I think a fair few of them were created because they knew a bit too much about FOQS

extr · on Sept 19, 2024

Yes lmao, the number of times I would start off on some nominally useful task only to find out 3 weeks later that there is actually already a solution to that created by team XYZ that nobody in my reporting chain has ever heard of…(3 weeks was optimistic case, I remember my team member getting like 2 months in to some new data pipeline before finding out some tables already existed that did what he needed…)

KaiserPro · on Sept 19, 2024

Welcome to meta! where everything is a murder mystery.

Except you're not really sure if there has been a murder, or sometimes you wonder if you're the murderer, because at every turn you're told that you've been a bad dev for trying x,y and z

moandcompany · on Sept 18, 2024

Same as Google. Many internal tools have painful interfaces and poor documentation because the hiring bar was high and it was acceptable to assume that the user's skill level is high enough to figure it out. That attitude becomes a bigger problem when trying to sell tools to the public (e.g. Google Cloud Platform).

yodsanklai · on Sept 18, 2024

As an outsider, I was always under the impression that Google had a tradition of engineering excellence (robust tools, clean and well-tested code following strict guidelines), while Meta has more of a Hacker culture (move fast and break things).

moandcompany · on Sept 18, 2024

Google also has traditions that created Broccoli Man: https://www.youtube.com/watch?v=3t6L-FlfeaI

fsociety · on Sept 19, 2024

Or you know, go chat with the tool maintainers because they want people using them for impact.

zer0zzz · on Sept 18, 2024

Agreed. I often get my work done using open source build instructions and tools and then when everything works I port it to internal infra. Other people are the opposite though, which for open source based code bases has a nasty side effect of the work having no upstreamable tests!

uuddlrlrbaba · on Sept 18, 2024

Mmm breakfast

grantsucceeded · on Sept 18, 2024

haha the reason I stayed as long as i did

aprilthird2021 · on Sept 19, 2024

But you're both talking about different things. The tools are both often left in disuse, lacking documentation, etc. But they also have a really tight integration with each other that allows for unparalleled visibility and ability over enormous systems with many moving parts.

bozhark · on Sept 18, 2024

Move Smooth and Fix Things (tm) is our nonprofit corporation’s version of this atrocious motto.

on Sept 19, 2024

[dead]

ec109685 · on Sept 19, 2024

Large checkouts is a solved problem now https://github.com/facebook/sapling/blob/main/eden/fs/docs/O...

moandcompany · on Sept 18, 2024

My opinion: Many Meta tools and processes seem like they were created by former Googlers that sought to recreate something they previously had at Google, during the Google->FB Exodus, but also changed aspects of the tool that were annoying or diverged from their needs. This is not a bad thing.

Since Bento doesn't appear to be usable by the public, a parallel version of this that people can use to get a feel for cross-tool integration would be Google's Colaboratory / Colab notebooks (https://colab.research.google.com/) that have many baked-in integrations driven by actual internal use (i.e. dogfooding).

kridsdale3 · on Sept 18, 2024

As someone from both, I confirm/support your opinion 100%.

mark_l_watson · on Sept 19, 2024

I agree, the paid for Pro version of Colab just seems to have the features I need. I often use it because it simply saves me time and hassles.

KaiserPro · on Sept 19, 2024

You and I must be working in different areas.

For any kind of general Python/C++ work, its a _massive_ pain.

The integrated debugger rarely works, and its a 30 minute recompile to figure that out. The documentation for actually being efficient in build/run/test is basically "ask the old guy in the corner". You'd best hope they know and are willing to share.

The code search is great! The downside is that nobody bothers to document stuff, so thats all you've got. (comments/docstrings are for weaklings apparently)

You want to use a common third party library? You'd best hope its already ingested, otherwise you're going to be spending the next few days trying to get that into the codebase. (yes there are auto tools, no they don't always work.) Also, you're now on the hook to do security upgrades.

JohnMakin · on Sept 18, 2024

One of the crazier things a L4 meta colleague of mine told me, that I still don’t believe entirely, is that meta pretty much has their own fork of everything, even tools like git. is this true?

tqi · on Sept 18, 2024

Facebook actually doesn't use git, they use mercurial (https://graphite.dev/blog/why-facebook-doesnt-use-git).

That decision is also illustrative of why they end up forking most things - Facebook's usage patterns are at the far extreme end for almost any tool, and things that are non-issues with fewer engineers or a smaller codebase become complete blockers.

kridsdale3 · on Sept 18, 2024

Yes when I used to talk about this to interviewees, I described that every tool people commonly use is somewhere on the Big-O curves for scaling. Most of the time we don't really care if a tool is O(n) or O(10 n) or whatever.

At Meta, N tends to be hundreds of billions to hundreds of trillions.

So your algorithm REALLY matters. And git has a Big-O that is worse than Mercurial, so we had to switch.

steventhedev · on Sept 19, 2024

I'm gonna disagree with you there. The difference was with stat patterns, and the person at Facebook who ran the tests had something wrong with the disk setup that was causing it to run slowly. They ignored multiple responses that reproduced very different results.

Nail in the coffin on this was a benchmark GitHub ran two years ago that got the results that FB should have: git status within seconds.

Facebook didn't use mercurial because of big O, they used it because of hubris and a bad disk config.

sangnoir · on Sept 19, 2024

> Facebook didn't use mercurial because of big O, they used it because of hubris and a bad disk config.

Half-remembering a blog post I read - the git maintainers also wouldn't give Facebook the time of day on code changes to accommodate FB's requirements. Mercurial was more amenable. This also disproves the "Facebook has a fork of everything" claim, because they attempted to upstream the changes they wanted.

deadmutex · on Sept 19, 2024

This sounds plausible, but would love a source

steventhedev · on Sept 19, 2024

I should probably just write it up into a post, but the git mailing list at the time is the source (I remember reading it from the side a few months after convincing our VP R&D to switch from svn to git). We were chuckling around the same time that FB had to reallocate the stack on Galaxy S2 phones because they were somehow unaware of proguard or unable to have it work properly with their codegen.

Anyways:

1. Github benchmark: https://github.blog/engineering/infrastructure/improve-git-m...

2. The original email thread: https://public-inbox.org/git/CB04005C.2C669%25joshua.redston...

3. There's another email thread that gets linked everywhere - but in light of the prior thread, the numbers don't track: https://public-inbox.org/git/CB5074CF.3AD7A%25joshua.redston...

I recall there being a message from someone either at AirBnB or Uber who mentioned that they have a similar monorepo but without the slow git status, but can't seem to find it now - it's likely on one of the other mailing list archives but didn't make it to this one.

Point being that painting this as "the community was hostile" or "git is too slow for FB" is just disingenuous. The FB engineer barely communicated with the git team (at least publicly) and when there was communication, it was pushing a single benchmark that was deeply flawed, and then ignoring feedback on how to both improve the performance of slow blame, commit by repacking checkpoint packfiles (a one-off effort) and also ignoring feedback that the benchmark numbers didn't make sense in absolute terms.

master_crab · on Sept 19, 2024

If git is blocking you, you are using it wrong. Lotta instances of people treating it as an artifact repository. Use it correctly with a branching strategy that works for your use case and it's bulletproof.

Plenty of other customers with the same magnitude problems as Meta are using Git perfectly fine.

quicklime · on Sept 19, 2024

Who are the others with the same magnitude as Google and Meta’s monorepos?

ec109685 · on Sept 19, 2024

Microsoft has all of windows in a single repository.

disgruntledphd2 · on Sept 19, 2024

Particularly in 2014, when the git thing happened.

master_crab · on Sept 19, 2024

No one. Because people should know better than to use monorepos. Most teams at Amazon (which works at the same scale) religiously avoided them.

KaiserPro · on Sept 19, 2024

> Plenty of other customers with the same magnitude problems as Meta are using Git perfectly fine.

I mean there aren't. there are perhaps three places that have the same scale problem.

A monorepo for a place with about 50k developers, that has been operating at that scale for 5 years.

The current checkout if not sparse would be >80gigs

The commit rate is > 20 a second.

no amount of branching strategy is going to help on that.

I love git, I used it professionally since 2010, but git is not a good fit for something _massive_

ec109685 · on Sept 19, 2024

Microsoft has made git work for massive repositories through a lot of blood sweat and tears: https://github.blog/open-source/git/the-story-of-scalar/

LarsDu88 · on Sept 18, 2024

They use sapling. An in-house clone of mercurial that was open sourced 2 years ago

herval · on Sept 19, 2024

FB uses mercurial _for most things_, but like any company that size, there's teams that use git and even teams that use perforce

ipsum2 · on Sept 18, 2024

Yep. Zeus is a fork of Zookeeper, Hack is a fork of PHP, etc. It's usually needed to make it work with the internal environment.

The few things that don't have forks are usually the open source projects like React or PyTorch, but even those have some custom features added to make it work with FB internals.

gcr · on Sept 18, 2024

This is also how things work at Google.

Google also maintains a monorepo with "forks" of all software that they use. History diverges, but is occasionally synchronized for things like security updates etc.

zhengyi13 · on Sept 18, 2024

Am I completely off-base/confused thinking that the GFE originally started life (like back under csilver) as a fork of boa[0]?

[0]: http://www.boa.org/

lacker · on Sept 18, 2024

I thought it was GWS that originally started as a fork of boa.

zhengyi13 · on Sept 19, 2024

That's it, yes, thank you.

grantsucceeded · on Sept 18, 2024

Few companies experienced the explosive growth fb did, though many will claim to have done so. Hack made the existing codebase of php scale to insane levels while reaching escape velocity for the overall company to even attempt to transition away or shrink the php codebase, as i recall (i was an SRE, not a dev)

zeus likewise.

ipsum2 · on Sept 18, 2024

You worked at FB, but you call yourself an SRE, not a PE? ;)

grantsucceeded · on Sept 25, 2024

haha .. but i did work at google before fb.

my memory is hazy, but when i started i was called an sre. Then someone made up the term "app ops". then it was production engineering. then I called myself an SRE when i was interviewing to leave fb.

Oh, and i was called a DBA before being acquired by google.

fragmede · on Sept 18, 2024

You still call it Facebook?

KaiserPro · on Sept 19, 2024

PEs are still quite new remember....

ahupp · on Sept 19, 2024

nit: HHVM was a completely new implementation of a runtime for a PHP-like language, it wasn't a fork of Zend.

jamra · on Sept 18, 2024

Meta doesn't use git. It uses mercurial. It does fork it because they have a huge monorepo. They created a concept of stacked commits which is a way of not having branches. Each commit is in a stack and then merged into master. Lots of things built for scaling.

sdenton4 · on Sept 18, 2024

It wouldn't be terribly surprising. Forking everything provides a liiiitle bit of protection against things like the 'left pad' incident.

3eb7988a1663 · on Sept 19, 2024

Left pad was from the creator pulling the code from the public source forge, not from a destructive code change.

I assume all of the big tech companies host internal mirrors of every single code dependency + tooling. Otherwise they could not guarantee that they can build all of their code.

crabbone · on Sept 18, 2024

A friend of mine is doing his PHD while being an intern at Meta. He does not share your excitement... at all. To summarize his complaints: a framework written a long while ago with design flaws that were cast in stone, that requires exorbitant effort to accomplish simple things (under the pretense of global integration that usually isn't needed, but even if it was needed, would still not work).

almostgotcaught · on Sept 19, 2024

> A friend of mine is doing his PHD while being an intern at Meta

I interned thrice as phd student at FB. your friend isn't entirely wrong but also just doesn't have enough experience to judge. all enormous companies are like this. FB is far and away better than almost all such companies (probably only with the exception of Google/Netflix).

jonathanyc · on Sept 19, 2024

Agreed. I'm reading some complaints in the thread about being told to "just read the source code" for internal tools at Meta. When I worked at Apple we didn't even get the source code!

crabbone · on Sept 19, 2024

I don't see why saying that Facebook's tools are bad should be invalidated by saying that Google's or others' tools are bad too. Google being bad doesn't vindicate or improve Facebook tools. There's no need for perspective: if it doesn't work well for what it's designed to do, then that's all there is to it.

almostgotcaught · on Sept 19, 2024

> Google's or others' tools are bad too

lol bruh read my response again - FB's and Google's and Amazon's tools are lightyears ahead of #ARBITRARY_F100_COMPANY. you haven't a clue what "bad" means if you've never worked in a place that has > 1000 engineers.

sangnoir · on Sept 19, 2024

How long has he been interning? Is it long enough for him to have learned how long the timescale big-tech roadmaps operate on? If he wants a feature, he better write it himself (if his PR doesn't conflict with an upcoming rewrite, coming "soon"), or lobby to get it slotted for the second quarter of 2026.

crabbone · on Sept 19, 2024

He started right about the time COVID started, so... about four years now, I think. I'm not sure if those were contiguous though.

I'm not sure what your idea about PRs and features has to do with the above... he's not there to work on the internal infra framework. He's there for ML stuff. Unfortunately, the road to the latter goes through the former, but he's not really a kind of programmer who'd deal with Facebook's infrastructure and plumbing.

The point is, it's inconvenient. Is it inconvenient because Facebook works on a five-year plan basis or whatever other reason they have for it doesn't really matter. It's just not good.

I also have no problems admitting that all big companies (two in total, one being Google) I worked for so far had bad internal tools. I don't imagine Facebook is anything special in this respect. I just don't feel like it's necessary to justify it in any way. It's just a fact of life: large companies have a tendency to produce bad internal tools (but small often have none whatsoever!) It's a water is wet kind of thing...

sangnoir · on Sept 19, 2024

> I'm not sure what your idea about PRs and features has to do with the above... he's not there to work on the internal infra framework.

My idea is if he's not making the monorepo codebase changes himself, he's going to have to wait for an awfully long time for any non-trivial improvements he'd like because the responsible teams have different priorities sketched out for next calendar year. It's a function of organization size: unless you have the support of someone very high up on the org chart, ICs can't unilaterally adjust another team's priorities.

slt2021 · on Sept 18, 2024

how else can you build empire as Engineering Manager and get promo?

fork open source, then demand resources to maintain this monster.

easiest promotion + job security.

its even called "Platform Engineering" these days

jchonphoenix · on Sept 18, 2024

Meta tools are best in class when the requirement is scale. Or that the external tools haven't matured yet

Qshdg · on Sept 18, 2024

Looking at some of the bureaucracy in their open source projects, I'd say that they need less tooling and more thinking. These tools help to keep spaghetti code bases from imploding totally.

baggiponte · on Sept 18, 2024

Uuuh can you tell a bit more about wasabi, the Python LSP? Saw a post years ago and been eager to see whether it’d be open sourced (or why it wouldn’t).