"We have data on the performance of >50k engineers from 100s of companies. ~9.5% of software engineers do virtually nothing: Ghost Engineers.”

Last week, a tweet by Stanford researcher Yegor Denisov-Blanch went viral within Silicon Valley. “We have data on the performance of >50k engineers from 100s of companies,” he tweeted. “~9.5% of software engineers do virtually nothing: Ghost Engineers.”

Denisov-Blanch said that tech companies have given his research team access to their internal code repositories (their private GitHub repositories, for example) and, for the last two years, he and his team have been running an algorithm against individual employees’ code. He said that this automated code review shows that nearly 10 percent of employees at the companies analyzed do essentially nothing, and are handsomely compensated for it. A paper about his team’s review algorithm gives few details about how it works, but says that it attempts to answer the same questions a human reviewer might have about any specific segment of code, such as:

  • “How difficult is the problem that this commit solves?
  • How many hours would it take you to just write the code in this commit assuming you could fully focus on this task?
  • How well structured is this source code relative to the previous commits? Quartile within this list
  • How maintainable is this commit?”

Ghost Engineers, as determined by his algorithm, perform at less than 10 percent of the median software engineer (as in, they are measured as being 10 times worse/less productive than the median worker).
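
As a rough illustration of that threshold only (not the researchers’ actual model, whose details have not been published), the cutoff can be sketched in Python. The function name, the notion of a single per-engineer score, and the numbers below are all assumptions made up for this example.

```python
from statistics import median


def flag_ghost_engineers(scores, threshold=0.10):
    """Return the names whose score falls below `threshold` times the median score.

    `scores` maps an engineer's name to a hypothetical output score; how such a
    score would be computed is exactly the part the paper does not detail.
    """
    cutoff = threshold * median(scores.values())
    return sorted(name for name, score in scores.items() if score < cutoff)


# Made-up scores for illustration; these are not data from the study.
scores = {"ana": 120.0, "ben": 95.0, "cy": 100.0, "dee": 4.0, "eli": 110.0}
print(flag_ghost_engineers(scores))
```

Note that a cutoff defined relative to the median does not by itself determine how many people fall under it; that share depends entirely on how the scores are distributed.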

Denisov-Blanch wrote that tens of thousands of software engineers could be laid off and that companies could save billions of dollars by doing so. “It is insane that ~9.5 percent of software engineers do almost nothing while collecting paychecks,” Denisov-Blanch tweeted. “This unfairly burdens teams, wastes company resources, blocks jobs for others, and limits humanity’s progress. It has to stop.”

The Stanford research has not yet been published in any form outside of a few graphs Denisov-Blanch shared on Twitter. It has not been peer reviewed. But the fact that this sort of analysis is being done at all shows how focused tech companies have become on the idea of “overemployment,” where people work multiple full-time jobs without the knowledge of their employers, and on getting workers to return to the office. Alongside Denisov-Blanch’s project, there has been an incredible amount of investment in worker surveillance tools. (Whether a ~9.5 percent rate of ineffective workers is high is hard to say; it’s unclear what percentage of workers overall are ineffective, or what other industries’ numbers look like.)

Over the weekend, a post on the r/sysadmin subreddit went viral both there and on the r/overemployed subreddit. In that post, a worker said they had just sat through a sales pitch from an unnamed workplace surveillance AI company that purports to give employees “red flags” if their desktop sits idle for “more than 30-60 seconds,” which means “no ‘meaningful’ mouse and keyboard movement,” attempts to create a “productivity graph” based on computer behavior, and pits workers against each other based on the time it takes to complete specific tasks.

What is becoming clear is that companies are becoming obsessed with catching employees who are underperforming or who are functionally doing nothing at all, and, in a job market that has become much tougher for software engineers, are feeling emboldened to deploy new surveillance tactics.

“In the past, engineers wielded a lot of power at companies. If you lost your engineers or their trust or demotivated the team—companies were scared shitless by this possibility,” Denisov-Blanch told 404 Media in a phone interview. “Companies looked at having 10-15 percent of engineers being unproductive as the cost of doing business.”

Denisov-Blanch and his colleagues published a paper in September outlining an “algorithmic model” for doing code reviews that essentially assesses software engineers’ productivity. The paper claims that their algorithmic code assessment model “can estimate coding and implementation time with a high degree of accuracy,” essentially suggesting that it can judge worker performance as well as a human code reviewer can, but much more quickly and cheaply.

I asked Denisov-Blanch if he thought his algorithm was scooping up people whose work contributions can’t be judged by code commits and code analysis alone. He said that he believes the algorithm has controlled for that, and that companies have told him which specific workers should be excluded from analysis because their job responsibilities extend beyond just pushing code.

“Companies are very interested when we find these people [the ghost engineers] and we run it by them and say ‘it looks like this person is not doing a lot, how does that fit in with their job responsibilities?’” Denisov-Blanch said. “They have to launch a low-key investigation and sometimes they tell us ‘they’re fine,’ and we can exclude them. Other times, they’re very surprised.”

He said that the algorithm they have developed attempts to analyze code quality in addition to simply analyzing the number of commits (or code pushes) an engineer has made, because number of commits is already a well-known performance metric that can easily be gamed by pushing meaningless updates or pushing then reverting updates over and over. “Some people write empty lines of code and do commits that are meaningless,” he said. “You would think this would be caught during the annual review process, but apparently it isn’t. We started this research because there was no good way to use data in a scalable way that’s transparent and objective around your software engineering team.”

Much has been written about the rise of “overemployment” during the pandemic, where workers take on multiple full-time remote jobs and manage to juggle them. Some people have realized that they can do a passable enough job at work in just a few hours a day or less.

“I have friends who do this. There’s a lot of anecdotal evidence of people doing this for years and getting away with it. Working two, three, four hours a day and now there’s return-to-office mandates and they have to have their butt in a seat in an office for eight hours a day or so,” he said. “That may be where a lot of the friction with the return-to-office movement comes from, this notion that ‘I can’t work two jobs.’ I have friends, I call them at 11 am on a Wednesday and they’re sleeping, literally. I’m like, ‘Whoa, don’t you work in big tech?’ But nobody checks, and they’ve been doing that for years.”

Denisov-Blanch said that, with massive tech layoffs over the last few years and a more difficult job market, it is no longer the case that software engineers can quit or get laid off and get a new job making the same or more money almost immediately. Meta and X have famously carried out huge rounds of layoffs, and Elon Musk famously claimed that X didn’t need those employees to keep the company running. When I asked Denisov-Blanch if his algorithm was being used by any companies in Silicon Valley to help inform layoffs, he said: “I can’t specifically comment on whether we were or were not involved in layoffs [at any company] because we’re under strict privacy agreements.”

The company signup page for the research project, however, tells companies that the “benefits of participation” in the project are “Use the results to support decision-making in your organization. Potentially reduce costs. Gain granular visibility into the output of your engineering processes.”

Denisov-Blanch said that he believes “very tactile workplace surveillance, things like looking at keystrokes—people are going to game them, and it creates a low trust environment and a toxic culture.” He said with his research he is “trying to not do surveillance,” but said that he imagines a future where engineers are judged more like salespeople, who earn commission or get laid off based on performance.

“Software engineering could be more like this, as long as the thing you’re building is not just counting lines or keystrokes,” he said. “With LLMs and AI, you can make it more meritocratic.”

Denisov-Blanch said he could not name any companies that are part of the study but said that since he posted his thread, “it has really resonated with people,” and that many more companies have reached out to him to sign up within the last few days.

  • irotsoma@lemmy.world · 26 upvotes · 6 days ago

    I think most people misunderstand what software engineers do. Writing code is only a small portion of the work for most. Analyzing defects and performance issues, supporting production support that ends up with unqualified people due to the way support is handled these days, writing documentation or supporting those who do, design work, QE/QA/QC support, code reviews, product meetings, and tons of other stuff. That’s why “AI” is not having any luck with just replacing even junior engineers, besides the fact that it just doesn’t work.

  • mctoasterson@reddthat.com · 37 upvotes · 6 days ago

    This is bullshit. There are many people hired with the job title “Software Engineer” who don’t sit and generate code, and for a number of reasons.

    You could be on a hybrid team that does projects and support, so you spend 80% of your time attending meetings, working tickets, working with users, and shuffling paper in whatever asinine change management process your company happens to use.

    I have worked places where “engineers” ended up having to spend most of their time dicking around in ServiceNow/Remedy/etc. instead of doing their actual jobs. That’s shitty business process design and shitty management, and not a reflection of the employee doing nothing.

    • aesthelete@lemmy.world · 10 upvotes · 6 days ago

      I spend most of my time in other time wasters like jira and fucking aha as well.

      If I actually do anything, it only generates more work for me because I have to explain myself to fifteen different parties before making very minor, very necessary changes.

      My company can’t be the only one like this.

  • Mustard@lemmy.blahaj.zone · 24 upvotes · 6 days ago

    Those numbers seem really sus. “We came up with some kind of bullshit metric magic algorithm and it turns out that if you look at people who score 10% of the average, that’s about 10% of the people!!!”

    Uh… yeah buddy sure is.

    • Valmond@lemmy.world · 10 upvotes · 6 days ago

      Exactly.

      “Why didn’t you make more commits last sprint?” is such an idiotic take, and one a friend of mine got a month or so ago.

      He worked with another programmer and they committed with his PC so… Didn’t help, was still bashed for it because of “the numbers”.

      Sometimes they just need excuses to fire people too I guess.

    • futatorius@lemm.ee · 4 upvotes · 5 days ago

      And if the goal is to lay off 10% of your workforce, now you have a third-party metric to justify it.

  • mochisuki@lemmy.world · 36 upvotes · 6 days ago

    The old adage of the engineer paid to know where to tap an X comes to mind: https://quoteinvestigator.com/2017/03/06/tap/?amp=1

    Frankly anyone telling you they can measure the value of a line of code without any background knowledge is selling BS.

    But I welcome this new BS system as the previous system of managers not so secretly counting total commits and lines added was comically stupid.

    • futatorius@lemm.ee · 1 upvote · 5 days ago

      “Frankly anyone telling you they can measure the value of a line of code without any background knowledge is selling BS.”

      “the previous system of managers not so secretly counting total commits and lines added was comically stupid”

      That has been known not to work since the 1970s. There’s probably something in The Mythical Man-Month ridiculing lines of code as a performance metric.

      Some of the most productive work I ever did involved ripping out 80k lines of executable code and replacing it with 1500.

      “But I welcome this new BS system”

      I don’t. Fuck snitchware in all its forms.

  • 2ugly2live@lemmy.world · 37 upvotes · 6 days ago

    What a fucking snitch. 9.5% of engineers gotta go, but the CEO getting paid buckets and buckets of money isn’t draining the company? Fire 9.5% of engineers that actually have knowledge and are skilled enough to demand a high price for their skills, or CEO fuck-all who comes in via zoom once a quarter and couldn’t open a pdf if their life depended on it. Hmm, what a hard choice 🤔

  • BassTurd@lemmy.world · 62 upvotes · 7 days ago

    It’s a long article that I admittedly didn’t read all of. I got to the part where it said the details of his algorithm are basically unknown, which means his data means nothing. If someone can’t provide the proof to their claims, they have no merit.

    An LLM that’s built entirely on code repo data, and is somehow claiming workers “do virtually nothing” without any sort of outside data, is insane.

    • JollyG@lemmy.world · 16 upvotes · 7 days ago

      One of my big beefs with ML/AI is that these tools can be used to wrap bad ideas in what I will call “Machine legitimacy”. Which is another way of saying that there are many cases where these models are built up around a bunch of unrealistic assumptions, or trained on data that is not actually generalizable to the applied situation but will still spit out a value. That value becomes the truth because it came from some automated process. People can’t critically interrogate it because the bad assumptions are hidden behind automation.

      • aesthelete@lemmy.world · 1 upvote · 6 days ago (edited)

        Yeah it’s similar to a computer spitting out 42 as the answer to life, the universe, and everything.

    • stoly@lemmy.world · 48 upvotes · 7 days ago

      Alternatively they are on an engineering team and providing their expertise via other means beyond code submission. This entire thing sounds like a sledgehammer trying to do the work of a scalpel.

        • wrekone@lemmy.dbzer0.com · 6 upvotes · 6 days ago

          One of the best engineers I’ve worked with produced very little code at that point in his career. His primary responsibility was to do the research and planning that empowered the rest of the team to move quickly. Without a doubt, that team was far more productive due to his efforts. When needed, he could quickly whip out some top notch code, and he was heavily involved in the code review process. Writing code just wasn’t how he could deliver the most value.

        • MirthfulAlembic@lemmy.world · 3 upvotes · 6 days ago

          Developing standards, best practices, conventions, etc. One of the most valuable people on my team wrote some incredible quality automations a few years ago, and the only coding he does at this point is updates to them when necessary. By volume, he’s easily bottom 5% this year, but we’d be much worse off without his expertise/advice and the fact he advocates for the team.

          This is classic shit management metrics. It would take some time for the rot to set in after using a cudgel approach to a team, and by the time it did, the assholes responsible would have fucked off elsewhere with their huge bonuses.

          • futatorius@lemm.ee · 2 upvotes · 5 days ago

            Yeah, one of my projects right now has been delivering huge value with very few staff-hours being expended in coding. That’s because I (senior architect) and a couple software engineers researched the shit out of it before we started, and found a way to adapt free, existing, running code with minimal effort. I’ve seen two previous attempts to do this job fail expensively and catastrophically. So far, we’ve spent 15% of what either predecessor project cost, and we’ve already got operational code deployed and a solid proof of concept for the rest. That’s because of months of hard thinking and experimentation by my engineers and me. And yeah, that’s right, it meant doing some Big Design Up Front, and fuck you to every agile fanboi who thinks you can accomplish a highly complex integration project without doing that. We’ve already had a couple of those knobheads lose their jobs for failing at previous attempts, then opposing my approach. I’m hiring more real engineers with the freed-up headcount.

            Some of this work is irreducibly hard and anyone who thinks they can factorize it into a bunch of parallelized trivial processing doesn’t know the problem space. Snitchware and truncheonware are not going to change that.

    • a4ng3l@lemmy.world · 19 upvotes · 6 days ago

      Or architects, infrastructure engineers… plenty of peripheral functions are hired as « IT engineers » and are not pushing code to a repo. What a weird article.

  • arbitrary_sarcasm@lemmy.world · 8 upvotes · 5 days ago

    “It has not been peer reviewed.”

    I could make a paper in 5 minutes about how AI can be used to uniquely identify people by smelling their farts. Doesn’t mean anything unless it’s been peer reviewed.

    Until this paper has been peer reviewed, I give it as much credit as I give a flat earth conspiracy person.

  • socsa@piefed.social · 52 upvotes · 7 days ago (edited)

    I don’t doubt the thesis, but reviewing commit history is next to useless. I’m probably not top 50% of activity within our organization but I’ve literally invented most of our tech and my name is on the patents.

    If anything, it’s the people who spend all day making pedantic code review comments just to boost git actions who have nothing better to do.

    • ribboo@lemm.ee · 4 upvotes · 6 days ago

      When I was a junior about 95% of my days were spent writing code. Nowadays? 30-40% maybe. The rest is meetings, code review, and helping colleagues who call me, among other things.

      Good luck finding that, Mr Algorithm. Commit history is basically useless due to another factor as well. For bugs, finding the actual problem and the reason for it is often far more time-consuming than the fix itself.

    • mctoasterson@reddthat.com · 5 upvotes · 6 days ago

      Yeah I was just about to say one obvious flaw in his methodology is that people could show up as “high productivity” by adding thousands of lines of worthless comments.

  • Viri4thus@feddit.org · 30 upvotes · 7 days ago (edited)

    How to scare managers into hiring me with my PoS black box software.

    Step 1: make wild claims about wasted resources.

    <- you are here

  • futatorius@lemm.ee · 7 upvotes · 5 days ago

    “many more companies have reached out to him to sign up within the last few days”

    They’re looking for a semi-randomized layoff algorithm.

    If this guy’s anything but a con artist, he’d show the data that correlates his algorithm with observed performance ratings. And he’d also validate that applying his algorithm is consistent with labor law.

  • yildolw@lemmy.world · 34 upvotes · 7 days ago

    The thing about being a big organization is that you need to have slack capacity most of the time in order to be able to go quickly in a different direction at certain times. If you don’t have excess capacity sitting idle, an unforeseen event can paralyze you

    • ddh@lemmy.sdf.org · 5 upvotes · 7 days ago

      And slack capacity can be used effectively e.g., spend some time on process improvement. There’s always some saw to sharpen or some technical debt to repay.

  • distortwave@lemmy.ml · 6 upvotes · 5 days ago

    I’m not even going to bother to take this seriously at all.

    There’s something to be said about unfulfilling and ‘bullshit jobs’. Aside from the potentially dubious methodology here, consider the implications of this ‘finding’.

    How about look at the rentier and profit sapping features of these massive tech companies.