[-] nokturne213@sopuli.xyz 59 points 4 days ago

I thought it was good to move away from GitHub since it is owned by m$?

[-] popcar2@programming.dev 47 points 4 days ago

It's not a big deal since git repos aren't hard to migrate. GitHub is fine currently and if they push people away then there are a couple of alternatives.

Firefox hosting on GitHub is a good move because it lowers the barrier to entry for contributors.

[-] onlinepersona@programming.dev 34 points 4 days ago

Projects are more than just code. They are all the metadata, ecosystem, and people around it. You can easily move a git repo, but try moving github issues or github PRs, pipelines, community questions, and so on. You'll realise how much of a fallacy "It’s not a big deal since git repos aren’t hard to migrate" is.

Anti Commercial-AI license

[-] hamsterkill@lemmy.sdf.org 15 points 4 days ago

They aren't using GitHub for issues, pull requests, or (that I'm aware of) pipelines.

[-] danielquinn@lemmy.ca 17 points 4 days ago

Lowering the barrier to entry by moving from a technology few use (mercurial) to something popular (git) makes sense. Requiring participation on a proprietary platform owned by Microsoft instead of an open one like Codeberg or GitLab is just lazy. If someone wants to contribute to Firefox, asking them to create an account is a small ask, and I'd argue that if they're unwilling to do even that, then their participation in the community is likely to be far from useful.

[-] FaceDeer@fedia.io 3 points 4 days ago

Indeed. This "GitHub is owned by Microsoft, therefore evil lurks around every corner" thing has been going for many years now with no sign of the promised apocalypse and no real reason to expect said apocalypse. Back in the day I used to do a lot of modding for an open-source game and almost all the mods were hosted on GitHub, but then when Microsoft bought it about half of the modders threw an ideological fit and moved their mods to a wide scattering of other hosts. It made everything so much more of a hassle to fork and submit issues and whatnot, I'm sure it's done more harm to the project than anything Microsoft would ever do.

[-] ekky@sopuli.xyz 7 points 4 days ago

Wasn't it revealed that Microsoft was training their Copilot on GitHub repositories, including private ones belonging to paying corporations that believed their source code to be safe and secure, resulting in secrets suddenly being made semi-public?

I feel that there were other incidents too, though I can't remember them off the top of my head. Definitely not a place I'd recommend anyone to keep anything they love, even if they keep to best practices and don't store secrets in their repositories.

[-] FaceDeer@fedia.io 1 points 4 days ago

It was an open source game with open source mods. It wouldn't have made sense to have private repos.

I did a little Googling and Microsoft denies using private repositories for training. Do you have a source?

[-] ekky@sopuli.xyz 1 points 3 days ago* (last edited 3 days ago)

The claim above was off the top of my head, but I've found multiple pages of results describing the panic that ensued.

Now, Microsoft (Copilot and GitHub) is less than clear on what exactly is used for training, but the general consensus seems to be that they don't train on private repositories. Though there appears to be some confusion about this, especially regarding Microsoft's honesty about not using loopholes (this article might be faked, I haven't tried confirming it, though; this topic is a shit show rife with miscommunication, misinformation, and quite a lot of confusion and fear regardless).

It appears that the specific issue I was referring to required human error for Copilot to be able to train on the private repositories. Namely, some unfortunate fool temporarily making the repository public (in which case it obviously isn't private anymore, and is therefore up for grabs by scrapers). Usually this wouldn't be a problem, since no indexer or scraper can check all of GitHub all at once all the time, so the chance of a briefly exposed repository being cached is rather small, albeit always there.

That said, Copilot, Bing, and GitHub are likely better integrated than Bing simply wasting resources on continuously scraping GitHub for new repositories. I personally imagine that GitHub saving resources by sending a signal to Bing when a repository is made public isn't entirely unlikely (that's something I might do, harboring no ill intentions), meaning that it is possible (though in no way confirmed) that Bing punishes briefly exposed GitHub repositories instantly by forever caching them.

Is this 100% Microsoft being predatory? No, obviously not, since it requires a user error to happen in the first place, and since Copilot is technically only trained on public or exposed data. Though, Microsoft learning about this rather scammy behavior and simply classifying it as "low-impact-severity" and disabling the Bing cache for humans (but apparently not Copilot) doesn't sit right with me. I'm sure that they knew exactly which kind of data they were working with during dataset sanitation, so they could have chosen not to use sensitive data or at least informed exposed clients that they were adding their cached secrets to Copilot.

[-] melroy@kbin.melroy.org 1 points 4 days ago
[-] clove@kbin.melroy.org 4 points 4 days ago

Moving to git is one thing, but doesn't going to GitHub put all their code at risk of Copilot AI mining by Microsoft? (If one considers that a bad thing, which many don't, I guess.)

[-] melroy@kbin.melroy.org 3 points 4 days ago

Your code is AI-mined regardless of where you put it today, I'm afraid to say.

Unless you put your code in a private, self-hosted repository behind a login. However, if your code is public, you can bet it will be used for AI training. Again, regardless of which platform and regardless of which LLM. So all platforms, all of the internet, all LLMs.

[-] clove@kbin.melroy.org 2 points 3 days ago

Thanks, that's what I thought. I've never put anything personal in a public repo in my life for reasons just like this. Bleh.

[-] melroy@kbin.melroy.org 2 points 3 days ago

Also, a private repository on GitHub might not be as private as you think. Just saying.

[-] clove@kbin.melroy.org 2 points 2 days ago

Oh yeah, definitely. I'm on self-hosted forgejo, having had to migrate off gitea recently. :\

[-] melroy@kbin.melroy.org 2 points 1 day ago

Forgejo used to be a soft fork, but today it is indeed a hard fork of Gitea. Just for the people out there who didn't know about Forgejo.

[-] Aedius@lavraievie.social 1 points 3 days ago

@melroy

I wonder how many repositories are created with the sole goal of teaching some backdoor to the LLMs.

[-] melroy@kbin.melroy.org 1 points 3 days ago

Not many. If you want a backdoor, there are better and faster ways to implement that. No need to wait for LLMs to maybe train on your repo and maybe hallucinate your code to somebody else.

Those chances are slim. I won't say publicly how to easily infiltrate projects, but I'll give you a hint: npm, go, pip

this post was submitted on 13 May 2025
104 points (100.0% liked)

Firefox


The latest news and developments on Firefox and Mozilla, a global non-profit that strives to promote openness, innovation and opportunity on the web.

Rules

While we are not an official Mozilla community, we have adopted the Mozilla Community Participation Guidelines as far as it can be applied to a bin.

  1. Always be civil and respectful
    Don't be toxic, hostile, or a troll, especially towards Mozilla employees. This includes gratuitous use of profanity.

  2. Don't be a bigot
    No form of bigotry will be tolerated.

  3. Don't post security compromising suggestions
    If you do, include an obvious and clear warning.

  4. Don't post conspiracy theories
    Especially ones about nefarious intentions or funding. If you're concerned: Ask. Please don’t fuel conspiracy thinking here. Don’t try to spread FUD, especially against reliable privacy-enhancing software. Extraordinary claims require extraordinary evidence. Show credible sources.

  5. Don't accuse others of shilling
    Send honest concerns to the moderators and/or admins, and we will investigate.

  6. Do not remove your help posts after they receive replies
    Half the point of asking questions in a public sub is so that everyone can benefit from the answers—which is impossible if you go deleting everything behind yourself once you've gotten yours.
