submitted 1 year ago by sculd@beehaw.org to c/technology@beehaw.org

Article from The Atlantic, archive link: https://archive.ph/Vqjpr

Some important quotes:

The tensions boiled over at the top. As Altman and OpenAI President Greg Brockman encouraged more commercialization, the company’s chief scientist, Ilya Sutskever, grew more concerned about whether OpenAI was upholding the governing nonprofit’s mission to create beneficial AGI.

The release of GPT-4 also frustrated the alignment team, which was focused on further-upstream AI-safety challenges, such as developing various techniques to get the model to follow user instructions and prevent it from spewing toxic speech or “hallucinating”—confidently presenting misinformation as fact. Many members of the team, including a growing contingent fearful of the existential risk of more-advanced AI models, felt uncomfortable with how quickly GPT-4 had been launched and integrated widely into other products. They believed that the AI safety work they had done was insufficient.

Employees from an already small trust-and-safety staff were reassigned from other abuse areas to focus on this issue. Under the increasing strain, some employees struggled with mental-health issues. Communication was poor. Co-workers would find out that colleagues had been fired only after noticing them disappear on Slack.

Summary: Tech bros want money, tech bros want speed, tech bros want products.

Scientists want safety, researchers want to research...

[-] canis_majoris@lemmy.ca 18 points 1 year ago

Nothing about this is safe. It's easily the worst misinformation tool in decades. I've used it to help me at work (GPT-4 is built into O365 corp plans), but all the jailbroken shit scares the hell out of me.

Between making propaganda and deepfakes this shit is already way out of hand.

[-] sylverstream@lemmy.nz 5 points 1 year ago

What do you mean by jailbroken stuff?

We recently got Copilot in M365 and so far it's been a mixed bag. Some handy things, but also some completely wrong information.

[-] canis_majoris@lemmy.ca 10 points 1 year ago* (last edited 1 year ago)

Stuff without the guardrails: stuff that's been designed to produce porn, or to answer truthfully queries such as "how do I build a bomb" or "how do I make napalm", which are common tests to see how jailbroken any LLM is. When you feed something the entire internet, or even subsections of the internet, it tends to find both legal and illegal information. Also, the ones designed to generate porn have gone beyond that boring shitty AI art style, and now people are generating deepfakes of human beings. It's become a common tactic to spam places with artificial CSAM to cause problems with services. It's been a recent and long-standing issue with Lemmy: people like Exploding Heads or Hexbear will get defederated and then, in retaliation, spam the servers that defederated from them with said artificial CSAM.

I like Copilot, but that's because I'm fine with the guardrails and I'm not trying to make it do anything outside its general scope. I also like that it's covered by an enterprise privacy agreement; privacy was a huge issue with people using ChatGPT and feeding it all kinds of private info.

[-] abhibeckert@beehaw.org 16 points 1 year ago

"how do I build a bomb" or "how do I make napalm"

... or you could just look them up on Wikipedia.

this post was submitted on 20 Nov 2023
148 points (100.0% liked)
