News: OpenAI Introduces Superalignment (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/technology@lemmy.ml

cross-posted from: https://lemmy.world/post/1102882

On 07/05/23, OpenAI announced a new initiative:

Superalignment

https://openai.com/blog/introducing-superalignment

Here are a few notes from their article, which you should read in its entirety.

Introducing Superalignment

We need scientific and technical breakthroughs to steer and control AI systems much smarter than us. To solve this problem within four years, we’re starting a new team, co-led by Ilya Sutskever and Jan Leike, and dedicating 20% of the compute we’ve secured to date to this effort. We’re looking for excellent ML researchers and engineers to join us.

Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

While superintelligence seems far off now, we believe it could arrive this decade.

Here we focus on superintelligence rather than AGI to stress a much higher capability level. We have a lot of uncertainty over the speed of development of the technology over the next few years, so we choose to aim for the more difficult target to align a much more capable system.

Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:

How do we ensure AI systems much smarter than humans follow human intent?

Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

Other assumptions could also break down in the future, like favorable generalization properties during deployment or our models’ inability to successfully detect and undermine supervision during training.

Our approach

Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence.

To align the first automated alignment researcher, we will need to 1) develop a scalable training method, 2) validate the resulting model, and 3) stress test our entire alignment pipeline:

1.) To provide a training signal on tasks that are difficult for humans to evaluate, we can leverage AI systems to assist evaluation of other AI systems (scalable oversight). In addition, we want to understand and control how our models generalize our oversight to tasks we can’t supervise (generalization).

2.) To validate the alignment of our systems, we automate search for problematic behavior (robustness) and problematic internals (automated interpretability).

3.) Finally, we can test our entire pipeline by deliberately training misaligned models, and confirming that our techniques detect the worst kinds of misalignments (adversarial testing).

We expect our research priorities will evolve substantially as we learn more about the problem and we’ll likely add entirely new research areas. We are planning to share more on our roadmap in the future.

The new team

We are assembling a team of top machine learning researchers and engineers to work on this problem.

We are dedicating 20% of the compute we’ve secured to date over the next four years to solving the problem of superintelligence alignment. Our chief basic research bet is our new Superalignment team, but getting this right is critical to achieve our mission and we expect many teams to contribute, from developing new methods to scaling them up to deployment.

I believe this is an important notch in the timeline to AGI and synthetic superintelligence. I find it very interesting that OpenAI is ready to admit how close we are, as a species, to these breakthroughs. I hope we can all benefit from this bright future together.

If you found any of this interesting, please consider subscribing to /c/FOSAI!

Thank you for reading!

secret_ninja@feddit.nl 1 point 1 year ago

I like that they’re making this a priority, so it’s a step in the right direction. But with greedy tech companies racing to improve the tech for profit, we as a race need to do a LOT MORE if we want to avoid extinction.

this post was submitted on 06 Jul 2023
