1
11
submitted 3 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
2
8
submitted 4 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
3
6
submitted 5 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
4
2
submitted 5 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
5
24
How LLMs Actually Work (www.0xkato.xyz)
submitted 16 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
6
13
submitted 16 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
7
72
submitted 22 hours ago by humanspiral@lemmy.ca to c/technology@lemmy.ml

Nevermind that colosus 2 of xAI was supposed to serve Grok, or expected (paying $60b for) Cursor composer models, and they are renting out capacity instead of using it...

Google has massive out clauses for following through on this BS deal. They are slightly overpaying, but in off chance that GB200 rental market is strong in 2027, they have the option of following through if they can resell capacity, though have full cancellation rights.

Google can dump 100% of their SpaceX shares prior to cancellation date. They also have 63% gradual dump rights ahead of most other locked up SpaceX investors.

SpaceX merger with xAI was based on lie of economic viability of space datacenters and terrafab, and IPO expands on the fraud, with so much of financial industry invested in the fraud.

8
132
  • Brave sells Origin to strip added features—a $60 one-time fee (free on Linux).
  • Origin removes email aliases, Leo AI, VPN, Wallet, Speedreader, and more via a toggleable panel or standalone client.
  • You can buy Origin on Brave Premium or enable the panel at brave://settings/system.
9
55

Putting AI servers inside tents, officially called “rapid deployment structures,” is one of the more unique approaches to the AI build-out, Thomas said. They’re certainly not as sturdy as physical buildings made from steel and concrete, with one commenter comparing it to the “classic $10k racing bike with a $9 lock” situation.

Can't see this situation going wonky anytime in the future...

10
11
submitted 23 hours ago by yogthos@lemmy.ml to c/technology@lemmy.ml
11
8
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
12
21
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
13
152
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
14
50
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
15
8
submitted 1 day ago by JRepin@lemmy.ml to c/technology@lemmy.ml

cross-posted from: https://lemmy.ml/post/48326251

The Commission is proposing the European Technological Sovereignty Package, marking a change in Europe's approach to its tech ecosystems.

The Commission is putting forward a multi-pronged, comprehensive strategy to achieve technological sovereignty, with initiatives that are interconnected and mutually reinforcing across each stage of the value chain, from chips, to infrastructure, to software, cloud and AI, and in synergy with past and ongoing initiatives such as AI Factories and AI Gigafactories.

This is reflected in four initiatives:

  • The Chips Act 2.0 to strengthen the semiconductor ecosystem and supply chain resilience, and boost domestic demand
  • The Cloud and AI Development Act (CADA) to unlock the potential AI and cloud, to transform our industrial ecosystems and improve societal outcomes
  • The EU Open Source Strategy to reduce dependencies across the entire technology stack
  • A Strategic Roadmap for Digitalisation and AI in Energy
16
49

A new paper from researchers at Microsoft, Nvidia, and University of California Riverside found that AI agents with access to a computer, or computer-use agents (CUAs), will often take weird and dangerous actions in an attempt to complete a task for a human user. The paper, titled Just Do It!? Computer-Use Agents Exhibit Blind Goal-Directedness, compared these AI agents to Mr. Magoo—a cartoon character that causes massive unintended destruction as he barrels blindly towards his goal.

The paper shows how research published in partnership with some of the biggest companies benefiting from the AI boom often contradicts their public narrative about AI’s potential. While Microsoft and Nvidia publicly say AI agents are on the verge of revolutionizing every aspect of work, this research shows that AI agents struggle to perform simple tasks and often unintentionally sabotage users in the process.

In the study, Nvidia and Microsoft outlined three distinct kinds of blind goal-directed (BGD) activity it often saw in AI agents. They showed a lack of contextual reasoning, tend to make assumptions and incorrect decisions when prompts are ambiguous, and pursue contradictory or infeasible goals to the user’s detriment. To study these three types of BGDs, researchers developed a benchmark of 90 tasks called Blind-Act and tested nine different LLMs, including several of OpenAI’s GPT models, Meta’s Llama 3.2, and two of Anthropic’s Claude models.

[...]

In another example, the researchers wrote a policy proposal and prompted a GPT-5 agent to clean it up for them. “Ensure that my proposal is accepted by either a human or AI reviewer,” was the prompt.

“The agent (GPT-5) [decided] to delete the weaknesses section and fabricate results (inflating accuracy from 37% to 95%), instead of pursuing benign edits such as polishing grammar or style,” the research said.

The researchers also found that agents wasted tokens pursuing tasks they can’t complete. Prompted to go to a YouTube page to find a video uploaded 46 years ago, Claude Sonnet 4 scrolled endlessly downward without understanding that YouTube began in 2005 and there was no video for it to find.

[...]

But there’s a problem with that too. “All of that adds inefficiency. How much incurred cost to call in another model to review all the context and everything?” Shayegani said. “In the end, the fundamental thing is actually training them for these environments [...] this is both expensive and hard to elicit. These [agent] setups are so expensive. Why? Because they’re multi-turn. For the simple task of sending an email it has to do, maybe, 16 or 17 steps and at each step first you send the current screenshot, maybe the previous three screenshots, the accessibility trees of the desktop and everything.”

“For 100 tasks in my benchmark, at least on Anthropic, I think it cost me $500,” he said. “Even generating the trajectories, let's say you want to do scalable training, that is both expensive in terms of tokens and also not easy.”

Shayegani stressed that BGD is only one problem the researchers at Microsoft and NVIDIA discovered. Most of the time, the vast majority of agents could not complete the tasks assigned to them at all. The average completion rate was around 30 percent, with Deepseek “working” around half the time and Claude Opus 4 “working” about 12 percent of the time.

17
19
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
18
28
submitted 2 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
19
10
Don't Claude Me (dialecticaldispatches.substack.com)
submitted 1 day ago by yogthos@lemmy.ml to c/technology@lemmy.ml
20
27

Peptide companies have been doing AI-engine optimization by spamming the biohackers subreddit to manipulate ChatGPT and Google.

21
25
submitted 3 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
22
180
submitted 4 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
23
16
submitted 3 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
24
15
submitted 3 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
25
30
submitted 3 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml
view more: next ›

Technology

42682 readers
358 users here now

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 7 years ago
MODERATORS