[-] piccolo@hexbear.net 27 points 3 months ago* (last edited 3 months ago)

Just as they ban Venezuelan president Nicolás Maduro, they unban the anti-vaxxers yea

[-] piccolo@hexbear.net 38 points 3 months ago

RIP to the good bit about saying "I wonder what Milo Yabadabadopolis-fe-fi-filopolis has to say about this" then linking to his Twitter account that just said "@Nero was banned from Twitter"

[-] piccolo@hexbear.net 68 points 3 months ago* (last edited 3 months ago)

Right after your quote:

For companies and investors caught in the fray, it’s been “total misery,” says Dan Wang, research fellow at Stanford University’s Hoover History Lab and the author of Breakneck. China’s model relies on “a lot of state power, a lot of consumer power, but not very much financial investor benefit,” he says.

porky-point won't someone think of the poor shareholders? porky-scared

xigma-male No.

45
submitted 4 months ago by piccolo@hexbear.net to c/technology@hexbear.net

I've been meaning to try to write a bit more, and unfortunately I can't put this on a blog post attached to my name if I wish to be employable in tech in the US, so I figured I'd write a bit of an effortpost about the state of LLMs in the world. I've been learning a lot about LLMs lately (don't ask me why the things that become hyperfocuses for me become hyperfocuses) and I figured that some people here might be interested in learning more.

I was inspired to post by this article posted to /c/news, and all I have to say about this is JDPON Don back at it with another banger. In all seriousness, I think this is very good for the state of Chinese AI, which is already very good.

For those not following recent LLM updates (very understandable), the TL;DR is that a lot of new open-source models coming out of China are really good, and pushing the state of the art. Generally, they're still less good than the best closed-source models from the US (Claude in particular is the best currently), but they're much much much cheaper and honestly getting quite good. Plus, they seem to be giving US-based AI companies a good scare, which is always fun.

For reference, the best models from US firms are Claude (by Anthropic), Gemini (by Google), and OpenAI's models, though it seems like GPT-5 was a bit of a disappointment. Among the closed source models, my bet's on Anthropic - they seem to be killing it, and have some very interesting research about understanding LLMs. This is a very cool paper from them about trying to understand how LLMs work, using a quite novel model that I think could go a long way toward explaining how they operate.

[Side note: I think it's quite scary that leading AI research firms making leading AI models generally don't know how they work or how to reason about what they're doing, especially given that they can tell when they're being evaluated and notably suppress the "scheming" part of them when they think they're being tested on scheming.]

Anyways, back to China. One of the most significant LLMs to come out of China in the last while was DeepSeek-R1, which was able to match or outperform OpenAI's state of the art model o1 (the best model at the time) on most benchmarks. R1 completely changed the metagame - it singlehandedly shifted the dominant LLM architecture (from dense models to Mixture-of-Experts), and scared OpenAI into dropping its prices for o1. And DeepSeek did this during a huge GPU shortage in China caused by the export controls, while reportedly spending only $5.5M USD on training, compared to the estimated $100M to train GPT-4 (which is less powerful than o1). This is absolutely bonkers, and there's a reason this caused the stock market in the US to dip for a bit.

Now, R1 is not quite as good as the closed source models, despite the benchmarks. In particular, its English flows less well and it struggles with some types of queries. But it's crazy that a company came out of nowhere, trained a new type of model for roughly 1/20 the cost of OpenAI training a worse model, released it for free, and completely changed the meta. It also reasons, which isn't new, but it is a particularly good reasoner, and I think they got a lot of things right with how it works.

Anyways, R1 is old news now. There are a billion new open source models coming out of China now. Some notable companies include Alibaba (Qwen), Moonshot AI (Kimi), and Z.ai (formerly Zhipu AI; GLM). People on reddit-logo say that Qwen3 Coder and Qwen3 235B A22B (both Thinking and Instruct) are very good - for my use cases (mostly programming), I much prefer GLM 4.5. I was impressed with Qwen for questions about code, but I found it to be less good at actually writing it, for the most part. YMMV, though; I think this is a somewhat unpopular opinion. But anyways, it seems like each week a new top open source model appears from China. Far and away they are leading the open source efforts. And even if they aren't quite as good as Claude, Claude Sonnet 4 costs $15/million tokens of output, whereas Qwen3 Coder is free up to 2000 requests per day from Alibaba, and costs $0.80/million tokens of output beyond that, which is crazy cheap.
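To put the price gap in concrete terms, here's a quick back-of-the-envelope sketch using only the two output prices quoted above. The 2 million token workload is a made-up number for illustration, and the function and variable names are mine, not anything from a provider's API. It also ignores input-token pricing and the free tier.

```python
# Output prices in USD per million tokens, as quoted in the post above.
PRICE_PER_MILLION_OUTPUT = {
    "Claude Sonnet 4": 15.00,
    "Qwen3 Coder": 0.80,
}


def output_cost(model: str, output_tokens: int) -> float:
    """Cost in USD of generating `output_tokens` tokens with `model`."""
    return PRICE_PER_MILLION_OUTPUT[model] * output_tokens / 1_000_000


# Hypothetical workload (assumption): 2 million output tokens.
tokens = 2_000_000
claude_cost = output_cost("Claude Sonnet 4", tokens)
qwen_cost = output_cost("Qwen3 Coder", tokens)
ratio = claude_cost / qwen_cost

print(f"Claude Sonnet 4: ${claude_cost:.2f}")
print(f"Qwen3 Coder:     ${qwen_cost:.2f}")
print(f"Claude costs about {ratio:.1f}x as much for the same output")
```

So at the quoted prices, the same amount of output is a bit under 19 times more expensive on Claude, before you even count the free requests.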

Another notable thing about Chinese open source models is that they are generally much easier to jailbreak than Western models, except for older less powerful open source models like Llama's and Mistral's models, which are also very easy. So you can get them to write all the erotic bomb making content you'd want (I'm happy to provide tips on jailbreaking if anyone would like).

Also, it seems that in the current market, companies in general are tripping over each other to give you free access to open source LLMs as each tries to become the place to get LLM access from, which means it's a really good time to be mooching access to these guys. Alibaba will give lots and lots of Qwen3 Coder credits, OpenRouter will give you 2000 free requests a day for eternity to a lot of good models if you at any point put $10 into their system, Chutes will give you 200 free requests/day for basically any open source model for a one time payment of $5, etc. Even Google will give you free access to their top tier model (though a pretty small amount per day) via Gemini CLI.

Anyways, my main point is that China is doing all of this despite a huge GPU shortage in the country. So if JDPON Don really wants to give them more access to Nvidia chips, it must be because he wants to boost their LLM market even further.

Thanks for coming to my Theodore lecture.

[-] piccolo@hexbear.net 17 points 1 year ago

Parenti stopped being friends with Bernie after the Kosovo vote afaik

[-] piccolo@hexbear.net 20 points 1 year ago

Have you seen that TikTok video that was like a 3 way split screen of Yellow Parenti, someone playing Subway Surfers, and one of those visual ASMR videos? We need to make more of that type of content

[-] piccolo@hexbear.net 22 points 1 year ago

I think Cloudflare uses lava lamps because it's a cool story, but there are ways you can get truly random bits from other things, like this. Generally, you want to have some sort of physical process going on that provides entropy, because a deterministic algorithm running on a CPU can only produce pseudorandom numbers. For example, random.org uses atmospheric noise, which is random and unpredictable when you look at very tiny variances. You can also use, e.g., a super sensitive Geiger counter to measure random bits of radiation, or shoot photons at a semi-reflective surface, where sometimes they go through and sometimes they reflect. For the type shown here, though, the most common kind of noise they use is from quantum effects relating to transistors, as far as I know. This is an actual source of randomness, so if it's done right it can be just as good as lava lamps or Geiger counters or whatever.
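A small sketch of the pseudorandom vs. entropy-backed distinction, using only the Python standard library. The helper names are just for illustration. One caveat: `os.urandom` and `secrets` don't read a lava lamp or a Geiger counter directly. They ask the operating system's generator, which is itself an algorithm, but one the kernel seeds from whatever physical entropy sources the machine has.

```python
import os
import random
import secrets


def prng_sequence(seed: int, n: int = 5) -> list[int]:
    """Pseudorandom bytes: fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(n)]


def os_entropy_bytes(n: int = 16) -> bytes:
    """Bytes from the OS generator, which is seeded from system entropy."""
    return os.urandom(n)


# Same seed in, same "random" numbers out, every time.
same_a = prng_sequence(1234)
same_b = prng_sequence(1234)
print(same_a == same_b)

# No seed to replay here; each call gives fresh, unpredictable output.
print(os_entropy_bytes().hex())

# `secrets` wraps the same OS source and is the usual choice for tokens/keys.
token = secrets.token_hex(16)
print(token)
```

The first print is always `True`, which is exactly why a bare PRNG is no good for keys: anyone who learns the seed can replay the whole sequence.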

[-] piccolo@hexbear.net 46 points 1 year ago

Lol I was there recently and also noticed that. The whole museum was full of propaganda, but it was cool to see some of the exhibits, like the Trabant. Some "fun" anticommunist highlights included:

  • The dystopian evil jail cell run by the communist dictators (about the same size as the room I rent in [major city, imperial core country])
  • The dystopian evil kindergarten with a rigid schedule including play and nap time
  • The dystopian evil standard allocated apartment that EVERYONE had and there was NO individuality (much bigger than the room I rent, and for much less money)
  • The dystopian evil SEX that all the HORNY EAST GERMANS were having (the museum explained it as a result of there being nothing else to do, lol)
  • The dystopian evil DIY repair culture
  • The dystopian evil tired bureaucrats who looked more like people than Bond villains
  • The dystopian evil LGBTs who weren't forbidden from existing by the state
  • The factory farming that happened under the DDR (which, like, as a vegan, sure, but it's not like the Federal Republic or any Western country wasn't doing this)
  • There was literally a panel saying that all Eastern bloc states weren't allowed to deviate from the USSR's policies or will, which then gave an example of the DDR doing just that to resist Soviet reforms in '85

And every single bullet point here made my blood boil (supposed to be a translation of some of the key terms, without propaganda):

spoiler (sorry, I only photographed the English text)

Anyways, I guess I funded anticommunist propaganda by visiting and also buying a DDR patch

[-] piccolo@hexbear.net 28 points 2 years ago

I fucking hate this bazinga brained "throw precaution (and/or regulations) to the wind" bullshit. I know that's like the SV MO but it makes me really just want to go out and do something really cool.

melon-musk speech-side-l-1 "Oh yeah let's just send thousands of new satellites up every year and every 5 years we'll let them burn up upon reentry and send up more!" speech-side-l-2

Another thing, and I know these techbrained libertarian types don't understand this, is that even if you accept this is something that humanity needs, you certainly don't need companies competing by each running their own satellite arrays. This is a natural monopoly to end all natural monopolies.

[-] piccolo@hexbear.net 39 points 2 years ago

Yeah, actually it seems that SawStop is being very disingenuous here: they're lobbying for a "safety" bill that would make it so only they can sell table saws. If they really cared about safety, they'd make the patent free for all to use, like the 3 point seatbelt

[-] piccolo@hexbear.net 17 points 2 years ago

Do you have any book recommendations for learning more about this? Really anything about how the World Bank or IMF function as an arm of imperialism would be much appreciated.

59

Love voting for the lesser of two evils candidate who won't let women wearing a hijab into their campaign rallies! Can you imagine the lib reaction if this happened at a Trump event?


piccolo

joined 2 years ago