overview for piggy

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself! by piggy in c/technology@hexbear.net

[-] piggy@hexbear.net 1 points 1 day ago* (last edited 1 day ago)

I've never said that AI is the cause of those problems; those are words you're putting in my mouth. I've said that AI is being used as a solution to those problems in the industry, when in reality the use of AI to solve those problems exacerbates them while allowing companies to reap "productive" output.

For some reason programmers can understand "AI Slop" but if the AI is generating code instead of stories, images, audio and video it's no longer "AI Slop" because we're exalted in our communion with the machine spirits! Our holy logical languages could never encode the heresy of slop!

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself! by piggy in c/technology@hexbear.net

[-] piggy@hexbear.net 1 points 1 day ago* (last edited 1 day ago)

This is a quantization function. It's a fairly "math brained" name I agree, but the function is called qX_K_q8_K because it quantizes a value with a quantization index of X (unknown) to one with a quantization index of 8 (bits), which correlates with the memory usage. The 0 vs K portions are how it does rounding: 0 means it does rounding by equal distribution (without offset), and K means it creates a distribution that is more fine grained around more common values and rougher around less common values. E.g. I have a data set that has a lot of values between 4 and 5 but not a lot of 10s. I have, let's say, 10 brackets between 4 and 5 but only 3 between 5 and 10.
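To make the bracket example concrete, here is a minimal Python sketch of that layout next to an equal-width one with the same number of brackets. It only illustrates the description above; it is not how the real qX_K code lays out its blocks, and the helper names (uniform_edges, bucket_index, width) are made up for the illustration.

```python
from bisect import bisect_right

def uniform_edges(lo, hi, n_buckets):
    """Equal-width bucket edges between lo and hi (both ends included)."""
    step = (hi - lo) / n_buckets
    return [lo + i * step for i in range(n_buckets + 1)]

def bucket_index(value, edges):
    """Index of the bucket that holds value, clamped to the valid range."""
    i = bisect_right(edges, value) - 1
    return max(0, min(i, len(edges) - 2))

def width(value, edges):
    """Width of the bucket that value falls into (smaller = finer resolution)."""
    i = bucket_index(value, edges)
    return edges[i + 1] - edges[i]

# Layout from the example: 10 brackets in [4, 5], only 3 in [5, 10].
fine = uniform_edges(4.0, 5.0, 10)
coarse = uniform_edges(5.0, 10.0, 3)
nonuniform = fine + coarse[1:]           # 13 brackets in total

# Same number of brackets, but spread evenly over [4, 10].
uniform = uniform_edges(4.0, 10.0, 13)

for v in (4.55, 8.0):
    print(v, "non-uniform width:", round(width(v, nonuniform), 3),
          "uniform width:", round(width(v, uniform), 3))
```

With the same budget of 13 brackets, the non-uniform layout resolves the crowded 4 to 5 range much more finely and pays for it with coarser brackets where data is sparse.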

Basically it's a lossy compression of a data set into a specific enumeration (roughly correlates with size), so it's a way, given 1,000,000 numbers from 1 to 1,000,000, of putting their values into a range of numbers based on the q level. How using different functions affects the output of models is more voodoo than anything else. You get better "quality" output from higher memory space, but quality is a complex metric and doesn't necessarily map to factual accuracy in the output, just statistical correlation with the model's data set.
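A rough Python sketch of the "lossy compression" point, assuming the simplest possible scheme (uniform rounding over the whole range, which is my simplification and not what any particular model format does). The function names are hypothetical.

```python
def quantize(values, bits):
    """Map each value to an integer level in [0, 2**bits - 1] (uniform, lossy)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Best-effort reconstruction; the rounding error is gone for good."""
    return [lo + c * scale for c in codes]

def worst_error(values, bits):
    """Largest absolute difference between a value and its round trip."""
    codes, lo, scale = quantize(values, bits)
    restored = dequantize(codes, lo, scale)
    return max(abs(a - b) for a, b in zip(values, restored))

values = list(range(1, 1_000_001))
for bits in (4, 8):
    print(bits, "bits -> worst round-trip error:", round(worst_error(values, bits)))
```

Fewer bits means fewer levels, so more distinct inputs collapse onto the same code and the round-trip error grows. That part is plain arithmetic; what it does to model output is the voodoo part.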

An example of a common quantizer is an analog to digital converter. It must take continuous values from a wave that goes 0 to 1 and transform them into digital values of 0 and 1 with a specific sample rate.
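As a sketch of that converter in Python: sample a wave that swings between 0 and 1 at a fixed rate and round each sample to a fixed number of bits. The 1 Hz sine, the 8 Hz sample rate and the 3 bit depth are arbitrary choices for the illustration.

```python
import math

def adc_sample(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t) in [0, 1] at a fixed rate and quantize to `bits` bits."""
    levels = (1 << bits) - 1
    n = int(duration_s * sample_rate_hz)
    codes = []
    for i in range(n):
        t = i / sample_rate_hz
        x = min(1.0, max(0.0, signal(t)))  # clamp to the converter's input range
        codes.append(round(x * levels))
    return codes

def wave(t, freq_hz=1.0):
    """A sine wave shifted and scaled so it swings between 0 and 1."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * freq_hz * t)

codes = adc_sample(wave, duration_s=1.0, sample_rate_hz=8, bits=3)
bitstrings = [format(c, "03b") for c in codes]  # each sample as 3 digital bits
print(bitstrings)
```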

Taking a 32 bit float and copying the value into a 32 bit float is an identity quantizer.

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself! by piggy in c/technology@hexbear.net

[-] piggy@hexbear.net 1 points 1 day ago* (last edited 1 day ago)

You're making up a giant straw man of how you pretend software development works which is utterly divorced from what we see happening in the real world. The AI doesn't change this one bit.

Commenting this under a post where an AI has spit out a dot product function optimization for an existing dot product function that's already ~150-250 lines long depending on architectural implementation, of which there are about 6. The PR for which has an interaction that is two devs finger pointing about who is responsible for writing tests. The PR for which notes that the original and new function often don't give the correct answer. Just an amazing response. Chef's kiss.

What a wonderful way to engage with my post. You win bud. You're the smartest. This industry would never mystify a basic concept that's about 250 years old with a 716 line PR through its inability to communicate, organize and follow an academic discipline.

Is DeepSeek about to cause a stock market crash? by piggy in c/news@hexbear.net

[-] piggy@hexbear.net 7 points 3 days ago* (last edited 3 days ago)

As a sidenote "putting things in boxes" is the very thing that itself upholds bourgeois democracies and national boundaries as well.

I mean at this raw of an argument you might as well argue for Lysenkoism because unlike Darwinism/Mendelian selection it doesn't "put things in boxes". In practice things are put in boxes all the time, it's how most systems work. The reality is that as communists we need to mediate the negative effects of the fact that things are in boxes, not ignore the reality that things are in boxes.

The failure of capitalism is the fact that its systems of meaning making converge into the arbitrage of things in boxes. At the end of the day this is actually the most difficult part of building communism; the Soviet Union throughout its history still fell ill with the "things in boxes" disease. It's how you get addicted to slave labor, it's how you make political missteps because it's so easy to put people in a "kulak" box that doesn't even mean anything anymore, it's how you start disagreements with other communist nations because you really insist that they should put certain things into a certain box.

Is DeepSeek about to cause a stock market crash? by piggy in c/news@hexbear.net

[-] piggy@hexbear.net 9 points 3 days ago* (last edited 3 days ago)

It doesn't work in the average case. I've seen this tactic from the company that I work for and multiple companies I have contacts at. Bosses think they can simply use "AI" to fix their hollowed out documentation, on-boarding, employee education systems by pushing a bunch of half correct, barely legible "documentation" through an LLM.

It just spits out garbage for 90% of people doing this. It's a garbage in, garbage out process. In order for it to even be useful you need a specific type of LLM setup (a RAG) and for your documentation to be high quality.

Here's an example project: https://github.com/snexus/llm-search

The demo works well because it uses a well documented open source library. It's also not a guarantee that it won't hallucinate or get mixed up. A RAG works simply by priming the generator with "context" related to your query. If your model weights are strong enough, your context won't outweigh the allure of statistical hallucination.
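A toy Python sketch of the "priming with context" step. Real RAG setups generally rank documents with embeddings; plain word overlap stands in for that here, and none of this is taken from the llm-search project. The docs, function names and prompt wording are all made up for the illustration, and the generator call itself is left out.

```python
import re

def tokens(text):
    """Lowercased set of words, used as a crude stand-in for embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by shared words with the query; return the top k matches."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return [d for d in ranked[:k] if q & tokens(d)]

def build_prompt(query, documents, k=2):
    """Prime the generator: retrieved context first, then the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "To reset a password, open Settings and choose Security.",
    "Expense reports are due on the last Friday of the month.",
    "asdf password thing see Bob, he knows maybe",  # half correct, barely legible
]
prompt = build_prompt("How do I reset my password?", docs)
print(prompt)
```

Note that the barely legible doc gets pulled into the context right next to the good one because it happens to mention "password". The retriever can't tell quality from noise, which is the garbage in, garbage out problem in miniature.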

Melk by piggy in c/chapotraphouse@hexbear.net

[-] piggy@hexbear.net 47 points 3 days ago

"State Beverage" is midcentury marketing brain worms for large agribusinesses. It might be quaint but they sold a shit ton through idiotic reflexive reactionary nationalism.

Brace for impact! by piggy in c/chapotraphouse@hexbear.net

[-] piggy@hexbear.net 21 points 3 days ago

By "retire" I mean, when I have aged out of software and I can just burn all my bridges.

Brace for impact! by piggy in c/chapotraphouse@hexbear.net

[-] piggy@hexbear.net 27 points 3 days ago* (last edited 3 days ago)

Haha..... Boy do I have stories..... I worked in a terrible evil company (aren't they all but this one was a bit egregious).

The CEO was an absolute moron whose only skill was being a contracts guy and being a money raising guy. We had an internal app for employees to do their work on in the field. He was adamant about getting it in the app store after he took some meeting with another moron. We kept telling him there's no point, and there's a shit ton of work because we have to get the app to Apple's standards. He wouldn't take no for an answer. So we allocated the resources to go ahead, and some other projects got pushed way back for this.

A month goes by and we have another meeting, and he says why isn't X done. We told him, we had to deprioritize X to get the app in the app store. He says well who decided that. We tell him that he did. You know how a normal person would be a bit ashamed of this right? Well guess what he just had a little tantrum and still blamed everyone else but himself.

Same guy fired a dude (VP level) because his nepo hire had it out for him. That dude documented all his work out in the open, and then when that section of the business collapsed a day later they had to hire him back as a contractor and the CEO still didn't trust him and trusted his nepo hire, and didn't see the fact that his decision making was the inefficiency.

When I retire I swear to god I'm going to write "this is how capitalism actually works" books about my experiences working with these people.

Brace for impact! by piggy in c/chapotraphouse@hexbear.net

[-] piggy@hexbear.net 19 points 3 days ago* (last edited 3 days ago)

I'm confident a lot of startups will spring out of the ground that will be developing DeepSeek wrappers and offering the same service as your OpenAIs

This is true. But I don't think OpenAI is even cornering the tech market really. The company I work for makes a lot of content for various things and a lot of engineers are tech fetishists and a lot of executives are IP protectionist obsessives. We are banned from using publicly available AI offerings; we don't contract with OpenAI but we do contract with Maia for creating models (because their offering specifically talks through the "steal your IP" problems). So OpenAI itself is not actually in many of these spaces.

But yeah your average chat girlfriend startup is going to remove the ChatGPT albatross from its neck, given its engineers/founders are just headlines guys. A lot of this ecosystem is really the "Uber but for " style guys.

Brace for impact! by piggy in c/chapotraphouse@hexbear.net

[-] piggy@hexbear.net 48 points 3 days ago* (last edited 3 days ago)

I agree with the majority of your comment.

no one is gonna pay thousands of dollars for a corporate LLM that's only 10% better than the free one.

This is simply not true in how businesses actually work. It certainly limits your customer base organically but there are plenty of businesses who in "tech terms" overpay for things that are even free because of things like liability and corruption. Enterprise sales is completely perverse in its logic and economics. In fact most open source giants (e.g. Red Hat) exist because corps do in fact overpay for free things for various reasons.

DeepSeek Buzz Puts Tech Stocks on Track for $1 Trillion Wipeout by piggy in c/news@hexbear.net

[-] piggy@hexbear.net 22 points 3 days ago

Closed source models are left unoptimized as a way to mislead competitors and to move the competition from one of technological prowess to one of courting investment.

Is DeepSeek about to cause a stock market crash? by piggy in c/news@hexbear.net

[-] piggy@hexbear.net 20 points 3 days ago* (last edited 3 days ago)

So LLMs, the "AI" that everyone is typically talking about, are really good at one statistical thing:

"CLASSIFYING"

What is "CLASSIFYING" you ask? Well it's basically attempting to take data and put it into specific boxes. If you want to classify all the dogs you could classify them based on breed for example. LLMs are really good at classifying, better than anything we've ever made, and they adapt very well to new scenarios and create emergent classifications of data fed to them.

However they are not good at basically anything else. The "generation" that these LLMs do is based on the classifier and the model, which basically generates responses based on statistically what the next word is. So for example it's entirely possible that if you fed an LLM the entirety of Shakespeare and only Shakespeare and you gave it "Two households both alike" as a prompt, it practically may spit out the rest of Romeo and Juliet.
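A toy version of "statistically what the next word is", sketched in Python. This is a plain word-pair counter, nowhere near an actual LLM (which is a neural network over tokens with far more context), but the Shakespeare effect shows up even here. The two-line corpus is just the opening of the prologue with punctuation stripped.

```python
from collections import Counter, defaultdict

def train(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        follows[cur][nxt] += 1
    return follows

def generate(follows, prompt, max_words=20):
    """Greedily append the most common next word until nothing follows."""
    out = prompt.split()
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

corpus = "Two households both alike in dignity In fair Verona where we lay our scene"
model = train(corpus)
print(generate(model, "Two households both alike"))
# prints: Two households both alike in dignity In fair Verona where we lay our scene
```

Trained on only this text, the counter has exactly one continuation for every word, so the prompt regurgitates the rest of the training data. Nothing in it knows or checks whether the output is true.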

However this means AIs are not good at the following:

discerning truth from fiction
following technical processes (like counting r's in strawberry)
having "human like" understandings of the connections between concepts (think of the "is soup a salad" type memes)

So… is what I said above really just how AI is being used in the US, and is that the reason for the huge bubble in asset values of companies like Nvidia and Microsoft?

Don't get me wrong, yes this is a solution in search of a problem. But the real reason that there is a bubble in the US for these things is because companies are making that bubble on purpose. The reason isn't even rooted in any economic reality. The reason is rooted in protectionism. If it takes a small lake of water and 10 data centers to run ChatGPT, that means it's unlikely you will lose a competitive edge because you are misleading your competition. If every year you need more and more compute to run the models it concentrates who can run them and who ultimately has control of them. This is what the market has been doing for about 3 years now. This is what DeepSeek has undone.

The similarities to Bitcoin and crypto bubbles are very obvious in the sense that the mining network is controlled by whoever has the most compute. Ethereum specifically decided to cut out the "middle man" of who owns compute and basically says whoever pays into the network's central bank the most controls the network.

This is what 'tech as assets' means practically. Inflate your asset as much as possible regardless of its technical usefulness.