
Musk is undeniably just trying to run twitter into the ground at this point. (lemmy.world)

submitted 2 years ago by STRIKINGdebate2@lemmy.world to c/mildlyinfuriating@lemmy.world

801 comments

[-] noodle@feddit.uk 23 points 2 years ago

Almost certainly this isn't anything to do with scraping. Like with Reddit, those with a stake in Twitter stand to benefit from AI and, as far as I know, there's no mass reposting (retweeting?) effort to something like Mastodon.

That would be trivial to block anyway, since it would be easy to identify the service accounts and source IPs of the requests. No need to impact average users.

What's more likely is he hasn't paid the bill for his cloud infrastructure and no longer has the capacity to serve so many users.

IMO, that's what you get when you fire half of your staff.

[-] oatscoop@midwest.social 8 points 2 years ago

IMO, that’s what you get when you fire half of your staff.

And pander to extremists, drive advertisers away, refuse to pay your bills, etc.

[-] Stallone@lemmy.world 4 points 2 years ago

I'm not so sure. There are a lot of businesses and people training their AI models right now, and sites like Reddit or Twitter are very attractive, huge collections of user-generated content. It's not the most outrageous assumption that they'll try to get that data for free by scraping instead of paying for API access.

[-] sergih123@eslemmy.es 9 points 2 years ago

I don't think, however, that it's that hard to distinguish an AI scraper from an actual user, since an AI scraper would be pulling huge amounts of data, which the average user doesn't. Correct me if I'm wrong. What do you think?

[-] noodle@feddit.uk 6 points 2 years ago* (last edited 2 years ago)

No, you're correct. Service accounts can consume data way faster than a human user ever could. A smart business always implements rate limits or you could bankrupt them with a simple curl command. They could bankrupt themselves in testing with a simple loop!
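As a rough illustration of the rate limits mentioned above, here is a minimal token-bucket sketch in Python. The class names (`TokenBucket`, `RateLimiter`), the per-credential keying, and the numbers are all made up for the example; nothing here describes how Twitter actually does it.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """Keeps one bucket per credential (hypothetical key: an API token or account id)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, credential):
        bucket = self.buckets.get(credential)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[credential] = bucket
        return bucket.allow()


if __name__ == "__main__":
    limiter = RateLimiter(rate=1.0, capacity=3)
    # A tight loop from one service account is cut off after the burst allowance.
    print([limiter.allow("service-account") for _ in range(5)])
```

A human clicking around rarely empties the bucket, while a tight loop from a service account hits the limit almost immediately, which is the asymmetry the comment relies on.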

This can be fixed in many ways, not just by putting limitations on credentials but also on source addresses. If a certain address or range of addresses seems to be running multiple service accounts and pulling huge amounts of data, you can deny requests from those IPs.
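The address-range idea could be sketched roughly like this in Python, using the standard `ipaddress` module. `RangeBlocker`, its thresholds, and the choice to group IPv4 addresses by /24 are assumptions for illustration only; a real setup would also reset the counters on a time window and would more likely live in a CDN or WAF than in application code.

```python
import ipaddress
from collections import defaultdict


class RangeBlocker:
    """Tracks traffic per network and denies ranges that look like scraping farms."""

    def __init__(self, max_bytes, max_accounts, prefix_len=24):
        self.max_bytes = max_bytes          # hypothetical volume threshold per range
        self.max_accounts = max_accounts    # hypothetical distinct-account threshold per range
        self.prefix_len = prefix_len        # assumes IPv4; /24 groups 256 neighbouring addresses
        self.bytes_by_net = defaultdict(int)
        self.accounts_by_net = defaultdict(set)
        self.denied = set()

    def _network(self, ip):
        # strict=False lets us pass a host address and get its enclosing network.
        return ipaddress.ip_network(f"{ip}/{self.prefix_len}", strict=False)

    def record(self, ip, account, nbytes):
        """Account for one served response and update the deny list if needed."""
        net = self._network(ip)
        self.bytes_by_net[net] += nbytes
        self.accounts_by_net[net].add(account)
        too_much_data = self.bytes_by_net[net] > self.max_bytes
        too_many_accounts = len(self.accounts_by_net[net]) > self.max_accounts
        if too_much_data and too_many_accounts:
            self.denied.add(net)

    def is_denied(self, ip):
        return self._network(ip) in self.denied


if __name__ == "__main__":
    blocker = RangeBlocker(max_bytes=1000, max_accounts=1)
    blocker.record("203.0.113.5", "svc-a", 600)
    blocker.record("203.0.113.9", "svc-b", 600)
    print(blocker.is_denied("203.0.113.77"))   # same /24 as the two accounts above
    print(blocker.is_denied("198.51.100.1"))   # unrelated range
```

Requiring both conditions (several accounts and a large volume from the same range) keeps an ordinary heavy user, or a shared office network with light traffic, off the deny list.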

In short, this AI angle smells like BS to save face. Musk effectively fired the SRE team who looked after critical infrastructure. It was their job to ensure service reliability, so it should not be a surprise that Twitter now has issues with service reliability.

[-] billiam0202@lemmy.world 3 points 2 years ago

They could bankrupt themselves in testing with a simple loop!

You mean exactly like what Twitter did this past weekend?

[-] Veddit@lemmy.world 6 points 2 years ago

But also, hasn't that ship already sailed for several AI companies? They've already trained their models, so there's no need to scrape again; they can just reuse what they got last time for their core training. It's only the last couple of months or years they're missing.

[-] Bilbo@vlemmy.net 2 points 2 years ago

That would be trivial to block anyway

Is just ridiculously false. If you think it is true, make a service to do this trivial thing for people, and become a millionaire overnight.

[-] noodle@feddit.uk 2 points 2 years ago

Funnily enough, I do. I'm an SRE myself.

Services like Akamai have tools that are literally designed to block requests from known bad locations and IP ranges.

this post was submitted on 01 Jul 2023

4229 points (96.3% liked)
