submitted 2 years ago by MicroWave@lemmy.world to c/news@lemmy.world

As the world recovers from the largest IT outage in history, it shows the danger of one point of failure in IT infrastructure

A global IT failure wreaked havoc on Friday, grounding flights and disrupting everything from hospitals to government agencies. Over all the chaos hung a question: how did a flawed update to Microsoft Windows software bring large swaths of society to a screeching halt?

The problem originated with an Austin, Texas-based cybersecurity firm called CrowdStrike, relied upon by most of the global technology industry, including Microsoft, for its Falcon program, which blocks the execution of malware and cyber-attacks. Falcon protects devices by securing access to a wide range of internal systems and automatically updating its defenses – a level of integration that means if Falcon falters, the computer is close behind. After CrowdStrike updated Falcon on Thursday night, Microsoft systems and Windows PCs were hit with a “blue screen of death” and rendered unusable as they were trapped in a recovery boot loop.

Microsoft is a juggernaut with significant market power, dominating cloud-computing infrastructure across Europe and the United States. So it wasn’t just computers that were affected, but servers and a host of other systems as well. Overwhelming requests from users, devices, services and businesses ushered in a cascading series of failures with Microsoft products – namely Azure Cloud and Microsoft 365. Failures plaguing Azure led to additional but separate disruptions with 365 services. A giant clusterfuck ensued.

[-] Ooops@feddit.org 103 points 2 years ago

No... the Crowdstrike debacle primarily shows the dangers of today's corporate culture in software development.

Ship as fast as possible, fix issues later if necessary...

[-] dugmeup@lemmy.world 58 points 2 years ago

Yup. Push to prod!

This is the Boeing debacle in software land. Kill the engineering and pay the executives. QA? Testing? Strict standards? People? Naaah, more conferences! More logos on F1 cars!

[-] mynameisigglepiggle@lemmy.world 1 points 2 years ago

I totally agree. But without knowing a bit more about the specifics, I can't help but think that just maybe the updating mechanism could have rolled back an update if it caused a BSOD?

Seems like that infrastructure is really the biggest oversight and people would have been none the wiser.

Also surprised just how many things are running Windows. I thought for sure the self-checkout registers would have been some embedded Linux system.

[-] Rentlar@lemmy.ca 23 points 2 years ago* (last edited 2 years ago)

I disagree. You are correct that the cause of the fuck-up is bad development practices. However, if every firm is being reckless with development, but only one out of a myriad of competing firms fucks up because of it, maybe you'd take one airline or hospital network offline or something like that.

It's only because of consolidation and market monopolization of the sector that an outage at such a global scale was even possible to begin with.

[-] Wooki@lemmy.world 3 points 2 years ago

You’re partly right, as is the article.

Centralization is dangerous for security, innovation and cost (monopolies, duopolies).

[-] yggdar@lemmy.world 24 points 2 years ago* (last edited 2 years ago)

Am I missing something? I thought the outage was caused by CrowdStrike and had nothing to do with Microsoft or Windows?

[-] pycorax@lemmy.world 12 points 2 years ago

The article actually talks about Azure, which was using CrowdStrike internally, so their point is valid, but the headline is absolutely wrong. Azure is nowhere near a monopoly, and the headline ends up implying that Windows, not Azure, was the issue they're describing.

[-] Blaster_M@lemmy.world 6 points 2 years ago

This is the typical Guardian sensationalism. Gotta make it look like it was Microsoft's fault, although this one is squarely on CrowdStrike's head. Imagine if a security update for a remote administration tool caused an on-boot kernel panic on every Linux server in the world...

[-] bolexforsoup@lemmy.blahaj.zone 0 points 2 years ago

Microsoft also had issues yesterday

[-] hangonasecond@lemmy.world 5 points 2 years ago

Microsoft's use of CrowdStrike meant that a significant number of their cloud and SaaS offerings also failed, impacting users who likely didn't know what CrowdStrike was.

[-] EtherWhack@lemmy.world -2 points 2 years ago

Only systems running CrowdStrike were affected, but all systems were Windows-based as that is the only OS it works with.

I think it's touching more on the vulnerability of infrastructure when a large portion of it runs on only one OS. That's something a lot of us here may realize, but the general public has never really understood: a scenario like this can cause a widespread blackout, all from a single bug, whether in popular software or in the OS itself.

[-] ImADifferentBird@lemmy.blahaj.zone 8 points 2 years ago* (last edited 2 years ago)

That's not correct. Crowdstrike does also work with Mac and Linux, but this particular incident only impacted the Windows sensor.

They actually had a similar issue with the Linux sensor a couple of months ago, which... doesn't speak well of their update process.

[-] TrickDacy@lemmy.world -2 points 2 years ago

This is the extremely important akshually line anyway. Let's all pretend that every OS is just as shitty because it lets us correct others on the Internet constantly

[-] FenrirIII@lemmy.world -3 points 2 years ago

Another hivemind circlejerk

[-] catloaf@lemm.ee 12 points 2 years ago

What monopolization? Crowdstrike has plenty of competitors.

[-] FlyingSquid@lemmy.world 12 points 2 years ago

So does Google. And yet they're still being taken to court for monopolistic practices by the U.S. Department of Justice.

Monopoly doesn't mean literally no competitors in the real world. It means no competitors worth noting because everyone has been corralled into using a single company.

[-] sandalbucket@lemmy.world 8 points 2 years ago

Crowdstrike is big, but not that big.

About half of my clients use them; and of those, about a third are halfway through ripping them out in favor of MS defender.

(MS is definitely “that big”)

[-] catloaf@lemm.ee 0 points 2 years ago

"It means no competitors worth noting because everyone has been corralled into using a single company."

Right. That's not the case here. Crowdstrike competes with a dozen other EDR products. Using the number of ratings as a proxy for popularity, they're not even the most popular.

[-] FlyingSquid@lemmy.world 8 points 2 years ago

That dozen don't seem to add up to much together considering the massive global nature of this outage.

[-] catloaf@lemm.ee 4 points 2 years ago

Because "we didn't crash" doesn't make the news. My company wasn't affected, so nobody cared about us.

[-] FlyingSquid@lemmy.world 2 points 2 years ago

That's great that your company wasn't affected.

I hope no one was trying to fly a plane to get to your company yesterday.

[-] Marduk73@sh.itjust.works 4 points 2 years ago

The title reminds me of the Microsoft monopoly issue ages ago. They had to bail out Apple so MS could have competition.

[-] highduc@lemmy.ml 3 points 2 years ago

It's a shitty state of affairs but I bet nothing's gonna change.

[-] sircac@lemmy.world 3 points 2 years ago* (last edited 2 years ago)

Insert Nicolas Cage “You don’t say?” meme here

this post was submitted on 20 Jul 2024
355 points (97.8% liked)