I got stuck on a lunchtime video conference today with some people in my department and a vendor. I hate meetings at noon, but with timezones being a factor, it was the only available time the vendor could meet. Usually, I'm only on these for technical consultation, so I rarely need to speak other than to clarify a point or answer questions about our infrastructure. Those are usually toward the end of the meeting.
That said, I just muted myself and ate at my desk because I was starving and would have plenty of time to eat a quick bite before it got to the point where I had to say anything.
What I did not realize was that even though I was muted, my webcam was on. So six people I work with plus three vendors all watched me eat a bag of tacos, and no one said anything!
Like, when I eat a taco with people around, I do eat it in a dignified way. Less so when I'm alone (or think I'm alone) and wolfing it down before I have to do something. So, yeah, it was not a pretty sight.
I'm still mortified, but at least I am laughing about it as I'm typing this out.
My day consisted of users complaining about speed. Everything in analytics looked fine, so I checked some random high-demand applications, and they indicated that they were waiting on network IO pretty consistently. So I went to check the file server where the data for those apps is centrally located with no redundancy, and I managed to... turn off the network interface on the file server.
🤦‍♂️
Don't ask me how, I'm still confused about it myself. My manager called me not 5 minutes later, after he got a call from the client. I was sitting there absolutely shitting myself trying to figure out how to turn the network card back on, and every method I tried to use to connect to the system was failing.
Even my manager had some serious trouble trying to figure it out.
Took about 45 minutes until the system was back on the network. Right at the end of the day, on one of the busiest days of the year for that specific customer.
I feel really stupid.
Ooof. I've accidentally done that (though on a less important system), but thankfully was able to get in via iLO console to reset it.
I... didn't have iLO or any IPMI access to the system. It was a cloud VM on Azure. Looking at their management tools, they're all IP-based, which means you need a valid IP connection to the system to control it.
The setting that I think did it was in the advanced network interface settings. I'm not sure which one, but long story short, whatever it was messed up the network interface and basically everything was useless. My manager was doing something, and I pushed an Azure CLI network interface reset. Whatever he was doing, and/or what I did, eventually brought it back online. Luckily, through all of this, the settings I changed were reverted and the system was power cycled, so all evidence was destroyed... unless it logged something to the system logs. IDK.
The programs are very file heavy and most of that weight is on IO, not throughput, so I was tinkering with the buffers, but I can't be sure that the buffer settings were to blame. I'm sure I clicked on more than just buffer settings while I was in the network adapter settings.
I'm still pretty upset about it. Azure really doesn't make it easy to connect to the stupid console. Then again, Hyper-V is mostly the same way, so....
I prefer running another hypervisor technology, but we're pretty heavily invested in Azure at my workplace, so I'm not sure that even suggesting it will get any traction.
The part that annoys me is that everything is built glass cannon style. Get big, fast systems in Azure, throw everything into those few systems and run it. It only takes one person running Prime95 for fun and profit(?) to wreck the ability for anyone to do meaningful work.
Glass cannon. Azure isn't a golden ticket that can handle all the workloads with a minimal number of virtual machines, but we're committed to a server-less architecture, even when the client would be better off with local, or colo, given their relative size. Like a quarter rack in a nearby colo, and a couple of hyperconverged systems and they would be very well served. It wouldn't be very different from what they're doing right now, either functionally or physically (or even cost-wise), but they would get a lot more out of it.
Suggesting it would be a lot like talking to a wall.