I've spent the past day working on my newest acquisition, a PowerEdge R620, and trying to nail down which startup checks I can do without. Google has shown me that everyone seems to be having similar issues regardless of brand or model. Gone are the days when a rack server could be fully booted in 90 seconds. A big part of my frustration comes when USB memory sticks are inserted to update firmware before I put this machine in production: that easily drives boot times up to 15-20 minutes just to reach the point where I find out whether I have the right combination of BIOS/UEFI boot parameters for each individual drive image.
I currently have this machine down to 6:15 before it starts booting the OS, and a good deal of that time is spent sitting here watching it at the beginning, where it says it's testing memory but in fact hasn't actually started that process yet. It's a mystery what exactly it's even doing.
At this point I've turned off the lifecycle controller's scan for new hardware, disabled booting from the internal SATA and PCI ports and from the NICs, and disabled memory testing... and I've run out of leads. I don't really see any other options for turning off sensors and such. It's going to be a fixed server running a bunch of VMs, so there's no need for additional cards (although some day I may increase the RAM), and I don't really need it to scan for future changes at every boot.
Anyway, this all got me thinking... it might be fun to compare notes and see what others have done to improve their boot times, especially if you're also balancing your power usage (since I've read that allowing full CPU power during POST can have a small effect on the time). I'm sure different brands will have different specific techniques, but maybe there are some common areas we can all take advantage of? And sure, ideally our machines would never need to reboot, but many people run machines at home only while they're being used and deal with this issue daily, or want to get back online as quickly as possible after a power outage, so anything helps...
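For anyone who wants to compare notes with actual numbers, here's a rough sketch for splitting a boot into phases. It assumes a Linux install with systemd, where `systemd-analyze` prints a "Startup finished in ..." line; the firmware figure only shows up on some UEFI setups. The helper names and the sample line are made up for illustration (the firmware value is just the 6:15 from above), so treat it as a starting point rather than anything authoritative.

```python
import re
import subprocess

# Seconds per unit, for the unit suffixes that appear in systemd-analyze output.
UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001, "us": 1e-6}

def parse_span(text):
    """Convert a span such as '6min 15.000s' into seconds."""
    total = 0.0
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)(h|min|ms|us|s)\b", text):
        total += float(value) * UNIT_SECONDS[unit]
    return total

def parse_startup(line):
    """Return {phase_name: seconds} from a 'Startup finished in ...' line."""
    pattern = r"((?:\d+(?:\.\d+)?(?:h|min|ms|us|s)\s+)+)\((\w+)\)"
    return {name: parse_span(span) for span, name in re.findall(pattern, line)}

def read_startup_line():
    """Run systemd-analyze on the local machine (not called automatically here)."""
    result = subprocess.run(["systemd-analyze"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]

# Illustrative sample only, not a measurement from any real machine.
SAMPLE = ("Startup finished in 6min 15.000s (firmware) + 2.105s (loader) "
          "+ 3.512s (kernel) + 12.044s (userspace) = 6min 32.661s")

phases = parse_startup(SAMPLE)
total = sum(phases.values())
for name, seconds in phases.items():
    print(f"{name:10s} {seconds:8.3f}s  {100 * seconds / total:5.1f}%")
```

On a real box you'd swap `SAMPLE` for `read_startup_line()`. Posting the per-phase split would make it easier to tell whether a given tweak shaved time off POST or off the OS side.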
I concur and it just gets worse the more hardware you have in them. 256G of memory and 24 disks? Might as well go have lunch while it boots.
Damn, are all 24 disks internal? That's some rig! I have the hardware on my latest NAS to connect up to 56 drives in hot-swap bays, and at one point while migrating data to the new drives I had 27 active units. Now that I've cleaned it up I'm only running 17 drives, but it still seems like quite a stack.
Yea they're internal. That's normal for a fully loaded 2u storage server. Some even have 2-4 extra disk slots in the rear to cram in a few more.
Wow that's packing a lot in 2u. I've only ever had 1u servers so eight 2.5" slots is a lot for these.
To be fair, that’s for something like the R720xd, which drops the disk drive and tape drive slots to fit an extra 8 disks in the front. I have a regular ole R720 and it only has 16 bays. I didn’t need that many bays, and wanted better thermals for the GPU in it.
Edit: and I went with the 2U because it’s so much quieter.
The 2u (R720) is quieter than the 1u (R620)? Or quieter than the R720xd?
Unfortunately the 720 wouldn't have worked for me as the majority of my drives are 3.5" (8x18TB + 5x6TB). What I ended up doing is designing a 3D-printed 16-drive rack using some cheap SATA2 backplanes. Speed tests showed the HDDs were slower than SATA2 anyway, so despite the apparent hardware limitation I actually still clock around 460MB/s transfer rates from those arrays. Then I use the internal 2.5" slots for SSDs. Seems to be working a hell of a lot better than my previous server (a PE 1950 which only had a PCIx 4x slot and topped out at about 75MB/s).
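As a rough sanity check on the SATA2 point, here's the arithmetic as a small sketch. It assumes each drive sits on its own 3 Gbit/s SATA2 link (how the backplanes are actually wired isn't stated) and uses the standard 8b/10b encoding overhead; the variable names are just for illustration.

```python
# SATA2 signals at 3 Gbit/s, and 8b/10b encoding carries 8 payload bits per 10 line bits.
SATA2_LINE_RATE_BITS_PER_S = 3_000_000_000
payload_bits_per_s = SATA2_LINE_RATE_BITS_PER_S * 8 // 10
sata2_mb_per_s = payload_bits_per_s // 8 / 1_000_000  # per-link ceiling in MB/s

# Figures quoted in the post above.
array_mb_per_s = 460        # measured transfer rate from the new arrays
old_server_mb_per_s = 75    # ceiling on the previous PE 1950
raw_capacity_tb = 8 * 18 + 5 * 6  # 8x18TB + 5x6TB of 3.5" drives

speedup = array_mb_per_s / old_server_mb_per_s

print(f"SATA2 per-link ceiling: {sata2_mb_per_s:.0f} MB/s")
print(f"Array throughput:       {array_mb_per_s} MB/s ({speedup:.1f}x the old server)")
print(f"Raw 3.5\" capacity:      {raw_capacity_tb} TB")
```

Under that per-drive-link assumption, the aggregate can go past a single link's ceiling because the drives transfer in parallel, which fits with each HDD on its own staying below what one SATA2 link can carry.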
And beyond the UEFI/boot stuff, it takes 10 minutes just for my ZFS pool to mount.