How often does branchless programming actually matter?
(programming.dev)
Not that much. It's useful when you have a very hot loop where unpredictable branches make the branch predictor guess wrong, so the CPU has to roll back the speculative work it already did and start over.
This StackOverflow question explains it fairly well: https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-processing-an-unsorted-array
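For anyone who hasn't seen the technique, here is a minimal sketch of the loop from that question, written in Go. The function names are made up for illustration, and the mask trick assumes the values are small enough that `v - 128` cannot overflow. Whether the branchless version is actually faster depends on the data, the CPU, and what the compiler already does with the plain `if`, so treat it as something to measure rather than a guaranteed win.

```go
package main

import "fmt"

// sumBranchy adds up every value >= 128 using an ordinary if statement.
// On random data the branch is hard to predict; on sorted data it is easy.
func sumBranchy(data []int64) int64 {
	var sum int64
	for _, v := range data {
		if v >= 128 {
			sum += v
		}
	}
	return sum
}

// sumBranchless computes the same sum without a conditional jump in the
// loop body. It assumes v - 128 does not overflow int64.
func sumBranchless(data []int64) int64 {
	var sum int64
	for _, v := range data {
		// Arithmetic shift of a signed value: mask is -1 (all ones)
		// when v < 128 and 0 otherwise.
		mask := (v - 128) >> 63
		// &^ is Go's "AND NOT": clears v entirely when mask is all ones.
		sum += v &^ mask
	}
	return sum
}

func main() {
	data := []int64{200, 5, 128, 127, 255, 0}
	fmt.Println(sumBranchy(data), sumBranchless(data)) // prints "583 583"
}
```

It also shows the readability cost people complain about: the second loop needs two comments to explain what the first one says in plain code.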
I understand the principles, how branch prediction works, and why optimizing to help out the predictor can help. My question is more: how often does that actually matter to the average developer? Unless you're a developer on numpy, gonum, cryptography, digital signal processing, etc., how often do you have a hot loop that can be optimized with branchless programming techniques? I think my career has been pretty average in terms of the projects I've worked on and I can't think of a single time I've been in that situation.
I'm also generally aggravated at what skills the software industry thinks are important. I would not be surprised to hear about branchless programming questions showing up in interviews, but those skills (and algorithm design in general) are irrelevant to 99% of development and 99% of developers in my experience. The skills that actually matter (in my experience) are problem solving, debugging, reading code, and soft skills. And being able to write code of course, but that almost seems secondary.
I've never had to care about it in 16 years of coding. I've also seen a few absolutely horrifying code designs in the name of being branchless. Code readability is often way more important than eking every bit of compute out of a CPU. And it gets into a domain where architecture matters too: if you're coding for a microcontroller or some low-power embedded ARM processor, many of those have no branch predictor at all (or only a very simple one), so it's largely a waste of time.
I'd say, being able to identify bottlenecks is what really matters, because it's what will eventually lead you to the hot loop you'll want to optimize.
But the overwhelming majority of software is not CPU bound, it's IO bound. And if it is CPU bound, it's relatively rare that you can't just add more CPUs to it.
I do get your concern however, these interview questions are the plague and usually asked by companies with zero need for it. Personally I pass on any job interview that requires some LeetCode exercises. I know my value and my value isn't remembering CS exercises from 10 years ago. I'll absolutely unfuck your webserver or data breach at 3am though. Frontend, backend, Linux servers, cloud infrastructure, databases, you name it, I can handle it no problem.
This. 100% this. The only thing more important than readability is whether it actually works. If you can't read it, you can't maintain it. The only exception is throwaway scripts I'm only going to use a few times. My problem is that what I find readable and what the other developers find readable are not the same.
I love Go. I can modify a program to activate the built-in profiler, or throw the code in a benchmark function and use the tool chain to profile it, then have it render a flame graph that shows me exactly where the CPU is spending its time and/or what calls are allocating. It makes it so easy (most of the time) to identify bottlenecks.
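As a rough sketch of the "modify a program to activate the built-in profiler" route, the snippet below wraps a piece of work in `runtime/pprof`'s CPU profiler and writes the result to a file. `busyWork`, `captureCPUProfile`, and the `cpu.prof` filename are placeholders invented for this example, not anything from a real project.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime/pprof"
)

// sink keeps the result of busyWork alive so the loop is not optimized away.
var sink int

// busyWork is a stand-in for whatever code you suspect is the bottleneck.
func busyWork() {
	total := 0
	for i := 0; i < 50_000_000; i++ {
		total += i % 7
	}
	sink = total
}

// captureCPUProfile runs work with the CPU profiler enabled and returns
// the raw profile data in the format that `go tool pprof` reads.
func captureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	data, err := captureCPUProfile(busyWork)
	if err != nil {
		fmt.Fprintln(os.Stderr, "profiling failed:", err)
		os.Exit(1)
	}
	if err := os.WriteFile("cpu.prof", data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "could not write profile:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to cpu.prof\n", len(data))
}
```

Running `go tool pprof -http=:8080 cpu.prof` afterwards should open the web UI, which includes a flame graph view. For the benchmark route, `go test -bench=. -cpuprofile=cpu.prof` produces the same kind of file without touching the program itself.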