Good Retry, Bad Retry: An Incident Story
(medium.com)
Everything you're saying seems correct. I think this is more of a meta discussion about how (in this case) retries, even with exponential backoff, aren't a solution by themselves once you look at the system as a whole. Every common solution has interesting hidden caveats, and this is one I personally wasn't aware of.
Practically, adding a timeout budget so that the clients themselves just error out (forcing a manual refresh) sorta accomplishes the same thing you're proposing.
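To make the timeout budget idea concrete, here's a rough Python sketch. It isn't taken from the article: `call_with_budget`, `BudgetExceeded`, and all the default numbers are names and values I made up for illustration. The assumption is that the client retries with jittered exponential backoff, but only while an overall deadline has not passed; once the budget is spent it errors out instead of adding more load.

```python
import random
import time


class BudgetExceeded(Exception):
    """Hypothetical error: the overall time budget ran out before success."""


def call_with_budget(op, budget_s=2.0, base_delay_s=0.1, max_delay_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Retry op() with jittered exponential backoff under one shared deadline.

    All parameter names and defaults are illustrative assumptions.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + budget_s
    attempt = 0
    while True:
        try:
            return op()
        except Exception as exc:
            last_error = exc  # keep the cause for the final error
        # "Full jitter": pick a random delay up to the capped exponential step.
        cap = min(max_delay_s, base_delay_s * 2 ** attempt)
        delay = random.uniform(0, cap)
        attempt += 1
        # Stop if the next wait would reach or pass the deadline.
        if clock() + delay >= deadline:
            raise BudgetExceeded(
                f"gave up after {attempt} attempt(s)") from last_error
        sleep(delay)
```

The caller catches `BudgetExceeded` and shows an error (the "manual refresh" case) rather than retrying again. Note that the budget only limits how long a single client keeps trying; it doesn't cap the total retry traffic across all clients.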