I tried pausing and reading, and their "arguments" were so yes-man-like that they didn't seem to want to debate or lean one way or the other. They were basically saying that it depends on context, or that it could be seen either way. Which is fine, but meaningless when the goal is to come up with an answer. Any question can be answered with "it depends" without actually answering it in a satisfying way.
I think it would make more sense to either use an odd number of LLMs, or let them abstain if they are undecided, to try to force them to come up with a clear-cut answer.
Then there is also the issue of swarm intelligence, which does not get used here at all, because it only works if the voters DO NOT discuss their thinking before the vote and thus influence each other. One LLM could be confidently wrong, but because they are all such yes-men, the strongest, most confident-sounding voice might outweigh the correct "thinking".
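To make the voting idea concrete, here is a rough Python sketch of what I mean: each voter is asked separately and never sees another voter's answer, undecided voters can abstain, and a tie counts as "no answer" rather than being papered over. Everything in it is hypothetical (the `independent_vote` name, the `"abstain"` label, and the voters being plain callables standing in for LLM calls), so treat it as an illustration, not as how the setup being discussed actually works.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

ABSTAIN = "abstain"  # hypothetical label a voter returns when undecided


def independent_vote(
    question: str, voters: Iterable[Callable[[str], str]]
) -> Optional[str]:
    """Ask each voter separately; return the plurality winner.

    Returns None on a tie or when every voter abstains.
    """
    # Each voter sees only the question, never another voter's answer.
    ballots = [voter(question).strip().lower() for voter in voters]
    # Abstentions are dropped so they cannot tip the result either way.
    counts = Counter(b for b in ballots if b != ABSTAIN)
    if not counts:
        return None
    ranked = counts.most_common(2)
    # A tie between the top two options means there is no clear answer.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Stub voters standing in for real LLM calls (assumption for the example).
voters = [lambda q: "yes", lambda q: "no", lambda q: "yes", lambda q: "abstain"]
print(independent_vote("Is X true?", voters))  # yes
```

With an odd number of non-abstaining voters on a yes/no question a tie cannot happen, which is the point of the odd-number suggestion above.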
So yeah, this seems like a bad approach to a really interesting problem.
Here are some interesting reads on this topic: