The PS3 most certainly had a separate GPU - the RSX, which was based on the GeForce 7800 GTX. Console GPUs tend to be a little faster than their desktop equivalents, as the CPU and GPU can get at each other's memory. Rather than the CPU having to send e.g. model updates across a bus to update what the GPU is going to draw in the next frame, it can change the values directly in the GPU memory. And of course, the CPU side can get at the GPU framebuffer and make tweaks to it - that's incredibly slow on desktop PCs, but console games can do things like tone mapping whenever they like, and it's been a big problem for the RPCS3 developers to make that kind of thing run quickly.
The Cell cores (the SPEs) are a bit more like the 'tensor' cores that you'd get on an AI GPU than a full-blown CPU core. They can't address main RAM directly - each one works out of its own small local memory, though they can exchange data between themselves. Data has to be explicitly copied in and out of them, and any jobs that run on them have to be explicitly scheduled; they don't just pick up work like a normal core would. They're also a lot more limited in what they can do than a main CPU core, but they are very, very fast at what they can do.
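As a rough illustration of that copy-in, compute, copy-out model, here's a toy Python sketch. Everything in it (`LOCAL_STORE_WORDS`, `ToyCore`, `run_job`) is made up for illustration; it isn't the Cell SDK or anything like real SPE code, it just shows a core that can only touch its own small buffer while the host slices the work up to fit.

```python
LOCAL_STORE_WORDS = 8  # hypothetical tiny capacity, standing in for the core's small private memory


class ToyCore:
    """A pretend co-processor core that can only see its own local buffer."""

    def __init__(self):
        self.local = []

    def copy_in(self, chunk):
        # Data has to be moved in explicitly, and it has to fit.
        if len(chunk) > LOCAL_STORE_WORDS:
            raise ValueError("chunk does not fit in local store")
        self.local = list(chunk)

    def compute(self):
        # The repetitive maths: works purely on whatever is in the local buffer.
        self.local = [x * x + 1 for x in self.local]

    def copy_out(self):
        # Results have to be moved back out explicitly as well.
        return list(self.local)


def run_job(main_memory):
    """Host side: slice main memory into chunks that fit, feed them through the core."""
    core = ToyCore()
    result = []
    for start in range(0, len(main_memory), LOCAL_STORE_WORDS):
        core.copy_in(main_memory[start:start + LOCAL_STORE_WORDS])
        core.compute()
        result.extend(core.copy_out())
    return result


if __name__ == "__main__":
    print(run_job(list(range(20))))
```

The point of the sketch is the shape of `run_job`: the host spends its time chopping data up and shuttling it around, which is exactly the bookkeeping that a normal CPU core with a cache never makes you think about.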
If you are doing the kind of calculations where you've got a small amount of data that needs a lot of repetitive maths done on it, they're ideal. Bitcoin mining or crypto breaking, for instance - set them up, let them go, check in on them occasionally. The main CPU acts as an orchestrator, keeping all the Cell cores filled up with work to do and processing the end results. But if that's not what you're trying to do, then they're borderline useless, and that's a problem for the PS3, because most of its processing power is tied up in those cores.
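The orchestrator pattern looks roughly like this in Python. It's only a sketch under some loud assumptions: `grind`, `orchestrate` and `NUM_WORKERS` are hypothetical names, the maths is an arbitrary number-crunching loop rather than real mining, and Python threads stand in for the worker cores without giving you any real parallel speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 6  # assumption: just a number picked for illustration


def grind(seed, rounds=1000):
    """Small input, lots of repetitive integer maths, small output."""
    value = seed
    for _ in range(rounds):
        value = (value * 1103515245 + 12345) % 2**31
    return value


def orchestrate(seeds):
    """Hand every seed to the worker pool, then process the end results."""
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        # pool.map keeps the workers fed and returns results in input order.
        results = list(pool.map(grind, seeds))
    # Stand-in for 'processing the end results': keep the smallest value found.
    return min(results)


if __name__ == "__main__":
    print(orchestrate(range(100)))
```

This works nicely because every `grind` call is independent and needs almost no data. The moment jobs need to share state or chase pointers around main memory, the pattern stops fitting, which is the problem described above.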
Some games have a somewhat predictable workload where offloading makes sense. Got some particle effects - some smoke where you need to do some complicated fluid-and-gravity simulations before copying the end result to the GPU? Maybe your main villain has a very dramatic cape that they like to twirl, and you need to run the simulation on that separately from everything else that you're doing? Problem is, working out what you can and can't offload is a massive pain in the ass. It requires a lot of developer time to optimise, when really you'd want the design team implementing that kind of thing. And slightly newer GPUs are a lot more programmable, and can do the simpler versions of that kind of calculation both faster and much more in parallel.
The Cell processor turned out to be an evolutionary dead end. The resources needed to work on it (expensive developer time) just didn't really make sense for a gaming machine. The things that it was better at are things that it just wasn't quite good enough at - modern GPUs are Bitcoin monsters, far exceeding what the Cell can do, and if you're really serious about crypto breaking then you probably have your own ASICs. Lots of identical, fast CPU cores are what developers want to work on - they're much easier to reason about.
So what you're saying is that Cell 2 is gonna bring back cool fluid and cloth simulation