Try the IQ4_XS quant of Mistral Nemo.
If you want a more roleplay-focused model with more creativity, at the cost of other things, you can try the ArliAI finetune of Nemo.
If you want the model to remember more of a long conversation, you need to bump its context size up. Context takes memory too, so to make room you can offload fewer layers to the GPU, go down a quant, or go to a smaller model like Llama 8B.
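
For a rough idea of what extra context costs in memory, here is a back-of-the-envelope sketch. It assumes an fp16 KV cache and architecture numbers I believe are right for Mistral Nemo (40 layers, 8 KV heads, head dim 128) and Llama 3 8B (32 layers, 8 KV heads, head dim 128), so check each model's config before relying on them. The function name `kv_cache_bytes` is made up for illustration, and the estimate ignores compute buffers and other overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Approximate KV cache size: one K and one V vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


GIB = 1024 ** 3

# Architecture numbers are assumptions; verify against each model's config.
models = {
    "Mistral Nemo 12B": dict(n_layers=40, n_kv_heads=8, head_dim=128),
    "Llama 3 8B": dict(n_layers=32, n_kv_heads=8, head_dim=128),
}

for name, arch in models.items():
    for n_ctx in (8192, 16384, 32768):
        size = kv_cache_bytes(n_ctx=n_ctx, **arch)
        print(f"{name:>18} ctx={n_ctx:>6}: {size / GIB:.2f} GiB")
```

Under those assumptions, Nemo at 16k context needs about 2.5 GiB for the cache on top of the weights, versus about 2 GiB for the 8B model. That is roughly why you end up giving up GPU layers or a quant level to fit a bigger context.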