I notice you're using a nineteen-point rating scale: 1 to 10 in half-point steps, entered with a slider. You'll get better ratings if you use a more standard scale that's compatible with other sites, and a better method for inputting ratings.
You'll want to link your rating data to other sites (eg, backloggd.com, igdb.com) if you have any hope of this being used, which is why compatibility is valuable. Mapping a 10-point scale to a 19-point scale is a silly wrench to throw in, and how will you translate your users' 19-point scores to push to sites with 10? You need to spare users from entering their scores all over again if this is to survive at all.
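To make the lossiness concrete, here's a minimal sketch (illustrative only, not code from the project) of what exporting a half-step score to a 10-point site forces:

```python
import math

# Purely illustrative: pushing a half-step score (the 19-point scale,
# 1.0-10.0) out to a standard 10-point integer site.
def to_ten_point(score: float) -> int:
    """Map a half-step score to the nearest integer, rounding halves up."""
    if not 1.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    return math.floor(score + 0.5)

# The mapping is lossy: 7.5 and 8.0 both export as 8, so you can never
# reconstruct the original half-step value from the exported score.
assert to_ten_point(7.5) == to_ten_point(8.0) == 8
```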
As to entry, something almost everyone gets wrong: you actually get better data if you present ratings with the right number of points on the scale and use a tiered grouping (visually, not as in requiring a series of questions for a single rating). There's basically a right answer here, and it's 10 points grouped 3-4-3. The grouping helps cognitively because you're basically picking high/mid/low twice instead of analyzing a 10-point spread. People are statistically significantly worse at using a wide, flat rating scale; the two-tier version corrects that and gives you richer, more accurate data, especially if you label the tiers to help reduce individual bias in how people map their feelings to numbers (eg, the modern 6/10=bad syndrome).
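For illustration, a 3-4-3 grouping as data might look like this (the tier labels are just my own picks, not any standard):

```python
# Hypothetical sketch of a 3-4-3 grouped 10-point input: three visually
# separated tiers so the user first picks low/mid/high, then a point
# within the tier.
RATING_TIERS = [
    ("Bad",   (1, 2, 3)),       # low tier, 3 points
    ("Okay",  (4, 5, 6, 7)),    # mid tier, 4 points
    ("Great", (8, 9, 10)),      # high tier, 3 points
]

def tier_of(score: int) -> str:
    """Return the tier label a 1-10 score falls into."""
    for label, points in RATING_TIERS:
        if score in points:
            return label
    raise ValueError(f"score out of range: {score}")
```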
We need better rating analyzers than we have, but it will never work without connecting to other rating systems and processing games outside itch.io. And if you keep your recommendation mechanism under wraps with only manual rating entries, especially limited to itch.io games, you're asking far too much from someone to see if it's potentially relevant to them, both in the sense of effort and the sense of trust ("non-biased, community driven").
I also saw a solution for normalizing every person's scores to combat this bias. It's used in Criticker's system (a movie recommendation site):
https://www.criticker.com/critickers-algorithm-explained/
Currently, I'm investigating whether this is something my recommendation algorithm needs, so maybe I will implement some kind of score normalization on Gamescovery. We will see.
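For reference, per-user normalization usually has this general shape (a z-score sketch; Criticker's article describes a percentile-based variant, so this isn't their exact formula):

```python
from statistics import mean, stdev

# A minimal sketch of per-user normalization using z-scores: re-express
# one user's scores relative to that user's own mean and spread, so it
# shouldn't matter which part of the scale they habitually use.
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Re-express one user's scores as deviations from their own mean."""
    values = list(scores.values())
    mu = mean(values)
    sigma = stdev(values) if len(values) > 1 else 0.0
    if sigma == 0:
        return {title: 0.0 for title in scores}
    return {title: (s - mu) / sigma for title, s in scores.items()}
```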
Sounds good. Doesn't actually work :/ Sure, if everyone gave a statistically valid data spread covering every rating point, then you could probably normalize them so it doesn't matter what numbers an individual used. But people don't do that. Maybe someone only rates 8-10, but is that because they like everything, because they don't rate anything they didn't like, because they think an 8 is bad, because they just lump everything they don't like into the "8 or below" group, or some other random thing? The system can't know. And what about the obvious fact that most everyone watches more movies they rate good than bad, so ratings have a huge implicit skew in their distribution? It can't know that either, but it scales the ratings anyway, and that's some of why these systems don't really work if you get down to it. The rest is just that their analysis concept is broken.
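To make that concrete, here's a self-contained toy example of the skew problem (made-up numbers):

```python
from statistics import mean, stdev

# A user who only logs things they liked. Normalization turns their 8
# into "well below this user's average," even though nothing tells us
# whether an 8 actually meant "bad" to this person.
picky = {"A": 10.0, "B": 9.0, "C": 9.0, "D": 8.0}
mu, sigma = mean(picky.values()), stdev(picky.values())
for title, score in picky.items():
    print(title, round((score - mu) / sigma, 2))
# A 1.22, B 0.0, C 0.0, D -1.22
```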
I actually use Criticker for my movie ratings, and it doesn't really do me any good (but it'd be a pain to move everything, so I haven't :). Their system still falls prey to the usual issues, just not as obviously as, say, Steam, which basically just always throws the most popular candidate it can shoehorn into a rec. If you have weird taste, you get grouped with rating profiles that happen to agree enough on something but have no real connection to your taste. Eg, if I like some movies everyone likes (and let's face it, we pretty much all have some close-enough-to-universally appreciated likes), my "close rating" users will be people who also liked those movies, and a lot of meaningful stuff becomes noise; but one's taste lives much more in the noise data than in the big obvious strokes. Alternatively, if I watch and like some fringe thing no one sees, suddenly anyone else who did is closer to me, mainly because there's so little data in common between us to go on.
Criticker is convinced I love esoteric foreign drama (I really don't), because I scour deep into horror during part of the year and occasionally find a gem that gets a good rating, often from some dark corner of Asia. They also think my 50 is 77th percentile, probably for the same reason (ie, I do have a lot of low ratings, because I watch things just for being horror). A 50 is where I put "pretty decent/not really that good" stuff, which seems a lot lower than 77th percentile to me, but I can't tell Criticker that because of their "helpful" scaling. After my partner (who watches basically everything I do and has very similar taste), the next-closest TCI (their score for how close your normalized ratings are to someone else's, and the basis for their rating predictions) comes in at thirty. That basically says they're useless, which is more accurate than any given rating prediction they generate for me, with my mere 1,845 ratings to go by ;)
I really think one needs to find and minimize the "common" elements to focus on the uncommon, both in rating analysis and in prediction. Eg, if people tend to like X but I don't, that actually means a lot more than if I also like X. And recommending I rush out to watch The Godfather (thanks again, Criticker, never heard of that one...) doesn't do me any good, because everyone already knows it. It's an "easy" rec, but it's not a good rec.
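If you wanted to encode that, one sketch (my own framing, borrowing inverse-document-frequency weighting from text search; all numbers are made up) would be:

```python
import math

# "Minimize the common, weight the uncommon": scale agreement between
# two users by how rare the item is, so blockbusters everyone has seen
# contribute almost nothing to similarity.
def rarity_weight(num_raters: int, total_users: int) -> float:
    """Rarer items contribute more to similarity; blockbusters almost nothing."""
    return math.log(total_users / max(num_raters, 1))

print(rarity_weight(50, 100_000))      # ~7.6  (agreeing on an obscure film)
print(rarity_weight(90_000, 100_000))  # ~0.11 (agreeing on The Godfather)
```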
If Criticker used the 3-4-3 system for their ratings instead of telling us it will all just work out, that alone would lead me to apply my numbers differently, which is kinda telling about how it would improve their data. I didn't make up the 3-4-3 thing, BTW. I was working on a related web/database project, and it was passed on to me as studied and statistically well-proven to produce better survey results (by someone from an industry that definitely cares about that). It does make a lot of sense, though. It's nice when something has a clear right answer like that... except you get a little frustrated seeing nothing actually use it ;)
I see, thanks for your opinion, I will definitely take it into account. I also find Criticker not that useful, although I have fewer ratings than you (around 400 rated things). And like you, I believe that trying to compare me to other people to find "commons" isn't the best way to do recommendations.
Thanks again for the 3-4-3 system; I started doing my research yesterday, and I'm pretty sure it will end up implemented in Gamescovery by the beta.
Hi, thanks very much for this feedback, love it ❤️
I'd never heard of backloggd.com, but that's good long-term thinking from you: if the project survives to a stage where it has a big userbase, it's a good idea to have compatibility or "plugins" with other trackers and data sources.
I also like your idea about the rating scale; I will definitely think about implementing it or some variation of it.
Yeah, it would be stupid to lock the project to itch.io games only. I started with itch.io and indie games for a few reasons:
In the future, I definitely plan to support all popular PC game stores in the following order:
I'm also thinking about console support, but that will come in the rather distant future.
I have one more question, if you don't mind: what was your impression of the game recommendations after you rated 3-4 games? Did the recommendations lean in a predictable, "correct" direction, or were they completely random and off?
I didn't rate any games, just looked at what it would take and had some quick feedback to offer. Part of the issue with itch.io is that to rate games, you first have to find things on itch.io, as well as find things representative enough that you might see how the recs do. For testing something that isn't going to do much right now, that's a fair bit of trouble, especially since my key interest would be whether recommendations really take taste into account or use one of the usual shortcuts that either lump you into categories or fall prey to the "well, everyone likes X, so X" syndrome. Either of those would take a fair bit of data for me to put in, and a rather surprising amount of data for you to already have at such an early stage.
Hi, I see. Hopefully you'll be willing to participate in further testing (for example, in the beta) when the project is in better shape.
The only reason I'm bringing the current alpha to the public is to test the concept and see if people are interested in it at all. I spent around 1 year (a year of calendar time, not of working hours) making the current alpha, and there's no sense in spending another year on a project nobody actually wants. So far, feedback has been somewhat positive, so I want to continue and see what I build next.
The main idea of my recommendation algorithm is to calculate a unique taste profile for every user. It doesn't and won't compare the tastes of different users to make assumptions. I hate that approach, and those kinds of recommendation algorithms never seem to work for me. When doing my research at the beginning of the project, I found that such algorithms were first used for social media, but I don't feel these algorithms are correct (that's how I feel it; I can't prove it with real numbers for now).
So, hopefully, Gamescovery's recommendation algorithm won't have biases like "well, everyone likes X, so X", since it never tries to compare two or more users. Besides that, Gamescovery will let users tweak the algorithm so they can actually customize it to work better for them. That doesn't mean users will be able to completely change the algorithm's behavior, but rather steer it in the direction they want.
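As a rough illustration of that direction (not the actual Gamescovery algorithm, which isn't public; every name and structure below is an assumption), a minimal content-based taste profile might look like:

```python
from collections import defaultdict

# A minimal content-based sketch of the stated idea: a taste profile
# built purely from one user's own ratings over game tags (no user-to-
# user comparison), plus a per-tag boost the user can tweak.
def taste_profile(ratings: dict[str, float],
                  game_tags: dict[str, list[str]]) -> dict[str, float]:
    """Average the user's mean-centered scores per tag; a value above 0
    means the user tends to like games carrying that tag."""
    center = sum(ratings.values()) / len(ratings)
    totals, counts = defaultdict(float), defaultdict(int)
    for game, score in ratings.items():
        for tag in game_tags.get(game, []):
            totals[tag] += score - center
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}

def predict(tags: list[str], profile: dict[str, float],
            boost: dict[str, float]) -> float:
    """Score an unseen game from the profile; boost is the user-tweakable
    knob that steers individual tags up or down."""
    return sum(profile.get(t, 0.0) * boost.get(t, 1.0) for t in tags)
```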