this post was submitted on 14 Feb 2024
546 points (97.4% liked)
Even ignoring privacy arguments, I think that voice control is a great use case for running services locally - lower latency from not having to upload your audio sample, plus the option of having it learn your accent, is very attractive.
That said, voice control is irritatingly error-prone and seems to be slower than just reaching for the remote control. I agree that automatic stuff would be best, but some stuff you can't have rules for.
Something that would be interesting is a more eye- and gesture-based system: I'm thinking something like you look at the camera and slice across your throat for stop or squeeze fingers together to reduce volume. This is definitely one to run locally, for privacy and performance reasons.
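The gesture-to-command mapping described above could be sketched as a simple dispatcher. This is a minimal, hypothetical sketch: it assumes some upstream recognizer (e.g. a local vision model) emits gesture labels like `"throat_slice"` or `"finger_pinch"`; those names, the `GestureController` class, and the handlers are all made up for illustration.

```python
from typing import Callable, Dict

class GestureController:
    """Maps recognized gesture labels to playback commands."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}

    def on(self, gesture: str, handler: Callable[[], str]) -> None:
        """Register a handler for a gesture label."""
        self._handlers[gesture] = handler

    def dispatch(self, gesture: str) -> str:
        # Unknown gestures are ignored rather than raising,
        # since real recognizers are noisy.
        handler = self._handlers.get(gesture)
        return handler() if handler else "ignored"

# Toy playback state for the example.
player_volume = 50

def stop() -> str:
    return "stopped"

def volume_down() -> str:
    global player_volume
    player_volume = max(0, player_volume - 10)
    return f"volume={player_volume}"

controller = GestureController()
controller.on("throat_slice", stop)         # slice across throat -> stop
controller.on("finger_pinch", volume_down)  # squeeze fingers -> lower volume

print(controller.dispatch("throat_slice"))  # stopped
print(controller.dispatch("finger_pinch"))  # volume=40
print(controller.dispatch("wave"))          # ignored
```

The hard part, of course, is the recognizer itself, not the dispatch; but keeping the mapping this explicit makes it easy to run entirely on-device.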
Assistive technology has been focused on this for a while.
My brother had severe cerebral palsy and for years (80s-90s) communicated via analog technology, a literal alpha/iconography communication board, which he could tap on with a head wand. By 2000 he had a digital voice, but still had to use a wand.
Stephen Hawking demonstrated eye-sensing technology almost as soon as it was invented, and that was over a decade ago.
In most cases there is a definite aspect of "bespokeness" to implementing assistive consumer communication technology, but the barriers to implementing the same for an able-bodied audience would appear much lower.
But where do you put the camera? If you're sitting in front of the TV, then near the TV makes sense. What if you're sitting facing a different direction with a book though? What if your hands are full?
A camera based system would be much more limited, and probably wouldn't work in the dark.
You're assuming that we can't have both. Why not have it as a complementary input?
I think looking at a device and talking is better than saying hey $brandname before everything, but having both would be better still.