Maybe this is just my phone (and laptop), but volume control is irritating when some tracks are configured so that I need to set the volume to 70-80% and some tracks are so "naturally loud" that the lowest setting (5% ish for my phone) is distractingly loud.

On some of my tracks (especially the classical ones), I need to change the volume from 20% to 80% within the same track, depending on which part I am listening to, if I want to hear everything without killing my eardrums.

I get that it would be difficult to do anything about this for streaming or live audio since the phone doesn't know in advance what the input will be, but for a pre-recorded mp3 file, couldn't my phone do some digital signal processing?
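For what it's worth, here is a minimal Python sketch of the kind of processing that is possible when the whole file is known in advance. It assumes the audio is already decoded into a list of float samples between -1.0 and 1.0. The function names, the `target_rms` value, and the `threshold`/`ratio` defaults are made up for illustration. `normalize_rms` evens out loudness between tracks, and `compress` is a very crude per-sample compressor for loud passages within a track (real compressors smooth the level over time).

```python
import math

def rms(samples):
    """Root-mean-square level of a list of float samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_rms=0.1):
    """Scale the whole track by a single gain so its RMS matches target_rms.

    target_rms=0.1 is an arbitrary illustrative value, not a standard.
    """
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_rms / level
    # Clamp so scaled peaks never exceed full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

def compress(samples, threshold=0.25, ratio=4.0):
    """Crude compressor: shrink the part of each sample above the threshold."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(math.copysign(mag, s))
    return out

# Two synthetic one-second "tracks": one very quiet, one very loud.
quiet_track = [0.01 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
loud_track = [0.9 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

# After normalization both sit at the same RMS level.
quiet_fixed = normalize_rms(quiet_track)
loud_fixed = normalize_rms(loud_track)
```

Players that support loudness normalization do something along these lines, usually with a perceptual loudness measure rather than plain RMS.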

Do I just have terrible electronics, or is this an issue other people experience too? Or is this problem just harder to solve than I am expecting?

[-] The_Grinch@hexbear.net 3 points 3 months ago

I have a semi-related question if you don't mind. People often complain about the voice tracks in movies being hard to hear, especially if you don't have a speaker for the center channel (but even then I have trouble).

Why haven't they solved this problem by packaging the voice track separately on the Blu-ray/stream so you can turn up the volume of the voices only, without blowing your ears out when the music hits?

[-] dizzy@lemmy.ml 4 points 3 months ago

I don't know why they don't. I work in music rather than TV/film, but it infuriates me too! Give me a voice volume control! It would be technically very easy to implement as a standard, but the powers that be just haven't come together and done it!

[-] The_Grinch@hexbear.net 3 points 3 months ago

I'm glad to hear I'm not the only one thinking it!

Do you think it could be done by diffing a few of the different language tracks?

[-] dizzy@lemmy.ml 3 points 3 months ago* (last edited 3 months ago)

Unfortunately no, audio files are actually really dumb in that they’re basically just a file of 44100 (or 48000 or 96000 etc) amplitude numbers per second.

So there’s nothing really to diff because it’s basically just a squiggly line, set of squiggly lines or, when compressed, a mathematical expression that when decompressed, recreates a squiggly line.
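To make the "just a list of numbers" point concrete, here is a small Python sketch. It uses sine tones as stand-ins for a music/effects bed and two dubs (all names and frequencies are made up). It also assumes, very optimistically, that both language tracks share exactly the same bed, sample for sample. Even then, subtracting one track from the other leaves one dub minus the other, not a clean dialog track.

```python
import math

SAMPLE_RATE = 44100  # amplitude values per second

def tone(freq_hz, seconds, amplitude=0.2, sample_rate=SAMPLE_RATE):
    """A sine tone as a plain list of amplitude numbers."""
    count = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

def mix(a, b):
    """Mixing two signals is just adding them sample by sample."""
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    """'Diffing' two signals is sample-by-sample subtraction."""
    return [x - y for x, y in zip(a, b)]

# Stand-ins: shared music/effects bed plus two different dubs.
bed = tone(110, 1.0)
voice_en = tone(300, 1.0)
voice_fr = tone(410, 1.0)

track_en = mix(bed, voice_en)
track_fr = mix(bed, voice_fr)

print(len(track_en))  # 44100 numbers for one second of mono audio

# The bed cancels, but what remains is voice_en minus voice_fr:
# both dubs tangled together, not either one on its own.
diff = subtract(track_en, track_fr)
```

In real releases the beds usually aren't sample-identical across languages either, so even the cancellation step is unlikely to work cleanly.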

You could isolate the dialog if you got ahold of a version with no dialog at all, inverted the polarity of that, and summed it with the original, but it’s unlikely you’ll find a version without any vocals.
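A minimal sketch of that polarity trick in Python, with tiny made-up sample lists standing in for real audio. It assumes the no-dialog version is sample-aligned with the original and otherwise identical to it, which is the hard part in practice.

```python
def invert_polarity(samples):
    """Flip the sign of every sample."""
    return [-s for s in samples]

def mix(a, b):
    """Sum two signals sample by sample."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical data: a music/effects-only version and the dialog on its own.
music_and_effects = [0.30, -0.20, 0.10, 0.00]
dialog = [0.05, 0.10, -0.05, 0.20]

# The normal release: everything mixed together.
full_mix = mix(music_and_effects, dialog)

# Summing the full mix with the inverted no-dialog version cancels
# the music and effects and leaves (approximately) just the dialog.
isolated = mix(full_mix, invert_polarity(music_and_effects))
```

Any difference between the two versions (different encode, mastering, or timing) stops the cancellation from being complete.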

Machine learning vocal isolation tools are probably going to be the best way to go about it as a DIY approach. Ultimate Vocal Remover 5 with the demucs 4 algo is great FOSS software to extract vocals and you could sum that with the original track and adjust the gain to get louder dialogue… it would be a lot of work though…
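The summing and gain step could look roughly like this in Python. It assumes a separation tool has already produced a `vocals` stem that is sample-aligned with the original. The sample lists, function names, and the 6 dB default are made up for illustration.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (gain_db / 20)

def boost_dialog(original, vocals, boost_db=6.0):
    """Make the dialog louder by adding extra copies of the vocal stem.

    The original already contains the vocals once, so only the
    additional amount (factor - 1) is added on top.
    """
    extra = db_to_linear(boost_db) - 1.0
    mixed = [o + extra * v for o, v in zip(original, vocals)]
    # Scale everything down if the boosted mix would clip.
    peak = max(abs(s) for s in mixed)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed

# Hypothetical data: a music bed and an extracted vocal stem.
bed = [0.10, -0.10, 0.05]
vocals = [0.02, 0.00, -0.02]
original = [b + v for b, v in zip(bed, vocals)]

# +20 dB is a factor of 10, so the dialog ends up 10x its original level.
louder = boost_dialog(original, vocals, boost_db=20.0)
```

Extracted stems are never perfect, so some artifacts from the separation get boosted along with the dialog.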

[-] The_Grinch@hexbear.net 2 points 3 months ago

I still don't really understand, but thanks for trying all the same. o7

this post was submitted on 24 Aug 2025
15 points (100.0% liked)

Asklemmy
