• Mic_Check_One_Two@reddthat.com
    2 days ago

    To me it’s much more unclear how sound is first encoded into a digital signal, transmitted as a digital signal through wires and radio waves, and then translated back into sound in a phone. I mean it’s essentially the same physics as the analog electronics, just with a bunch of extra steps added.

    Yeah, this is where sample rate and bit depth come into play. In case you’re curious, digital audio is possible due to the Nyquist-Shannon Sampling Theorem. The TL;DR is that you don’t record a continuous stream of audio data; you just sample the wave at regular intervals by recording the current amplitude, and then you recreate it on the other end. The theorem states that an analog wave can be perfectly recorded and replicated as long as it contains no frequencies above half the sample rate. (Bit depth is a separate thing: it controls how precisely each sample is stored, not which frequencies you can capture.) Since human hearing generally tops out at 20kHz, we need to sample the audio signal a bit more than 40k times per second; most consumer-grade audio equipment uses 44.1 or 48kHz. Phones actually use a much lower sample rate for calls, but more on that later.
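
    To make "sample the wave at regular intervals" concrete, here is a minimal sketch in Python. The function name `sample_sine` and the 1kHz test tone are made up for illustration; real recording hardware does this with an analog-to-digital converter, not a math function.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Record the amplitude of a sine wave at regular intervals."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# One millisecond of a 1 kHz tone at the common 44.1 kHz sample rate.
samples = sample_sine(1_000, 44_100, 0.001)
print(len(samples))  # number of amplitude readings standing in for the wave
```

    Each entry in `samples` is just "how high was the wave at this instant", which is all the digital side ever stores.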

    Again, as long as your sample rate is more than 2x the highest frequency being recorded, you’re able to perfectly recreate the wave. For an example, here’s a gif:

    The image on the left shows the wave being recorded, and the dots are samples. As you add more samples, the reproduced wave gets more accurate. Once you’re sampling at more than 2x the fastest frequency in the signal, there is only one possible wave (with nothing above that frequency) that will fit every sample. Again, human hearing tops out around 20kHz, so we use a sample rate just above 40kHz.
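
    The flip side is worth seeing too: if the wave contains something faster than half the sample rate, the samples stop being unique. A small sketch, assuming a deliberately low 40kHz sample rate and two made-up test tones (10kHz and 30kHz):

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Sample a cosine wave at regular intervals."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

SAMPLE_RATE = 40_000  # so anything above 20 kHz is "too fast"

low = sample_cosine(10_000, SAMPLE_RATE, 16)   # safely below 20 kHz
high = sample_cosine(30_000, SAMPLE_RATE, 16)  # above 20 kHz

# Both tones land on the same dots, so the recording cannot tell them apart.
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(low, high)
)
print(indistinguishable)
```

    This effect is called aliasing, and it is why recorders filter out everything above half the sample rate before sampling.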

    Phone calls will often put a filter on the high and low ends, and only capture the mid-range. It gives that distinct “this is shitty phone call quality” sound, but means they can use a much lower sample rate, because fewer samples means less data. Since they’re lopping off everything above roughly 3.4kHz with that filter, a traditional call only needs a sample rate of 8kHz (newer “HD Voice” calls use 16kHz). The intelligibility happens in the mid-range, so that’s what the phone makers (and telecom companies) focus on. This low sample rate is also why hold music sounds so fucking awful. It’s essentially being passed through a “make this sound as shitty as possible while still being intelligible” filter.
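
    For a rough sense of how much data this saves, here is some back-of-the-envelope arithmetic. It assumes uncompressed audio, and uses the classic landline format (8,000 samples per second at 8 bits, as in the G.711 standard) against CD audio; the helper name `bitrate_kbps` is just for illustration.

```python
def bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) data rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

phone = bitrate_kbps(8_000, 8)              # classic narrowband call, mono
cd = bitrate_kbps(44_100, 16, channels=2)   # CD audio, stereo

print(phone, cd, cd / phone)
```

    Modern mobile codecs compress well below the raw figure, so treat this as the starting point rather than what actually goes over the air.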

    And then bit depth simply determines how detailed each sample is. If you use 8 bits per sample, that gives you 256 potential values per sample. 12 bits gives you 4096. The trade-off is that a higher bit depth means each sample takes more data: every extra bit doubles the number of possible values, but the size of each sample only grows by that one bit. Audiophiles will generally push for higher bit depths, so each sample is more accurate. In contrast, phone calls often use lower bit depths (again, to save data).
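
    Here is a minimal sketch of what bit depth does to a single sample. It assumes plain evenly spaced levels between -1.0 and 1.0 (real phone codecs space their levels unevenly), and the `quantize` and `dequantize` helpers are hypothetical names for this example.

```python
def quantize(amplitude, bits):
    """Map an amplitude in [-1.0, 1.0] to the nearest of 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    index = round((amplitude + 1.0) / step)
    return max(0, min(levels - 1, index))  # clamp to the valid range

def dequantize(index, bits):
    """Turn a stored level number back into an amplitude."""
    step = 2.0 / (2 ** bits - 1)
    return index * step - 1.0

original = 0.3
for bits in (8, 12, 16):
    stored = quantize(original, bits)
    error = abs(dequantize(stored, bits) - original)
    print(bits, 2 ** bits, stored, error)
```

    The rounding error shrinks as the bit depth goes up, which is the "more accurate" part; on playback that error is heard as a faint noise floor.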

    As for how it actually transmits the data, that’s just 1’s and 0’s. It’s a little more complicated than that (packets, for example), but in the digital realm, as long as the 1’s and 0’s get to where they need to be, you’re good to go.
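
    As a toy illustration of the packet part: the samples become bytes, the bytes get chopped into chunks, and the other end glues them back together. The 160-byte packet size and the helper names are assumptions for the sketch, and real networks add headers, ordering, and loss handling on top.

```python
def packetize(payload: bytes, packet_size: int):
    """Split a byte stream into fixed-size chunks (the last may be shorter)."""
    return [
        payload[i:i + packet_size]
        for i in range(0, len(payload), packet_size)
    ]

def reassemble(packets):
    """Join the chunks back into the original byte stream."""
    return b"".join(packets)

# 400 fake 8-bit samples, one byte each.
samples = [n % 256 for n in range(400)]
payload = bytes(samples)

packets = packetize(payload, 160)
received = list(reassemble(packets))
print(len(packets), received == samples)
```

    If every byte arrives intact, the receiver ends up with exactly the same sample values the sender recorded.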

    • conditional_soup@lemm.ee
      2 days ago

      Holy shit, so I’m not just uniquely terrible at understanding people on the phone? I’ve searched so long for a phone that does high-quality phone calls, and I can’t believe I never figured that it was a problem with both the phones and the carriers.

      • Trainguyrom@reddthat.com
        1 day ago

        Yup, that’s also why hold music sounds terrible: the sample rate and frequency range are so small that basically no music would sound decent over the connection.