Normalizing PCM Audio

Tags: qaudioinput, audio engine, audio waveform, qiodevice

    Kent-Dorfman wrote (#3):

    OK, so it looks like you're asking about a complete sound sample and not "on the fly", which is good, because you really cannot normalize sound "on the fly". Keep in mind that sound energy is non-linear. It propagates in 3 dimensions, so the dB scale is logarithmic, not linear. The mathematical midpoint depends on the PCM format you are using (signed vs. unsigned), but as I mentioned, 128 as the midpoint of the unsigned range [0..255] is not an auditory midpoint. You should probably map your midpoint on a logarithmic scale within the available range, and be careful about signed conversions. I never use signed data to represent PCM data because the electronics of the sound card are always at some positive voltage level.

      rtavakko wrote (#4):

      @Kent-Dorfman Thanks a lot for your response! Yes, the writeData method gives me a buffer of data which has a format already determined by the QAudioFormat of the device supplying it (including unsigned / signed).

      I understand the need to convert to a logarithmic scale and do this at a later stage when I take the FFT of the data but I need a reference amplitude for that conversion and I'm not sure what I need to use for that.
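
On the reference amplitude: one widely used convention is dBFS, where the reference is digital full scale, so a sample normalized to -1..1 converts as 20*log10(|x|) and full scale reads 0 dB. The thread does not confirm that this is what the poster ended up using; the sketch below only illustrates the convention. The names `toDbfs` and `normalizeInt16` and the -120 dB floor are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Scale a signed 16-bit sample so that the most negative value maps to -1.0.
// Dividing by 32768 keeps 0 at exactly 0.0 (hypothetical helper).
double normalizeInt16(std::int16_t sample)
{
    return static_cast<double>(sample) / 32768.0;
}

// Convert a linear amplitude (full scale == 1.0) to dBFS.
// A floor is applied because log10(0) is minus infinity.
double toDbfs(double normalizedAmplitude, double floorDb = -120.0)
{
    assert(floorDb < 0.0);
    const double magnitude = std::abs(normalizedAmplitude);
    if (magnitude <= 0.0)
        return floorDb;
    return std::max(floorDb, 20.0 * std::log10(magnitude));
}
```

With these helpers, a full-scale sample reads 0 dBFS and a half-scale sample reads roughly -6 dBFS. The same reference works for FFT magnitudes as long as they are scaled back to the -1..1 amplitude range first.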

      One thing I noticed is that you can set the data type (unsigned / signed) of the QAudioFormat, but whether it actually takes effect depends on whether the device supports that setting. I'll try messing around with that to see if it does something useful.

        rtavakko wrote (#5):

        Any more thoughts on this, guys? I'm stuck on this.

          SGaist (Lifetime Qt Champion) wrote (#6):

          Hi,

          You might want to check out DSPfilters. It might offer what you need.

          Interested in AI ? www.idiap.ch
          Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

            rtavakko wrote (#7):

            @SGaist Thanks for that link! The library looks cool, I'm going through the source to see how they did things but I'm trying to build my own little audio engine.
            Do you know by any chance if there is a set standard for how an audio signal is supposed to be processed, as in the midpoint, min and max levels?

              SGaist (Lifetime Qt Champion) wrote (#8):

              @rtavakko said in Normalizing PCM Audio:

              Do you know by any chance if there is a set standard for how an audio signal is supposed to be processed, as in the midpoint, min and max levels?

              I am sorry, but I am not sure I understand exactly what you are looking for.

                rtavakko wrote (#9, edited):

                @SGaist The issue I'm stuck solving is that I'm not sure where a 'silent' audio signal that is represented as signed int data should be.

                For example, 16-bit WAV files are in the range -32,768 to 32,767 with 0 as the midpoint, but the 8-bit signal that I get from a microphone or other live feeds is in the -128 to 127 range, with the silent signal sitting at -128 rather than at 0.

                So I'm stuck trying to find a universal way to normalize audio to the 0 - 1 range with 0.5 being the midpoint. Anything I've read so far suggests that 0 should always be the midpoint for signed audio, but I don't know for sure at this point.
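
A sketch of one possible "universal" mapping to the 0..1 range, assuming two's-complement signed samples with silence at 0 and unsigned samples with silence at half scale (the usual layout of 8-bit WAV data). `toUnitRange` and `reinterpretAsSigned` are hypothetical helpers, not Qt API. The second helper illustrates one possible, unconfirmed explanation for a silent level of -128: an unsigned byte of 128 read back as a signed 8-bit value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map an integer PCM sample of the given bit depth to 0..1 with the
// silent level at exactly 0.5. Signed data is assumed to be centred on 0,
// unsigned data on 2^(bits-1).
double toUnitRange(std::int64_t sample, int bits, bool isSigned)
{
    assert(bits >= 1 && bits <= 32);
    const double fullScale = std::ldexp(1.0, bits);  // 2^bits
    const double offset = isSigned ? 0.5 : 0.0;
    return static_cast<double>(sample) / fullScale + offset;
}

// Read the bit pattern of an unsigned byte as a signed 8-bit value.
// This mimics what happens when unsigned 8-bit data is decoded as signed.
std::int8_t reinterpretAsSigned(std::uint8_t raw)
{
    std::int8_t out;
    std::memcpy(&out, &raw, sizeof out);
    return out;
}
```

Under these assumptions, `toUnitRange(0, 16, true)` and `toUnitRange(128, 8, false)` both give 0.5, while `reinterpretAsSigned(128)` gives -128. That means a silent unsigned stream decoded as signed lands at the bottom of the range instead of the middle.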

                  Kent-Dorfman wrote (#10, edited):

                  @rtavakko

                  For PCM audio, a silent signal is represented by a continuous stream of values that are all the same. It is the changes in amplitude that form the sound waveform. You can have silence at any output amplitude if the sample values don't change. Obviously, any changes to those values create a waveform. So you cannot look for silence in the way you are thinking.

                  If you use 8 kHz as your carrier and create a u16 stream of shorts such as

                  16384,0,16384,0,16384,0... then you will get a loud 8 kHz (harsh) tone.

                  1000,0,1000,0,1000,0... gives the same harsh 8 kHz tone, but at a greatly diminished volume.

                  Any stream of x,x,x,x,x,x,x... will create silence.

                  Download and play with Audacity, and programmatically create audio files to experiment with different effects: sine, square, and sawtooth waveforms of different amplitudes.

                  EDIT - actually I screwed up. If the sample rate is 8 kHz, then you can only reproduce frequencies up to 4 kHz, since it's the change that forms the wave, not the data points themselves.
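
The three example streams can be compared numerically. The sketch below, with hypothetical helper names (`alternating`, `dcLevel`, `acRms`), measures the constant level a stream sits on and the RMS of what remains after subtracting it. A stream that alternates on every sample repeats every two samples, so at an 8 kHz sample rate it corresponds to a 4 kHz tone, in line with the EDIT above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a stream a,b,a,b,... of the requested length.
std::vector<std::uint16_t> alternating(std::uint16_t a, std::uint16_t b, std::size_t count)
{
    std::vector<std::uint16_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        out.push_back(i % 2 == 0 ? a : b);
    return out;
}

// Mean of the samples: the constant (DC) level the waveform sits on.
double dcLevel(const std::vector<std::uint16_t>& samples)
{
    assert(!samples.empty());
    double sum = 0.0;
    for (std::uint16_t s : samples)
        sum += s;
    return sum / static_cast<double>(samples.size());
}

// RMS of the samples after subtracting the DC level.
// Only this varying part is heard; a constant stream yields 0.
double acRms(const std::vector<std::uint16_t>& samples)
{
    const double dc = dcLevel(samples);
    double sumSquares = 0.0;
    for (std::uint16_t s : samples) {
        const double d = static_cast<double>(s) - dc;
        sumSquares += d * d;
    }
    return std::sqrt(sumSquares / static_cast<double>(samples.size()));
}
```

For the streams in the post, `acRms(alternating(16384, 0, 8))` evaluates to 8192, `acRms(alternating(1000, 0, 8))` to 500, and a constant stream such as `alternating(1000, 1000, 8)` to 0 regardless of the value chosen.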

                    rtavakko wrote (#11):

                    @Kent-Dorfman I understand the concept but I'm still not sure how I would go about normalizing the signal. I'm still processing the signal as it comes in as an instantaneous set of values. Do I need to compare each value in the array to the previous one and set it to the lower limit of a dBFS scale if they are equal?

                      rtavakko wrote (#12):

                      Still trying to figure this out. Any thoughts on converting to the right scale (a log scale seems to be appropriate) would help.

                        rtavakko wrote (#13):

                        After a few months of searching for a definitive answer to this topic, I've reached the conclusion that my original assumption was correct. This page describes how to determine the midpoint of a standard PCM audio signal:

                        https://gist.github.com/endolith/e8597a58bcd11a6462f33fa8eb75c43d

                        For example an 8-bit signed PCM signal has these ranges:

                        Min: -128
                        Max: 127
                        Midpoint: 0

                        As to why the signal I'm getting from my soundcard sits at -128 when there is no sound, I'm going to assume that this is related to a driver problem or could be that this particular piece of hardware does not follow the PCM standard.

                        In my understanding, converting to the logarithmic scale is not related to this issue, because you should be able to normalize the signal in the time domain, even though you will most likely need to convert it to the log scale eventually if you are doing anything in the frequency domain (e.g. FFT).
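
As an illustration of normalizing in the time domain, here is a minimal sketch of peak normalization over a complete buffer, assuming the samples have already been converted to floating point. Subtracting the mean first is a design choice made for this example, and `peakNormalize` is a hypothetical name. It handles a constant offset, but it would not repair samples that were decoded with the wrong signedness.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Remove the mean (DC offset), then scale the buffer so that its largest
// absolute value equals targetPeak. A silent buffer is returned as zeros.
std::vector<double> peakNormalize(std::vector<double> samples, double targetPeak = 1.0)
{
    assert(targetPeak > 0.0);
    if (samples.empty())
        return samples;

    double mean = 0.0;
    for (double s : samples)
        mean += s;
    mean /= static_cast<double>(samples.size());

    double peak = 0.0;
    for (double& s : samples) {
        s -= mean;
        peak = std::max(peak, std::abs(s));
    }
    if (peak == 0.0)
        return samples;  // nothing but DC: no gain to apply

    const double gain = targetPeak / peak;
    for (double& s : samples)
        s *= gain;
    return samples;
}
```

Because the gain depends on the largest value in the whole buffer, this only works on a complete recording or a sufficiently long window, which matches the earlier remark in the thread about not normalizing "on the fly".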

                        If anyone has any input, please feel free to add it.
