Capturing & forwarding raw audio from microphone to speaker hogs UI.

johnco3

I have a Windows-based Qt 6.5.2 multimedia application that uses a worker thread to handle the low-level audio.

This worker thread captures audio from the default input device (microphone) and immediately forwards the captured data to the default output device. This works, but the UI becomes unresponsive and the buffer keeps growing while the audio is being forwarded. Both the microphone and the speaker are in 'pull' mode. I need help figuring out how to correctly set up a buffer object (I don't think it has to be thread safe, as all reads and writes occur on the worker thread). I need to know how to sleep, yield, or wait for readyRead when, for example, the producer encounters a full buffer or the consumer requests audio from an empty one. I also need help choosing an appropriate SPSC buffer that is limited in size and acts as a circular buffer, instead of a QBuffer that grows indefinitely with readPos chasing writePos. This buffer object will have to play nicely with the 'bytesAvailable' callback from the pull audio client (the speaker).
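A minimal sketch of the kind of size-limited circular buffer described above, written in plain C++ without Qt so that it stands alone. The class name ByteRing and all of its members are hypothetical. It assumes, as stated above, that every read and write happens on one thread, so there is no locking. When the buffer is full, write() accepts only what fits and returns that count; what to do with the rejected bytes is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed-capacity byte FIFO. Not thread safe: intended for the
// case where producer and consumer both run on the same thread.
class ByteRing {
public:
    explicit ByteRing(std::size_t capacity) : mData(capacity) {
        assert(capacity > 0);
    }

    std::size_t bytesAvailable() const { return mSize; }
    std::size_t bytesFree() const { return mData.size() - mSize; }

    // Copies at most len bytes in; returns how many were accepted
    // (0 when the buffer is already full). Never grows the storage.
    std::size_t write(const char* src, std::size_t len) {
        const std::size_t n = std::min(len, bytesFree());
        if (n == 0)
            return 0;
        const std::size_t tail = (mHead + mSize) % mData.size();
        const std::size_t first = std::min(n, mData.size() - tail);
        std::memcpy(mData.data() + tail, src, first);      // up to the end
        std::memcpy(mData.data(), src + first, n - first); // wrapped part
        mSize += n;
        return n;
    }

    // Copies at most maxLen bytes out; returns how many were delivered
    // (0 when the buffer is empty). Consumed bytes free their space.
    std::size_t read(char* dst, std::size_t maxLen) {
        const std::size_t n = std::min(maxLen, mSize);
        if (n == 0)
            return 0;
        const std::size_t first = std::min(n, mData.size() - mHead);
        std::memcpy(dst, mData.data() + mHead, first);
        std::memcpy(dst + first, mData.data(), n - first);
        mHead = (mHead + n) % mData.size();
        mSize -= n;
        return n;
    }

private:
    std::vector<char> mData; // fixed-size storage
    std::size_t mHead = 0;   // index of the oldest unread byte
    std::size_t mSize = 0;   // number of unread bytes
};
```

Wrapping something like this in a QIODevice subclass would mean forwarding readData, writeData and bytesAvailable to it; that wiring is not shown here.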

From the input device (microphone/producer) perspective, the worker thread sets up a connection to capture the raw audio data as follows:

connect(mpSourceDevice.get(), &SourceDevice::audioAvailable, this, &RtpWorker::handleAudioAvailable);

The 'RtpWorker::handleAudioAvailable' slot is called regularly in the worker thread context with the captured audio data rAudioBuffer as a parameter, which needs to be sent directly to the speaker. The raw audio is copied through an intermediary buffer associated with the QAudioSink in 'pull' mode.

This buffer acts as a Single Producer Single Consumer (SPSC) circular buffer named mpSinkDevice, with separate readPos and writePos members. Under the covers it is just a QBuffer subclass (QBuffer is already a QIODevice subclass). I initially tried to set up this buffer as a sequential device, as that would make the SPSC handling a lot easier, with 'bytesAvailable' indicating the number of bytes sitting in the buffer that have not yet been sent to the speaker. As I could not get this to work, I set up the SPSC buffer in the default non-sequential mode (supporting random access and seek, as you can see from the code below).

void
RtpWorker::handleAudioAvailable(const QAudioBuffer& rAudioBuffer) const {
    // append captured audio to mpSinkDevice's QBuffer internal FIFO
    // remember to seek back to make the bytes available
    const auto pos = mpSinkDevice->pos();
    mpSinkDevice->write(rAudioBuffer.constData<const char>(),
        rAudioBuffer.byteCount());
    mpSinkDevice->seek(pos);
}

From the output device's perspective (the speaker), the sink is set up in 'pull' mode. This means that the speaker ('QWindowsAudioSink', which contains a pointer to this shared SPSC buffer QIODevice) fires a 'consumer' timer when it needs audio data. The timer calls QWindowsAudioSink::pullSource() within Qt's audio sink implementation. This callback has access to m_pullSource, which is a pointer to the SinkDevice below, and queries it for the raw audio data.

class SinkDevice : public QBuffer {
    Q_OBJECT
public:
    /**
     * Explicit constructor
     *
     * @param parent        [in] parent.
     */
    explicit SinkDevice(QObject* parent = nullptr)
        : QBuffer(parent)
        , readPos(0)
        , writePos(0) {}

    /**
     * Explicit constructor
     *
     * @param byteArray     [in] array of multi-channel audio
     *                      samples.
     * @param parent        [in] parent.
     */
    explicit SinkDevice(QByteArray* byteArray, QObject* parent = nullptr)
        : QBuffer(byteArray, parent)
        , readPos(0)
        , writePos(0)
    {}

    ~SinkDevice() override = default;

    //! Non-sequential (random access) device, so seek() and pos() are supported
    [[nodiscard]] bool isSequential() const override {
        return false;
    }

    /**
     * Start the IO device - open in read/write mode
     * so it can act like a ring buffer.
     */
    void start();

    /**
     * Close IO device.
     */
    void stop();

    qint64 readData(char* data, qint64 maxlen) override;
    qint64 writeData(const char* data, qint64 len) override;
    [[nodiscard]] qint64 bytesAvailable() const override;
private:
    // disable copy & move semantics on QObject subclasses
    // https://www.cleanqt.io/blog/why-qobject-subclasses-are-not-copyable
    Q_DISABLE_COPY_MOVE(SinkDevice)
    qint64 readPos;
    qint64 writePos;
};


/**
 * Start the IO device - open in read/write mode
 * so it can act like a ring buffer.
 */
void
SinkDevice::start()
{
    open(ReadWrite);
}

/**
 * Close IO device.
 */
void
SinkDevice::stop()
{
    close();
}

qint64
SinkDevice::readData(char* data, qint64 maxlen)
{
    const auto nBytesRead = 
        QBuffer::readData(data, maxlen);
    if(nBytesRead > 0) {
        readPos += nBytesRead;
    }
    return nBytesRead;
}

qint64
SinkDevice::writeData(const char* data, qint64 len)
{
    const auto nBytesWritten = 
        QBuffer::writeData(data, len);
    if (nBytesWritten > 0) {
        writePos += nBytesWritten;
    }
    return nBytesWritten;
}

qint64
SinkDevice::bytesAvailable() const
{
    return writePos - readPos;
}

Bonnie

I don't think you should use QBuffer here. QBuffer doesn't delete the data after readData, so its size will keep growing. You can just subclass QIODevice, use a QByteArray to store the data, and make it sequential.
Something like:

class SinkDevice : public QIODevice
{
    Q_OBJECT
.......
private:
    QByteArray m_buffer;
};

bool SinkDevice::isSequential() const
{
    return true;
}

qint64 SinkDevice::bytesAvailable() const
{
    return m_buffer.size() + QIODevice::bytesAvailable();
}

qint64 SinkDevice::readData(char *data, qint64 maxlen)
{
    auto len = qMin(maxlen, m_buffer.size());
    memcpy(data, m_buffer.constData(), len);
    m_buffer.remove(0, len);
    return len;
}

qint64 SinkDevice::writeData(const char *data, qint64 len)
{
    m_buffer.append(data, len);
    emit readyRead();
    return len;
}

But I also don't think using QBuffer here will hog the UI. So you may have some other problems in your code which you haven't posted.

johnco3

@Bonnie Thank you for the reply.

Since my last post, I made some significant progress by debugging the Windows Qt multimedia source. I swapped QBuffer for a QRingBuffer. As it turns out, the performance issues were not related to the QBuffer in my previous implementation, however I prefer to use a lighter weight object as the Single Producer Single Consumer (SPSC) buffer between the microphone (the default input device) and the speaker (the default output device).

The remaining problem that I need help with is how to schedule a restart after I encounter a buffer underrun or EOF condition (where there are no bytes available in the SPSC buffer).

I swapped QBuffer for Qt's private QRingBuffer class, which is used in other multimedia QIODevice-derived objects. This class is not well documented, but it is relatively straightforward to understand (this link shows the implementation). The ring buffer is basically a list of RingChunks, each of which is effectively a wrapper around a QByteArray (with supporting head and tail offsets).

The problem now is that once the pull-mode audio sink encounters an error in its timer callback, in this case the pullSource method (see below for Qt's Windows implementation, QWindowsAudioSink::pullSource), the audio output changes its state to QAudio::IdleState with QAudio::IOError or QAudio::UnderrunError and stops the pull timer. The pull timer callback is responsible for requesting the next raw chunk of audio from the QRingBuffer via m_pullSource->read(readLen) and writing it to the speaker output device. Meanwhile, the capture slot in my worker thread keeps appending microphone data to the shared mpSinkDevice, so the buffer keeps growing (which would prevent the buffer underrun/EOF condition), but I have no idea how to restart the audio output.
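One possible way to frame the restart decision, sketched in plain C++ without Qt so that it stands alone. RestartGate, onSinkIdle, shouldRestart and the prefill threshold are all hypothetical names, not part of Qt. The idea is to remember that the sink went idle and to ask for a restart only once enough captured data has accumulated again, so the output does not underrun immediately after restarting. In a Qt program the idle notification could plausibly come from QAudioSink::stateChanged and the restart could be a call to QAudioSink::start(QIODevice*), but that wiring is an assumption and has not been checked against QWindowsAudioSink.

```cpp
#include <cstddef>

// Hypothetical helper: tracks whether the sink has stalled and decides when
// enough data has been buffered to make a restart worthwhile.
class RestartGate {
public:
    explicit RestartGate(std::size_t prefillBytes)
        : mPrefillBytes(prefillBytes) {}

    // Call when the sink reports that it has gone idle
    // (for example after an underrun or a failed read).
    void onSinkIdle() { mStalled = true; }

    // Call after each producer write with the current buffered byte count.
    // Returns true exactly once per stall, as soon as the buffered amount
    // reaches the prefill threshold; the caller then restarts the sink.
    bool shouldRestart(std::size_t bufferedBytes) {
        if (!mStalled || bufferedBytes < mPrefillBytes)
            return false;
        mStalled = false;
        return true;
    }

private:
    std::size_t mPrefillBytes; // minimum bytes wanted before restarting
    bool mStalled = false;     // true between going idle and restarting
};
```

The threshold trades latency for robustness: a larger prefill delays the restart but leaves more headroom before the next underrun.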

void
RtpWorker::handleAudioAvailable(const QAudioBuffer& rAudioBuffer) const {

    // append captured audio to mpSinkDevice's QRingBuffer
    mpSinkDevice->write(rAudioBuffer.constData<const char>(),
        rAudioBuffer.byteCount());
}

The m_pullSource field in the code below is a pointer to the QIODevice containing the QRingBuffer which was opened for Read/Write (write is required by the audio capture slot to copy the raw audio from the microphone to the QRingBuffer).

void QWindowsAudioSink::pullSource()
{
    qCDebug(qLcAudioOutput) << "Pull source";
    if (!m_pullSource)
        return;

    auto bytesAvailable = m_pullSource->isOpen() ? qsizetype(m_pullSource->bytesAvailable()) : 0;
    auto readLen = qMin(bytesFree(), bytesAvailable);
    if (readLen > 0) {
        QByteArray samples = m_pullSource->read(readLen);
        if (samples.size() == 0) {
            deviceStateChange(QAudio::IdleState, QAudio::IOError);
            return;
        } else {
            write(samples.data(), samples.size());
        }
    }

    auto playTimeUs = remainingPlayTimeUs();
    if (playTimeUs == 0) {
        deviceStateChange(QAudio::IdleState, m_pullSource->atEnd() ? QAudio::NoError : QAudio::UnderrunError);
    } else {
        deviceStateChange(QAudio::ActiveState, QAudio::NoError);
        m_timer->start(playTimeUs / 2000);

    }
}
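As a small worked example of the timer arithmetic in the last branch above: playTimeUs / 2000 converts microseconds to milliseconds and halves the result, so the next pull is scheduled roughly halfway through the audio that is still queued. The function name pullIntervalMs below is hypothetical and only mirrors that one expression.

```cpp
#include <cstdint>

// Mirrors the expression passed to m_timer->start() in the code above:
// microseconds -> milliseconds (divide by 1000), then half of that (divide by 2).
constexpr std::int64_t pullIntervalMs(std::int64_t remainingPlayTimeUs) {
    return remainingPlayTimeUs / 2000;
}

// 20 ms of queued audio leads to the next pull after 10 ms.
static_assert(pullIntervalMs(20'000) == 10, "half of 20 ms is 10 ms");
// Integer division: anything under 2 ms of queued audio gives an interval of 0.
static_assert(pullIntervalMs(1'999) == 0, "rounds down to 0 ms");
```

Note that the timer is only re-armed in the branch where playTimeUs is non-zero; in the idle branch nothing schedules the next call to pullSource.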

Here is my SinkDevice:

//! Modeled after the Generator class from the Qt 6.x audio output example.
class SinkDevice : public QIODevice {
    Q_OBJECT
public:
    /**
     * Explicit constructor
     *
     * @param parent        [in] parent.
     */
    explicit SinkDevice(QObject* parent = nullptr)
        : QIODevice(parent)
        , mBuffer{}
    {}

    /**
     * Explicit constructor
     *
     * @param rByteArray    [in] array of multi-channel audio
     *                      samples.
     * @param parent        [in] parent.
     */
    explicit SinkDevice(const QByteArray& rByteArray, QObject* parent = nullptr)
        : QIODevice(parent)
        , mBuffer{}
    {
        mBuffer.append(rByteArray);
    }

    ~SinkDevice() override = default;

    /**
     * Start the IO device - open in read/write mode
     * so it can act like a ring buffer.
     */
    void start();

    /**
     * Close IO device.
     */
    void stop();

    //! Audio device should give sequential access
    [[nodiscard]] bool isSequential() const override {
        return true;
    }

    [[nodiscard]] qint64 bytesAvailable() const override {
        return mBuffer.size() + QIODevice::bytesAvailable();
    }

    // Our size
    [[nodiscard]] qint64 size() const override {
        return mBuffer.size();
    }
protected:
    [[nodiscard]] qint64 readData(char* data, qint64 maxlen) override;
    [[nodiscard]] qint64 writeData(const char* data, qint64 maxlen) override;
private:
    // disable copy & move semantics on QObject subclasses
    // https://www.cleanqt.io/blog/why-qobject-subclasses-are-not-copyable
    Q_DISABLE_COPY_MOVE(SinkDevice)
    void generateData(const QAudioFormat& format, qint64 durationUs, int sampleRate);
    QRingBuffer mBuffer;
};