QtConcurrent vs QThread CPU Usage

Tags: qtconcurrent, qthread, cpu usage, multithreading
11 Posts 4 Posters 2.5k Views
    rtavakko
    wrote on 18 Oct 2019, 18:51 last edited by JKSH
    #1

    Hey everyone,

    I've been using QtConcurrent::run to implement a live video playback / processing application with a typical thread implemented this way:

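    // The watcher is heap-allocated and released once the task has finished.
    QFutureWatcher<void>* inProcessWatcher = new QFutureWatcher<void>;

    // Qt 5 member-function overload: runs Mixer::processInput(vid) on a pool thread.
    QFuture<void> inProcess = QtConcurrent::run(this, &Mixer::processInput, vid);

    // Connect before setFuture() so the finished signal cannot be missed.
    QObject::connect(inProcessWatcher, &QFutureWatcher<void>::finished, this, [=]() {
        inputsProcessed(vid);
        inProcessWatcher->deleteLater(); // safer than delete inside the watcher's own signal
    });

    inProcessWatcher->setFuture(inProcess);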

    Everything seems to run normally. I have 5 of these running constantly, but not all in parallel: each of my two video inputs has a thread that just gets an input frame and then starts a second thread to process it, so at most one thread is running per video at any time.

    There is also an output-stage thread that constantly checks whether there are processed frames in each video's buffer so it can do some more work on them and send them out to be displayed. So overall there are 5 threads, but only 3 run at the same time. My CPU usage is through the roof (~40-60% of an Intel i7) and its threads are contemplating a strike.

    Could it be that the implementation above (a few short-lived threads every cycle) itself is more costly than having a couple of worker QThreads that live longer and perform the same function? Or is this related to the priority of a QtConcurrent thread not being set the same way as a QThread?

    Cheers!

        SGaist
        Lifetime Qt Champion
        wrote on 19 Oct 2019, 06:36 last edited by
        #2

        Hi,

        How do you know that you only have 5 of these?

        For a video with a classic frame rate of 25 frames per second, I would expect 50 of them to be created every second, since you have two videos.

        Unless each one can finish its processing in less than the duration of a frame, your result doesn't surprise me.
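
To put numbers on that estimate, here is a small sketch of the arithmetic, assuming one task per frame per video unless stated otherwise. The function names `tasksPerSecond` and `frameBudget` are made up for this illustration and do not come from Qt or from the code in this thread.

```cpp
#include <cassert>
#include <chrono>

// Number of short-lived tasks started per second, assuming every video
// starts `tasksPerFrame` tasks for each frame it delivers.
inline int tasksPerSecond(int videos, int fps, int tasksPerFrame = 1)
{
    return videos * fps * tasksPerFrame;
}

// Time available to finish one frame before the next one arrives.
inline std::chrono::microseconds frameBudget(int fps)
{
    assert(fps > 0);
    return std::chrono::microseconds(1000000 / fps);
}
```

With two videos at 25 fps, `tasksPerSecond(2, 25)` gives 50 and `frameBudget(25)` gives 40 ms. Any task that takes longer than that budget will still be running when the next frame arrives.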

        Interested in AI ? www.idiap.ch
        Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

          rtavakko
          wrote on 19 Oct 2019, 13:20 last edited by
          #3

          @SGaist Each cycle (frame) creates 5 of them, but the number you mentioned sounds right once adjusted for the frame rate. At 30 fps that would be 150 per second, but there would still be only 3 threads active for each frame, since frame reading and processing for each video use 2 threads working in series and output processing has its own thread.

          So the large number of threads created every second is what drives CPU usage so high? If that's the case, it would make sense to use the worker QThread model and keep 3 of them running from start to finish.
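
The worker model mentioned here can be sketched in standard C++ without Qt: one long-lived thread that sleeps until a job arrives, instead of a new task per frame. This is an illustration only, not code from this thread; the class name `Worker` and its `post` method are hypothetical, and in a Qt application the same role would typically be played by a QObject moved to a QThread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical long-lived worker: one thread, created once, that runs
// queued jobs in FIFO order and sleeps while the queue is empty.
class Worker
{
public:
    Worker() : thread_([this] { run(); }) {}

    // Finishes all queued jobs, then stops and joins the thread.
    ~Worker()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wakeup_.notify_one();
        thread_.join();
    }

    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Queue a job; the worker thread picks it up when it is free.
    void post(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wakeup_.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Blocks without using CPU until there is work or a stop request.
                wakeup_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty())
                    return; // stopping_ is set and nothing is left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // run outside the lock so post() never waits on a job
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread thread_; // declared last so the other members exist before run() starts
};
```

Every job posted to the same `Worker` runs on the same thread, so the cost of handing work to a thread is paid per job as a queue push rather than as a new task object and watcher per frame. Whether that cost is what dominates here is something only a measurement can confirm.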

            SGaist
            Lifetime Qt Champion
            wrote on 20 Oct 2019, 19:02 last edited by
            #4

            I don't know exactly how your pipeline works, so I can't really comment on that.

            Did you try to use a profiler to see what is happening where in your application?

            Did you try to measure the performance of the methods you apply to video frame processing?


              rtavakko
              wrote on 20 Oct 2019, 20:30 last edited by
              #5

              @SGaist I've done some benchmarking using std::chrono to see how much time each section of the process takes, but I'm not familiar with profiling tools. Any tools you would recommend? I came across this page:

              https://doc.qt.io/qtcreator/creator-cache-profiler.html
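
For reference, a minimal sketch of the kind of std::chrono timing described here. It uses std::chrono::steady_clock, which is monotonic and therefore well suited to measuring intervals; the helper name `timeSection` is hypothetical and not taken from the poster's code.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Runs `section` once and returns how long it took.
template <typename Callable>
std::chrono::microseconds timeSection(Callable&& section)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Callable>(section)();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start); // steady_clock never goes backwards
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

A measurement like this reports elapsed time for one section on one thread. It does not show how much CPU the other threads use in the meantime, which is where a profiler helps.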

                Jackson Leee
                wrote on 21 Oct 2019, 07:36 last edited by
                #6

                Maybe you need to provide more detailed content and data.....

                  ODБOï
                  wrote on 21 Oct 2019, 08:03 last edited by ODБOï
                  #7

                  Hi @rtavakko, regarding profiling tools:

                  See GammaRay: https://www.kdab.com/development-resources/qt-tools/gammaray/
                  https://doc.qt.io/GammaRay/index.html

                    rtavakko
                    wrote on 26 Oct 2019, 15:51 last edited by
                    #8

                    Thanks guys, I'll check out the cache profiler. To give you a bit more detail about the overall process:

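                    // Starts one processing pass for video `vid` on a pool thread.
                    void Mixer::mix(unsigned int vid)
                    {
                        videoProcessStart[vid] = std::chrono::high_resolution_clock::now();

                        QFutureWatcher<void>* inProcessWatcher = new QFutureWatcher<void>;

                        QFuture<void> inProcess = QtConcurrent::run(this, &Mixer::processInput, vid);

                        // Connect before setFuture() so the finished signal cannot be missed.
                        QObject::connect(inProcessWatcher, &QFutureWatcher<void>::finished, this, [=]() {
                            videoEffectsFinished(vid);
                            inProcessWatcher->deleteLater(); // safer than delete inside the watcher's own signal
                        });

                        inProcessWatcher->setFuture(inProcess);
                    }

                    // Runs on a pool thread: read one frame, then apply the effects to it.
                    void Mixer::processInput(unsigned int vid)
                    {
                        videoFrameRead[vid] = readInput(vid);
                        addVideoEffectsMultiChannel(vid);
                    }

                    // Called when a pass finishes; immediately starts the next one.
                    void Mixer::videoEffectsFinished(unsigned int vid)
                    {
                        mix(vid);
                    }
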

                    In Mixer::addVideoEffectsMultiChannel, I process each frame using OpenCV functions (main operations here are cv::split, cv::addWeighted and cv::merge to allow for RGB processing of a 3-channel cv::Mat). This function was time-consuming if called in series for both frames so I decided to split the processing into two parallel threads.
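
As an illustration of what such a per-channel blend computes, here is a plain C++ sketch that applies a different weight to each channel of two interleaved 3-channel 8-bit frames in a single pass, without a separate split and merge step. It is a stand-in written for this explanation, not OpenCV code and not the implementation from this post; `blendPerChannel` and its parameters are hypothetical. It follows the formula documented for cv::addWeighted, dst = saturate(src1*alpha + src2*beta + gamma), with gamma fixed at 0, and its rounding may differ slightly from OpenCV's.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Blends two interleaved 3-channel 8-bit frames of equal size.
// alpha[c] weights frame a and beta[c] weights frame b for channel c.
std::vector<std::uint8_t> blendPerChannel(const std::vector<std::uint8_t>& a,
                                          const std::vector<std::uint8_t>& b,
                                          const std::array<double, 3>& alpha,
                                          const std::array<double, 3>& beta)
{
    assert(a.size() == b.size() && a.size() % 3 == 0);
    std::vector<std::uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::size_t c = i % 3; // channel index within the pixel
        const double v = a[i] * alpha[c] + b[i] * beta[c];
        // Round to nearest and clamp to the 8-bit range (saturation).
        out[i] = static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, std::round(v))));
    }
    return out;
}
```

A single pass like this reads each input byte once and writes each output byte once, whereas a split, three blends and a merge walk over the frame data several times. Whether that difference matters for a given pipeline would need to be measured.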

                    After each frame is processed, it's pushed onto a "buffer" (a std::deque<cv::Mat>) so the output stage doesn't need to wait for both frames to be finished. The output stage functions essentially create a constantly spinning thread, in the same way as above, that takes the oldest frame from each buffer, mixes them using cv::addWeighted and emits the resulting data array to be displayed.
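
A buffer like this can be made to block its consumer while it is empty instead of having the consumer poll it. The sketch below shows one way to do that in standard C++, assuming a single std::deque guarded by a mutex. `FrameBuffer` is a hypothetical name, and the element type is left as a template parameter so the example does not depend on OpenCV.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical thread-safe FIFO for processed frames.
template <typename Frame>
class FrameBuffer
{
public:
    // Called by the processing thread when a frame is ready.
    void push(Frame frame)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            frames_.push_back(std::move(frame));
        }
        notEmpty_.notify_one();
    }

    // Called by the output thread. Sleeps for up to `timeout` until a frame
    // is available, then moves the oldest one into `out` and returns true.
    // Returns false if no frame arrived in time.
    bool popOldest(Frame& out, std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!notEmpty_.wait_for(lock, timeout, [this] { return !frames_.empty(); }))
            return false;
        out = std::move(frames_.front());
        frames_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable notEmpty_;
    std::deque<Frame> frames_;
};
```

An output thread that calls `popOldest` in a loop uses no CPU while both buffers are empty, which is the main difference from a loop that checks the deque continuously.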

                    So I have 3 threads running in parallel at this point. I've confirmed that each one is unique by comparing the thread IDs, and changing the priority of each thread does not affect CPU usage.

                    At this point I think all the per-element OpenCV operations are causing the high load since if you have let's say two frames of 1920x1080 being processed constantly, that would be pretty CPU-heavy. If anyone has any tips / ideas I would definitely appreciate them.
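
A rough estimate of the data volume supports this. The sketch assumes 8-bit, 3-channel frames and 30 fps (the frame rate mentioned earlier in the thread); the function name `bytesPerSecond` is hypothetical.

```cpp
#include <cstdint>

// Bytes that a single pass over every frame has to touch per second.
constexpr std::uint64_t bytesPerSecond(std::uint64_t width, std::uint64_t height,
                                       std::uint64_t channels, std::uint64_t streams,
                                       std::uint64_t fps)
{
    return width * height * channels * streams * fps;
}

// Two 1920x1080 3-channel streams at 30 fps: about 373 MB/s for one pass.
static_assert(bytesPerSecond(1920, 1080, 3, 2, 30) == 373248000ULL,
              "1920 * 1080 * 3 * 2 * 30");
```

Each additional operation that walks the whole frame (a split, a blend, a merge) adds another pass of roughly this size, so the total memory traffic is a multiple of this figure.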

                    Cheers!

                      SGaist
                      Lifetime Qt Champion
                      wrote on 26 Oct 2019, 16:49 last edited by
                      #9

                      Did you consider moving the image processing stuff to the GPU ?


                        rtavakko
                        wrote on 26 Oct 2019, 17:35 last edited by
                        #10

                        @SGaist Yes, I think that's most likely the route I have to take. I'm building OpenCV with CUDA today. I think I totally took the wrong approach from the start. I'll let you know how it turns out.

                          rtavakko
                          wrote on 1 Nov 2019, 22:17 last edited by rtavakko 11 Feb 2019, 13:48
                          #11

                          I CUDA not build it unfortunately. On Windows, CUDA is only supported with Visual Studio, so I guess it's back to good old uncle GL ...
