
Qt algorithm benchmark app, weird result

cronos4455 wrote (#1):

I am creating an app to visualize algorithms and run benchmarks on them, and I am having trouble figuring out why I am getting weird results. The problem is that when I start a benchmark of the BFS algorithm, the results look like this.

Source code: https://github.com/krystian5642/Algorithms.git

The time complexity should be O(V+E), and it sometimes is, but not always. The third run is O(V+E), and I am not sure what is happening in the first and second runs; the result is always the same (after the third run, it is not always O(V+E)). The whole benchmark runs on a separate thread (I got the same results when it ran on the main thread).

[Screenshots: debug.png, release.png]
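For context, running a benchmark off the GUI thread can look like the following sketch; runBenchmark is a hypothetical stand-in for the actual benchmark routine, not the repository's code:

```cpp
// Run a benchmark callable on a worker thread and notify the GUI
// thread when it finishes.
#include <QFutureWatcher>
#include <QtConcurrent/QtConcurrent>
#include <functional>

void startBenchmark(QObject *receiver, std::function<void()> runBenchmark)
{
    auto *watcher = new QFutureWatcher<void>(receiver);
    QObject::connect(watcher, &QFutureWatcher<void>::finished,
                     receiver, [watcher]() {
        // Back on the receiver's thread: update the results chart here.
        watcher->deleteLater();
    });
    watcher->setFuture(QtConcurrent::run(std::move(runBenchmark)));
}
```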

Kent-Dorfman wrote (#2):

Multi-processing, timesharing OSes are not good for benchmarking events with tight timing windows. The schedulers cannot be considered deterministic in how they schedule processes or in how much CPU time each thread gets.

This is generally why simulations on Linux and Windows platforms use a pseudo-clock/counter, so that the sim provides reproducible data.
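A deterministic variant of that idea for this use case is to count the algorithm's basic operations instead of reading a clock: the count is perfectly reproducible and directly comparable to the O(V+E) bound. A sketch, assuming a plain adjacency-list graph rather than the OP's actual classes:

```cpp
// BFS that returns the number of basic operations performed (one unit
// per vertex visit, one per edge scan). The count grows as O(V + E)
// regardless of scheduler noise.
#include <cstdint>
#include <queue>
#include <vector>

uint64_t bfsOperationCount(const std::vector<std::vector<int>> &adj, int start)
{
    uint64_t ops = 0;
    std::vector<bool> visited(adj.size(), false);
    std::queue<int> frontier;
    frontier.push(start);
    visited[start] = true;
    while (!frontier.empty()) {
        const int v = frontier.front();
        frontier.pop();
        ++ops;                          // vertex visit
        for (int w : adj[v]) {
            ++ops;                      // edge scan
            if (!visited[w]) {
                visited[w] = true;
                frontier.push(w);
            }
        }
    }
    return ops;
}
```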

cronos4455 wrote (#3):

@Kent-Dorfman is there something I can do to mitigate this, or does it have to be like this? As far as I remember, when I tested the algorithms in a console app I didn't have this problem, so I assumed it could be something Qt-related.

IgKh wrote (#4):

@cronos4455 For a start, it appears that you are using QElapsedTimer to measure algorithm execution time, and that is a poor choice for your use case. The main reason is that what it measures is elapsed wall-clock time, which has the highest variance, as it is the most sensitive to whatever else is happening on the system at the same time (it is very possible that during a good chunk of the elapsed time your application is not even running, as pointed out by @Kent-Dorfman).

If nothing else, what you should measure is elapsed user time: just the time the thread running the algorithm actually spent on-CPU executing application (and not kernel-level) code. To my knowledge Qt has no API for that, so you'll need to use an OS API directly (e.g. on Windows you would use the GetThreadTimes API) or take a dependency on a C++ benchmarking library that has the requisite APIs.
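A minimal sketch of that approach, assuming Windows and that the workload runs on the current thread; workload is a hypothetical placeholder, and FILETIME values are in 100 ns ticks:

```cpp
// Measure user CPU time of the current thread around a workload,
// using GetThreadTimes. Windows-only sketch.
#include <windows.h>
#include <cstdint>

static uint64_t toTicks(const FILETIME &ft)
{
    return (static_cast<uint64_t>(ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
}

uint64_t userTimeNs(void (*workload)())
{
    FILETIME creation, exitTime, kernelBefore, userBefore, kernelAfter, userAfter;
    const HANDLE thread = GetCurrentThread();
    GetThreadTimes(thread, &creation, &exitTime, &kernelBefore, &userBefore);
    workload();
    GetThreadTimes(thread, &creation, &exitTime, &kernelAfter, &userAfter);
    return (toTicks(userAfter) - toTicks(userBefore)) * 100; // ticks are 100 ns
}
```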

Note that even user time will still show some noticeable variance; there are effects related to the contents of CPU caches and branch-prediction misses (which are most visible the first time the code runs). The best way to deal with it is to do each run multiple times (as many as required so that the total running time is at least a few seconds) and take the average time for each input size, possibly excluding the first one or two repetitions of each run. By the law of large numbers, this will be much closer to the "true" run time.
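A sketch of that repetition scheme; measureOnce stands in for whatever single-run timer is used, and the warm-up count and time budget are arbitrary assumptions:

```cpp
// Repeat a measurement until a total time budget is reached, discarding
// a few warm-up repetitions, then average the remaining samples.
#include <cstdint>
#include <numeric>
#include <vector>

double averagedRunNs(uint64_t (*measureOnce)(), int warmup = 2,
                     uint64_t minTotalNs = 2'000'000'000ULL) // ~2 s budget
{
    for (int i = 0; i < warmup; ++i)
        measureOnce();                 // discard cache/branch warm-up runs

    std::vector<uint64_t> samples;
    uint64_t total = 0;
    while (total < minTotalNs) {
        const uint64_t t = measureOnce();
        samples.push_back(t);
        total += t;
    }
    return std::accumulate(samples.begin(), samples.end(), 0.0)
           / static_cast<double>(samples.size());
}
```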

cronos4455 wrote (#5):

@IgKh I could not use GetThreadTimes because it has low precision, but I used QueryThreadCycleTime and got the same results as with QElapsedTimer.
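For reference, a minimal sketch of the QueryThreadCycleTime usage described here (Windows-only; workload is a hypothetical placeholder, and the result is a cycle count charged to the thread, not directly convertible to seconds):

```cpp
// Count CPU cycles charged to the current thread across a workload.
// Cycles exclude time when the thread is not scheduled.
#include <windows.h>
#include <cstdint>

uint64_t cyclesSpent(void (*workload)())
{
    ULONG64 before = 0, after = 0;
    QueryThreadCycleTime(GetCurrentThread(), &before);
    workload();
    QueryThreadCycleTime(GetCurrentThread(), &after);
    return after - before;
}
```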

IgKh wrote (#6):

@cronos4455 said in Qt algorithm benchmark app, weird result:

              I could not use GetThreadTimes because it has low precision

The documentation says that it has 100 ns precision, no? If that is too low, that's a strong indicator that the runtime of what you are measuring isn't long enough and needs to be inflated to get a stable reading. Micro-benchmarking is really hard... In my experience, measuring things that take less than around 10 ms is hard to do reliably, but that's very architecture- and environment-dependent.

@cronos4455 said in Qt algorithm benchmark app, weird result:

I used QueryThreadCycleTime and got the same results as with QElapsedTimer

That's surprising; I'd expect the effect to be at least somewhat mitigated. I'd hazard a guess that the main effect is that at some input-size ranges the generated code for the algorithm being tested runs into CPU pipeline stalls or the like. It is difficult to pinpoint why, but as I said, the best way to get through this is to run the algorithm with each given input size multiple times and take the average minus outliers.
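One way to implement "average minus outliers" is a trimmed mean; a sketch (the 10% trim fraction is an arbitrary assumption):

```cpp
// Sort the samples and drop a fraction at both ends before averaging,
// so that scheduler hiccups and other outliers don't skew the result.
#include <algorithm>
#include <numeric>
#include <vector>

double trimmedMean(std::vector<double> samples, double trim = 0.1)
{
    if (samples.empty())
        return 0.0;
    std::sort(samples.begin(), samples.end());
    size_t cut = static_cast<size_t>(samples.size() * trim);
    if (2 * cut >= samples.size())
        cut = 0;                       // too few samples to trim anything
    const auto first = samples.begin() + cut;
    const auto last = samples.end() - cut;
    return std::accumulate(first, last, 0.0)
           / static_cast<double>(last - first);
}
```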

cronos4455 wrote (#7):

@IgKh I mean I had this issue with GetThreadTimes:
https://stackoverflow.com/questions/30432982/getthreadtimes-returning-0-zero-as-the-kernel-and-user-time-for-all-the-thr
The returned values were either 0 or 15.625 ms.

I was surprised too, especially since the first run takes up to 1 min, the second 40 s, and the third less than 20 s. But when I tested the algorithms in a console app and then created a chart in Excel from the results, I waited 2-3 min for the first run and the results were normal (linear). I mean a console app without any Qt stuff, just a plain C++ console app in Visual Studio.

For now, I will just run it multiple times.

JonB wrote (#8):

@cronos4455 said in Qt algorithm benchmark app, weird result:

is there something I can do to mitigate this, or does it have to be like this? As far as I remember, when I tested the algorithms in a console app I didn't have this problem, so I assumed it could be something Qt-related

What sort of an answer are you expecting here, given that nobody knows anything about what your code does? People won't know what is specific about the first, second, or third runs.

                  Qt does not mean anything has to be graphical or UI. Why do you suspect a Qt issue in your case? You can write Qt console programs just as much as UI ones, so have you tested your algorithm using that?

You seem to run your test three times. What happens when you run it a hundred times? Does it settle down into something consistent between runs? There is, for example, a lot of "cache hitting" that can go on in runs of code, especially in the first few runs.

cronos4455 wrote (#9):

@JonB I managed to solve it. It was a problem with QSet: the insert function took a lot of time even though I used reserve. I didn't have this problem in my console app because I used STL containers.
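For anyone hitting the same thing, a minimal sketch of how the two containers could be compared; this is an illustration, not the OP's benchmark code:

```cpp
// Compare insertion time of QSet<int> vs std::unordered_set<int>,
// both pre-sized with reserve(). Coarse wall-clock timing; only meant
// to illustrate the comparison, not a rigorous benchmark.
#include <QDebug>
#include <QElapsedTimer>
#include <QSet>
#include <unordered_set>

void compareSetInsert(int n)
{
    QElapsedTimer timer;

    QSet<int> qset;
    qset.reserve(n);
    timer.start();
    for (int i = 0; i < n; ++i)
        qset.insert(i);
    qDebug() << "QSet<int>:" << timer.nsecsElapsed() << "ns";

    std::unordered_set<int> stdSet;
    stdSet.reserve(n);
    timer.restart();
    for (int i = 0; i < n; ++i)
        stdSet.insert(i);
    qDebug() << "std::unordered_set<int>:" << timer.nsecsElapsed() << "ns";
}
```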

cronos4455 has marked this topic as solved.
