Improving drawing/painting performance

kshegunov

This thread gets a packet, decides where to put it and simply adds the packet to a list.

Why is the list needed?

It then signals the appropriate object that a new packet of data is available.

How is the thread signalling an object, where is the thread object located, where is the receiver located, how is the connection done?

In the "foreground" version of the app, these signals travel all the way up to the UI widget, which triggers a repaint.

You should rather schedule regular update() calls (with a timer) instead of repaint()-ing the widget.

So there is one thread that renders the FFT data, another that renders the bottom part called the waterfall.

Is the FFT threaded itself? As far as I understand, to get the "waterfall" you need the FFT data, so if the "rendering" workers are also doing the FFT, either both should make their own calculation or one will have to wait for the other.

The UI simply gets an image and does painter.drawImage( 0,0, theImage ); Yet this is much slower than doing all the work in the foreground to draw the graph.

The problem might be completely unrelated to drawImage, although drawing images is not the most efficient way (pixmaps would be preferred, but they are not reentrant).

Is painter.drawImage() slow?

Yes, but only relative to painting pixmaps, for example.

Is there a better/faster way to move an image from a QImage to the screen?

Painting on an offscreen buffer and then manually doing the double-buffering would be one option. However, you should be sure that this is your last resort before starting such implementations.

I realize that signals by default are in "auto" mode and supposedly signals from other threads are asynchronous.

By themselves signals are oblivious to threads. The connection between a signal and another signal/slot can be in different modes. Auto means queued for different threads, or direct when sender and receiver are in the same thread.

I'm wondering if there is a threading/signaling issue.

This would be my suspicion, so probably yes.

I am wondering if I could improve performance in this thread by being able to keep a QImage around that is my master image and when new data comes in I need to be able to shift the data in the kept QImage down one row making the top row of the image available to render the new data into. Is there an easy way to take a QImage and shift the image down one row and paint in new pixels into the top row?

I'd rather keep a QPixmap in the GUI thread with the old state, scroll the old data, paint the new data onto the pixmap and finally paint the whole pixmap on the screen.
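A minimal sketch of that scroll-and-paint step, written in plain C++ without Qt so it stands alone. `WaterfallBuffer` and `pushRow` are hypothetical names, and the pixel format (one 32-bit value per pixel, newest row at the top) is an assumption, not something stated in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the kept "master image": width x height pixels,
// stored row by row, with row 0 being the newest (top) row.
class WaterfallBuffer {
public:
    WaterfallBuffer(std::size_t width, std::size_t height)
        : width_(width), height_(height), pixels_(width * height, 0u) {}

    // Shift every row down by one (the oldest row falls off the bottom),
    // then copy the new data into the freed top row.
    void pushRow(const std::vector<std::uint32_t>& row) {
        assert(row.size() == width_);
        if (height_ == 0) return;
        const auto rowLen = static_cast<std::ptrdiff_t>(width_);
        // copy_backward is safe for this overlap because the destination
        // range ends after the source range.
        std::copy_backward(pixels_.begin(), pixels_.end() - rowLen, pixels_.end());
        std::copy(row.begin(), row.end(), pixels_.begin());
    }

    std::uint32_t at(std::size_t x, std::size_t y) const {
        return pixels_[y * width_ + x];
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<std::uint32_t> pixels_;
};
```

With a real QPixmap the shift would presumably be done by the pixmap itself (QPixmap has a scroll() function) followed by a QPainter drawing the new row, so only one row is repainted per packet.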

SysTech

Hello Sir,

Thanks for the reply:

Why is the list needed?

It might not be... My goal was to allow the UDP thread to stuff data in pretty much as quickly as possible and if something delayed the processing of that data at some other point the list would grow but then shrink back down as data was processed.

How is the thread signalling an object, where is the thread object located, where is the receiver located, how is the connection done?

A Qt signal. The objects are probably sitting in the foreground main thread. This is why my question about signals. I'm suspecting:

UDP thread -> signal main thread (object) -> signal main thread paint. But still investigating this.

You should rather schedule regular update() calls (with a timer) instead of repaint()-ing the widget.

Ok... I can certainly try this.

Is the FFT threaded itself? As far as I understand, to get the "waterfall" you need the FFT data, so if the "rendering" workers are also doing the FFT, either both should make their own calculation or one will have to wait for the other.

No that is not correct. Both the FFT data and waterfall data are pre-computed by the radio and come down in UDP packets.

For the FFT data it is actually already setup to draw. It has been converted to points relative to the size of the widget so it is really as simple as lineto( point1x, point1y, point2x, point2y ) in a loop.

The waterfall data on the other hand has to be interpreted. It comes over as a single line (pixel row) of data but you must interpolate to match frequency with the FFT display.
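The interpolation step could look roughly like the sketch below. It is only an illustration: `resampleRow` is a hypothetical helper, and it assumes the first and last samples of the incoming row line up with the left and right edges of the display, which the radio's actual frequency mapping may not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stretch or shrink one row of samples to outWidth values using linear
// interpolation between neighbouring input samples.
std::vector<double> resampleRow(const std::vector<double>& in, std::size_t outWidth) {
    std::vector<double> out(outWidth, 0.0);
    if (in.empty() || outWidth == 0) return out;
    if (in.size() == 1 || outWidth == 1) {
        // Nothing to interpolate between: repeat the first sample.
        for (double& v : out) v = in.front();
        return out;
    }
    // Map output index i onto a fractional position in the input row.
    const double scale = static_cast<double>(in.size() - 1) /
                         static_cast<double>(outWidth - 1);
    for (std::size_t i = 0; i < outWidth; ++i) {
        const double pos = static_cast<double>(i) * scale;
        std::size_t lo = static_cast<std::size_t>(pos);
        if (lo > in.size() - 2) lo = in.size() - 2;  // keep lo + 1 in range
        assert(lo + 1 < in.size());
        const double t = pos - static_cast<double>(lo);
        out[i] = in[lo] + t * (in[lo + 1] - in[lo]);
    }
    return out;
}
```

Each resampled row would then be mapped to colours and drawn as one pixel row of the waterfall.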

The idea on the waterfall is that you get a line of data, that gets drawn at the top, pushing the other lines down until you decide to kill the historical data.

The problem might be completely unrelated to drawImage, although drawing images is not the most efficient way (pixmaps would be preferred, but they are not reentrant).

Well I was thinking that I could render all of this stuff in a thread and then tell the foreground that a new image was available and simply "copy" it to screen. While this seems to work it is not as quick as I expected. In fact the threaded FFT seems to fall behind. Again all it has to do is draw points. So in a thread I just draw the points onto an image. I then signal the foreground an image is ready and it grabs and paints that image.

I thought that was going to really help. Turns out it is worse than drawing in the foreground.

Painting on an offscreen buffer and then manually doing the double-buffering would be one option. However, you should be sure that this is your last resort before starting such implementations.

It seemed to me that since Qt4 some buffering was done for you. I remember reading that somewhere. But in effect my threaded rendering is kind of like a double buffer. I mean the thread should be working on a new image while the foreground is painting an image (or rather copying an image to screen). But it doesn't seem to be that fast.

I wonder about your comment on scheduling updates. I should give that a try and see if maybe it helps to not be relying on signals so much.

I'd rather keep a QPixmap in the GUI thread with the old state, scroll the old data, paint the new data onto the pixmap and finally paint the whole pixmap on the screen.

Hum... interesting... I'll take a look at that.

Thanks so much for taking the time to reply!

kshegunov

@SysTech said:
Hi.

It might not be... My goal was to allow the UDP thread to stuff data in pretty much as quickly as possible and if something delayed the processing of that data at some other point the list would grow but then shrink back down as data was processed.

You should already have a queue in the thread! Use that instead. Just emit a signal with each new piece of data and connect that to a slot of an object that resides in a separate thread. I suppose you haven't derived from QThread, or am I wrong?
See here for a decent threading tutorial in case you have.

No that is not correct. Both the FFT data and waterfall data are pre-computed by the radio and come down in UDP packets.

For the FFT data it is actually already setup to draw. It has been converted to points relative to the size of the widget so it is really as simple as lineto( point1x, point1y, point2x, point2y ) in a loop.

The waterfall data on the other hand has to be interpreted. It comes over as a single line (pixel row) of data but you must interpolate to match frequency with the FFT display.

The idea on the waterfall is that you get a line of data, that gets drawn at the top, pushing the other lines down until you decide to kill the historical data.

Well, if the data is almost ready, then you don't need so many threads. You'll have to serialize the painting anyway, as widgets can be painted from the GUI thread only. I'd only thread the communication channel, and do the interpolation there. If the throughput of said communication is not enough (interpolation is heavy and the comm thread lags behind) only then I'd consider making a separate thread for the interpolation only. Once the data is prepared for painting, the painting itself is better done in the main thread.

It seemed to me that since Qt4 some buffering was done for you. I remember reading that somewhere. But in effect my threaded rendering is kind of like a double buffer.

Yes, it is, but you can double-buffer manually. However, as I said this should be a last resort, most of the time it's not really worth the trouble.

I wonder about your comment on scheduling updates.

Update events are compressed, so you're definitely better off using update() instead of forcing repaint()s.
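To make the difference concrete, here is a toy model in plain C++. It is not Qt code: `ToyWidget`, `requestUpdate()`, `paintNow()` and `processEvents()` are hypothetical stand-ins for a widget, update(), repaint() and one pass of the event loop.

```cpp
#include <cassert>

// Toy model of update compression: many update requests between two
// event-loop passes collapse into a single paint.
class ToyWidget {
public:
    // Plays the role of update(): cheap, only marks the widget as dirty.
    void requestUpdate() { updatePending_ = true; }

    // Plays the role of repaint(): paints immediately, every time.
    void paintNow() { ++paintCount_; }

    // One pass of the event loop: paint once if anything was requested.
    void processEvents() {
        if (updatePending_) {
            updatePending_ = false;
            ++paintCount_;
        }
    }

    int paintCount() const { return paintCount_; }

private:
    bool updatePending_ = false;
    int paintCount_ = 0;
};
```

In this model a hundred requestUpdate() calls followed by one processEvents() cost a single paint, while a hundred paintNow() calls cost a hundred paints.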

I should give that a try and see if maybe it helps to not be relying on signals so much.

On the contrary, you should only(!) rely on Qt signals and slots. You're working threaded, so it's the best way to both decouple your components and have a thread-safe way of transferring reentrant data.

It's hard to give decent advice without code, but I hope this is somewhat helpful.
Cheers!

SysTech

You should already have a queue in the thread! Use that instead. Just emit a signal with each new piece of data and connect that to a slot of an object that resides in a separate thread. I suppose you haven't derived from QThread, or am I wrong?

Thank you. I have read that. I have a queue in the thread but was trying to see if moving the data out of the thread caused issues. I did not derive. I'm using the worker concept.

Well, if the data is almost ready, then you don't need so many threads.

This was my original design. Since the FFT data is so close to ready to draw (it is only missing grids and labels), I thought I could use the UDP thread to gather the data and signals to get it drawn.

This indeed does work very well up to a point where it begins to get a little slow and jerky and the UI starts to get kind of overloaded.

On the contrary, you should only(!) rely on Qt signals and slots. You're working threaded, so it's the best way to both decouple your components and have a thread-safe way of transferring reentrant data.

It's hard to give decent advice without code, but I hope this is somewhat helpful.

It is very helpful and the project is not downsizeable right now to be able to post small bits of code. If I continue to have issues I will boil things down to some examples and seek help.

My point about signals is this: I was relying 100% on signals. IE in my original invocation of this thing:

UDP thread got data, sent a signal
GUI thread received signal, draw

That was the simplest form. It works but like I said when the data rate starts to become about 30 fps it starts to smother the UI so it becomes somewhat unresponsive.

I need to try and figure out what is taking the time and for that I need some benchmarking which I don't have installed at the moment.

I'm going to try something this morning: Move all of the data queues back to the UDP thread. So it does one single job for the most part: receive data and push on to queues.

As each piece of data is received I emit a signal. Attach those signals to the Widgets that draw and try gathering data and calling "update".

The second thing I'm going to try is to not trigger updates on the signal but rather on a timer as you suggested before. So the widgets would have fairly high-speed timers polling for updates. I want to see if this works better.

Thanks again for the help and conceptual ideas. This is very helpful and at least gets me thinking about options.

kshegunov

Hello,

As each piece of data is received I emit a signal. Attach those signals to the Widgets that draw and try gathering data and calling "update".

The second thing I'm going to try is to not trigger updates on the signal but rather on a timer as you suggested before. So the widgets would have fairly high-speed timers polling for updates. I want to see if this works better.

Another thing you could try (and it would probably scale/perform much better) is to "request" data for display, instead of pushing the data from the worker threads. It seems your workers are very prolific, and even if you were able to display the data 60+ times per second it wouldn't be really visible/perceptible for the user. So suppose you set a fixed frame rate for display (let's say 30 fps). Then you start a timer for that frame rate and on each tick you request data from the workers (you can simply signal the worker object). Only then, in the slot handling the "request data" signal, does the worker send the data back (again with a signal) to the GUI thread for drawing. This would mean that you may skip a few frames, but I'm quite sure it'll be unnoticeable. Still, the worker should keep some history of the FFT data (for the "waterfall") and you may draw a few lines at once in the GUI thread, instead of a single one, but I think this would work much, much better.
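A rough, Qt-free sketch of the worker side of such a pull model follows. `PendingRows`, `push()` and `takeAll()` are hypothetical names, and the choice to drop the oldest row when the history is full is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

using Row = std::vector<float>;

// Hypothetical worker-side store: the receiving thread appends rows as fast
// as they arrive, the display side drains everything that accumulated since
// its last timer tick.
class PendingRows {
public:
    explicit PendingRows(std::size_t maxRows) : maxRows_(maxRows) {
        assert(maxRows > 0);
    }

    // Called by the producer for every incoming row.
    void push(Row row) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (rows_.size() == maxRows_) rows_.erase(rows_.begin());  // drop oldest
        rows_.push_back(std::move(row));
    }

    // Called once per display tick: hands over all pending rows, oldest
    // first, and leaves the store empty.
    std::vector<Row> takeAll() {
        std::vector<Row> out;
        std::lock_guard<std::mutex> lock(mutex_);
        out.swap(rows_);
        return out;
    }

private:
    std::size_t maxRows_;
    std::mutex mutex_;
    std::vector<Row> rows_;
};
```

In a Qt application the tick would come from a timer in the GUI thread, and the drained rows would be painted there in one go, possibly several waterfall lines per frame.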

Thanks again for the help and conceptual ideas. This is very helpful and at least gets me thinking about options.

No problem.

SysTech

I have to report that your ideas and concepts really did help.

I will mark the thread as solved. What I did was this:

In my low-level UDP thread I set up mutex-protected queues for the different kinds of data.
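For anyone wanting to try the same approach, a mutex-protected queue that can be polled might be sketched like this. It uses only the C++ standard library, and `LockedQueue` and `tryPop()` are hypothetical names rather than the actual code from this project.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Minimal queue guarded by a mutex. Producers call push(), polling
// consumers call tryPop(), which never blocks waiting for data.
template <typename T>
class LockedQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(value));
    }

    // Returns false when the queue is empty; otherwise moves the oldest
    // item into out and removes it from the queue.
    bool tryPop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::queue<T> items_;
};
```

One such queue per kind of data (FFT packets, waterfall packets) would match the description above. A polling consumer simply calls tryPop() in its loop and does other work, or sleeps briefly, when it returns false.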

I then have two processing threads. One that takes the FFT data from the queue in the UDP thread and draws a QImage with it. It pushes the QImage onto a mutex protected image list.

The other thread takes the waterfall data and processes it into a line list. Each time new waterfall data arrives it paints the lines onto a QImage and pushes that into a mutex protected list.

Basically the two processing threads are currently polling the UDP thread for more data to work on. This seems to work great. I have the option of using signals here too but I decided to try the polling method and see what it did. So far it seems rock solid.

I've even increased my polling thread count to 8 all banging on the UDP thread for data and have not seen any issues.

At the UI level I decided to use QTimer with a time of 0 since it states this will run as fast as possible but not block UI ops. In my first invocation the UI widgets with timer(0) calls are polling the thread image lists for new images to paint. If one is there then it grabs it, paints it to screen and removes it.

So far this is working wonders. I realize I'm not using signals at this point but I have an option to re-enable them. What I will say is this is working very very well as it stands.

The FFT frame rate is actually controllable by a command to the radio. So I don't really have to do that work in my GUI. I can just send a command to the radio that I want 10, 15 etc FPS and it just happens.

Likewise the waterfall data rate can be controlled as well. I have the option to limit what the user can select. In my testing on my MacBookPro with a core i7 I can easily support the maximum FPS and line rate on the waterfall for up to four displays.

This causes no delays, no backlogs or anything. For the FFT data I have a routine that keeps about 50 of the last FFT lines and I plot them in a 3D fashion. When I do this with a high frame rate after a while it can get a little behind when I have a bunch of displays working.

Anyway here is a short movie of the process in action with two displays:

https://dl.dropboxusercontent.com/u/7578983/3DPanaFall.mp4

As you can see a short time into the movie if I shift or change parameters I'm not currently killing the FFT history but that will be added.

The movie is showing about 15 fps for the FFT data and the waterfall data is about 50 ms I think between lines but I can't remember for sure.

Anyway this is all based on the help you gave me in thinking about things differently.

Thanks again!

mrjj

wow
That looks cool!

kshegunov

@SysTech
It certainly looks great. And what is even more impressive, it was done with Qt's raster paint engine (if I understood your setup correctly). Good job!

SysTech

@kshegunov said:

@SysTech
It certainly looks great. And what is even more impressive, it was done with Qt's raster paint engine (if I understood your setup correctly). Good job!

I believe that answer would be yes. I'm not doing anything special. In the paint event I get a painter and draw the QImage.

The vertical yellow lines are receivers that can be tuned to a specific frequency. Those are drawn AFTER the QImage is painted.

Not all issues are solved but it does work very well.

Thanks again!

SysTech

@mrjj said:

wow
That looks cool!

Thanks very much! Still working on how I want the UI to look. Those 3D displays are kind of handy for visualizing the radio output but they are right now pretty CPU intensive.

kshegunov

@SysTech

Thanks again!

You're very welcome.

Those 3D displays are kind of handy for visualizing the radio output but they are right now pretty CPU intensive.

If I may insert yet another suggestion here. While I don't believe you'd gain much by using OpenGL painting for the "waterfall" data, I think switching to it for the 3D displays would work better. A nice side effect would be that you can also render OpenGL from different threads, provided the appropriate locking mechanisms are in place. You could, as the simplest test, try using QOpenGLWidget for those FFT displays.

Kind regards.