QtWayland compositor very bad performance

Tags: wayland, weston, qt 5.8.0, qtwayland, glmark2
    tele wrote (post #1):

    Hi all,

    I'm testing QtWayland performance on an embedded armhf device with an i.MX6Q (imx6q) processor, an SoC that Qt 5 has supported for a long time.
    The Qt version is 5.8.0 and the OS is Debian Stretch.

    When I run the QtWayland Compositor: examples/qwindow-compositor
    $ glmark2-es2-wayland
    =======================================================
    glmark2 2014.03
    =======================================================
    OpenGL Information
    GL_VENDOR: Vivante Corporation
    GL_RENDERER: Vivante GC2000
    GL_VERSION: OpenGL ES 3.0 V5.0.11.p8.41671
    =======================================================
    [build] use-vbo=false: FPS: 119 FrameTime: 8.403 ms
    [build] use-vbo=true: FPS: 200 FrameTime: 5.000 ms
    <snip-snip-snip>
    ....
    =======================================================
    glmark2 Score: 102
    =======================================================

    When I run the QtWayland Compositor: examples/pure-qml
    $ glmark2-es2-wayland
    tele@stretch-dev:/opt/qt5/examples/qt3d$ glmark2-es2-wayland
    =======================================================
    glmark2 2014.03
    =======================================================
    OpenGL Information
    GL_VENDOR: Vivante Corporation
    GL_RENDERER: Vivante GC2000
    GL_VERSION: OpenGL ES 3.0 V5.0.11.p8.41671
    =======================================================
    [build] use-vbo=false: FPS: 72 FrameTime: 13.889 ms
    [build] use-vbo=true: FPS: 111 FrameTime: 9.009 ms
    <snip-snip-snip>
    ....
    =======================================================
    glmark2 Score: 79
    =======================================================

    When I run the good old Weston compositor:
    $ glmark2-es2-wayland
    tele@stretch-dev:/opt/qt5/examples/qt3d$ glmark2-es2-wayland
    =======================================================
    glmark2 2014.03
    =======================================================
    OpenGL Information
    GL_VENDOR: Vivante Corporation
    GL_RENDERER: Vivante GC2000
    GL_VERSION: OpenGL ES 3.0 V5.0.11.p8.41671
    =======================================================
    [build] use-vbo=false: FPS: 419 FrameTime: 2.387 ms
    [build] use-vbo=true: FPS: 682 FrameTime: 1.466 ms
    <snip-snip-snip>
    ....
    =======================================================
    glmark2 Score: 253
    =======================================================

    As you can see, the glmark2 scores with the same glmark2-es2-wayland client are:
    C++ QtWayland compositor (qwindow-compositor): 102
    QML QtWayland compositor (pure-qml): 79
    Pure C Weston compositor: 253

    Just for comparison, on X11 the glmark2-es2 score is 194 on this board.
    Wayland is supposed to be faster than X11, and Weston proves that: it's about 30% faster.
    But the QtWayland compositors are much worse; the pure-qml compositor is actually unusable in this embedded environment.
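
As a quick sanity check on the comparison, here is a small Python sketch that recomputes the percentages from the scores quoted in this post, using X11 as the baseline. It also converts FPS to frame time the same way glmark2 reports it. The function names are only illustrative; the numbers are the ones reported above, nothing new was measured.

```python
# glmark2 overall scores as reported in this post (same board, same client).
scores = {
    "X11": 194,
    "Weston": 253,
    "QtWayland C++ (qwindow-compositor)": 102,
    "QtWayland QML (pure-qml)": 79,
}


def relative_to_baseline(scores, baseline="X11"):
    """Return each score's percentage difference from the baseline score."""
    base = scores[baseline]
    return {name: round((value - base) / base * 100, 1)
            for name, value in scores.items()}


def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, pct in relative_to_baseline(scores).items():
        print(f"{name}: {pct:+.1f}% vs X11")
    print(f"72 FPS = {frame_time_ms(72):.3f} ms per frame")
```

With these inputs, Weston comes out roughly 30% above X11, while both QtWayland compositors land well below it.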

    My questions are:
    Is this expected? Can anyone confirm this? Or should I expect better performance? Do I need to check some settings?
    Everything seems to work fine: no errors, no warnings, and even the spinning cubes and animals look the same. It's not sluggish (of course, 72 fps is fast enough for the eye).
    In my opinion, Wayland is much more important on embedded platforms than on the desktop.
    On a desktop PC with a modern NVIDIA or AMD video card and good software, no graphics task on X11 is slow: these GPUs are extremely fast, and desktop effects are nothing to them. They only start to struggle with 4K, high-FPS 3D games, and those are not written for X11 or Wayland. So a faster Wayland makes no big difference on a PC.
    But on low-end embedded GPUs, Wayland could make a big difference compared to X11.
    In theory, at least. What I experience is the opposite: the Wayland compositor is much worse than X11, at least if it's QtWayland.

    This is a huge disappointment. Will this be any better in Qt 5.9? I mean the QtWayland Compositor class.

    Thanks,
    Laci

      SGaist (Lifetime Qt Champion) wrote (post #2):

      Hi and welcome to devnet,

      You should rather bring that question to the interest mailing list, where you'll find Qt's developers/maintainers. This forum is more user oriented.

      Interested in AI ? www.idiap.ch
      Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct


        tele wrote (post #3):

        Thank you, Gaist!
        I'm a noob here; this was my first question.

          nbaldy wrote (post #4):

          I'm finding that my Qt applications have very poor performance on top of the Weston compositor, even with the libEGL connection testing properly and RHI in use. Startup (the time between launching the example "calculator" app, fullscreened, and it showing up on the screen) takes about 3 seconds, with another 2-3 seconds before input is accepted. Once touch input is accepted, there is a ~1/2 second delay between the touch and the response. I have much better performance with EGLFS, but we want to use Wayland (so that we can use things like waylandvnc). I will report back once I do some testing with sway and/or the Qt compositor.

            SGaist (Lifetime Qt Champion) wrote (post #5), in reply to nbaldy:

            @nbaldy Hi and welcome to devnet,

            You do not say what your device/machine specifications are, nor the versions of Linux, Qt, and Wayland/Weston, nor the graphics setup of your device.

            All of these are important pieces to factor in when testing performance.
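
A minimal Python sketch of gathering that kind of information in one go might look like the following. The command list is an assumption and should be adapted to what is actually installed on the device (for example, `qmake` may not be present on a target image); any tool that is missing is simply reported as "not found".

```python
import platform
import shutil
import subprocess

# Assumed commands: adjust to the tools actually available on the device.
VERSION_COMMANDS = {
    "kernel": ["uname", "-a"],
    "weston": ["weston", "--version"],
    "qt": ["qmake", "-query", "QT_VERSION"],
}


def run_if_available(cmd):
    """Run cmd and return its first output line, or a note if it is missing."""
    if shutil.which(cmd[0]) is None:
        return "not found"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed: {exc}"
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "no output"


def collect_report():
    """Gather the machine architecture plus the version strings listed above."""
    report = {"machine": platform.machine()}
    for name, cmd in VERSION_COMMANDS.items():
        report[name] = run_if_available(cmd)
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

The output can be pasted into a post as is; GPU driver and EGL details would still need to be added by hand.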

              nbaldy wrote (post #6, last edited by nbaldy):

              Sorry, yes, good points.

              # ./sgx_check.sh
              WSEGL settings
              [default]
              WindowSystem=libpvrDRMWSEGL.so
              DefaultPixelFormat=RGB888
              #DefaultPixelFormat=RGB565
              
              ------
              ARM CPU information
              processor       : 0
              model name      : ARMv7 Processor rev 2 (v7l)
              BogoMIPS        : 597.60
              Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
              CPU implementer : 0x41
              CPU architecture: 7
              CPU variant     : 0x3
              CPU part        : 0xc08
              CPU revision    : 2
              
              Hardware        : Generic OMAP36xx (Flattened Device Tree)
              Revision        : 0000
              Serial          : 0000000000000000
              ------
              SGX driver information
              Version SGX_DDK sgxddk 1.17@4948957 (release) dm37xx_linux
              System Version String: SGX revision = 125
              ------
              Framebuffer settings
              
              mode "1280x720"
                  geometry 1280 720 1280 720 32
                  timings 0 0 0 0 0 0 0
                  accel true
                  rgba 8/16,8/8,8/0,0/0
              endmode
              
              Frame buffer device information:
                  Name        : omapdrmdrmfb
                  Address     : (nil)
                  Size        : 3686400
                  Type        : PACKED PIXELS
                  Visual      : TRUECOLOR
                  XPanStep    : 1
                  YPanStep    : 1
                  YWrapStep   : 0
                  LineLength  : 5120
                  Accelerator : No
              ------
              Rotation settings
              0
              ------
              PVR Module information
              Module                  Size  Used by
              pvrsrvkm              393216  2
              ------
              Boot settings
              console=ttyO0,115200n8 rootwait=1 rw ubi.mtd=7,512 rootfstype=ubifs root=ubi0:compu-XXXX mtdoops.mtddev=omap2.nand earlyprintk=ttyO0,115200n8 nohlt omapfb.rotate=0  vram=40M omapfb.vram=20M,1:1M,2:1M omapfb.vrfb=y cma=64MB 5
              ------
              Linux Kernel version
              Linux compu-XXXX 5.10.168-1-ctx-g991c5ce91e #1 SMP PREEMPT Fri Apr 7 09:34:04 UTC 2023 armv7l GNU/Linux
              ------
              Weston.ini
              [core]
              require-input=false
              idle-timeout=0
              gbm-format=xrgb8888
              #gbm-format=rgb565
              
              [output]
              name=DPI-1
              
              [libinput]
              touchscreen_calibrator=true
              calibration_helper=/bin/echo
              
              [shell]
              locking=false
              animation=none
              panel-position=none
              close-animation=none
              startup-animation=none
              focus-animation=none
              ------
              /etc/profile.d/qt_env.sh
              #!/bin/sh
              
              ### QT Environment Variables ###
              # export QT_QPA_EVDEV_TOUCHSCREEN_PARAMETERS="rotate=180"
              export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
              export QT_QPA_EGLFS_KMS_CONFIG=/etc/qt6/eglfs_kms_cfg.json
              #export QT_QPA_EGLFS_INTEGRATION=eglfs_kms
              export QT_QPA_EGLFS_ALWAYS_SET_MODE=1
              export QT_WAYLAND_SHELL_INTEGRATION=xdg-shell
              
              # SECCOMP-BPF Sandbox does not work due to unexpected FUTEX_UNLOCK_PI call
              # from the pthread implementation. Disable this feature temporarily until
              # those issues are resolved.
              export QTWEBENGINE_CHROMIUM_FLAGS="--disable-seccomp-filter-sandbox"
              
              export QT_QPA_EGLFS_INTEGRATION=none
              export QSG_RHI_PREFER_SOFTWARE_RENDERER=0
              export QT_WIDGETS_RHI_BACKEND=opengl
              export QT_WIDGETS_HIGHDPI_DOWNSCALE=1
              export QT_WIDGETS_RHI=1
              export QT_OPENGL_NO_SANITY_CHECK=1
              
              
              export QT_QPA_PLATFORM="wayland-egl"
              export QT_WAYLAND_CLIENT_BUFFER_INTEGRATION="linux-dmabuf-unstable-v1"
              export QT_WAYLAND_HARDWARE_INTEGRATION="linux-dmabuf-unstable-v1"
              export QT_WAYLAND_SERVER_BUFFER_INTEGRATION="linux-dmabuf-unstable-v1"
              export QT_WAYLAND_SHELL_INTEGRATION="xdg-shell"
              export QT_WAYLAND_TEXT_INPUT_PROTOCOL="zwp_text_input_v1"
              
              ---
              
              Version info:
              # weston --version
              weston 10.0.2
              
              nsions string:
                  EGL_EXT_client_extensions EGL_EXT_device_base
                  EGL_EXT_device_enumeration EGL_EXT_device_query EGL_EXT_platform_base
                  EGL_KHR_client_get_all_proc_addresses EGL_KHR_debug
                  EGL_EXT_platform_device EGL_EXT_platform_wayland
                  EGL_KHR_platform_wayland EGL_MESA_platform_gbm EGL_KHR_platform_gbm
                  EGL_MESA_platform_surfaceless
              
              GBM platform:
              MESA: info: Loaded libpvr_dri_support.so
              EGL API version: 1.4
              EGL vendor string: Mesa Project
              EGL version string: 1.4
              EGL client APIs: OpenGL_ES
              EGL extensions string:
                  EGL_EXT_buffer_age EGL_EXT_create_context_robustness
                  EGL_EXT_image_dma_buf_import EGL_EXT_yuv_surface
                  EGL_KHR_config_attribs EGL_KHR_create_context EGL_KHR_fence_sync
                  EGL_KHR_get_all_proc_addresses EGL_KHR_gl_renderbuffer_image
                  EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image
                  EGL_KHR_image EGL_KHR_image_base EGL_KHR_image_pixmap
                  EGL_KHR_no_config_context EGL_KHR_reusable_sync
                  EGL_KHR_surfaceless_context EGL_EXT_pixel_format_float
                  EGL_KHR_wait_sync EGL_MESA_configless_context EGL_MESA_drm_image
                  EGL_WL_bind_wayland_display EGL_IMG_cl_image
              Configurations:
                   bf lv colorbuffer dp st  ms    vis   cav bi  renderable  supported
                id sz  l  r  g  b  a th cl ns b    id   eat nd gl es es2 vg surfaces
              ---------------------------------------------------------------------
              0x01 32  0  8  8  8  8  0  0  0 0 0x34325241--      a     y  y     win,pb
              0x02 32  0  8  8  8  8  0  0  4 1 0x34325241--      a     y  y     win,pb
              0x03 32  0  8  8  8  8 24  8  0 0 0x34325241--      a     y  y     win,pb
              0x04 32  0  8  8  8  8 24  8  4 1 0x34325241--      a     y  y     win,pb
              0x05 24  0  8  8  8  0  0  0  0 0 0x34325258--      y     y  y     win,pb
              0x06 24  0  8  8  8  0  0  0  4 1 0x34325258--      y     y  y     win,pb
              0x07 24  0  8  8  8  0 24  8  0 0 0x34325258--      y     y  y     win,pb
              0x08 24  0  8  8  8  0 24  8  4 1 0x34325258--      y     y  y     win,pb
              0x09 16  0  5  6  5  0  0  0  0 0 0x36314752--      y     y  y     win,pb
              0x0a 16  0  5  6  5  0  0  0  4 1 0x36314752--      y     y  y     win,pb
              0x0b 16  0  5  6  5  0 24  8  0 0 0x36314752--      y     y  y     win,pb
              0x0c 16  0  5  6  5  0 24  8  4 1 0x36314752--      y     y  y     win,pb
              MESA: info: Unloaded libpvr_dri_support.so
              
              Wayland platform:
              MESA: info: Loaded libpvr_dri_support.so
              EGL API version: 1.4
              EGL vendor string: Mesa Project
              EGL version string: 1.4
              EGL client APIs: OpenGL_ES
              EGL extensions string:
                  EGL_EXT_buffer_age EGL_EXT_create_context_robustness
                  EGL_EXT_image_dma_buf_import EGL_EXT_present_opaque
                  EGL_EXT_swap_buffers_with_damage EGL_EXT_yuv_surface
                  EGL_KHR_config_attribs EGL_KHR_create_context EGL_KHR_fence_sync
                  EGL_KHR_get_all_proc_addresses EGL_KHR_gl_renderbuffer_image
                  EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image
                  EGL_KHR_image_base EGL_KHR_no_config_context EGL_KHR_reusable_sync
                  EGL_KHR_surfaceless_context EGL_KHR_swap_buffers_with_damage
                  EGL_EXT_pixel_format_float EGL_KHR_wait_sync
                  EGL_MESA_configless_context EGL_MESA_drm_image
                  EGL_WL_bind_wayland_display EGL_WL_create_wayland_buffer_from_image
                  EGL_IMG_cl_image
              Configurations:
                   bf lv colorbuffer dp st  ms    vis   cav bi  renderable  supported
                id sz  l  r  g  b  a th cl ns b    id   eat nd gl es es2 vg surfaces
              ---------------------------------------------------------------------
              0x01 32  0  8  8  8  8  0  0  0 0 0x00--      a     y  y     win,pb
              0x02 32  0  8  8  8  8  0  0  4 1 0x00--      a     y  y     win,pb
              0x03 32  0  8  8  8  8 24  8  0 0 0x00--      a     y  y     win,pb
              0x04 32  0  8  8  8  8 24  8  4 1 0x00--      a     y  y     win,pb
              0x05 24  0  8  8  8  0  0  0  0 0 0x00--      y     y  y     win,pb
              0x06 24  0  8  8  8  0  0  0  4 1 0x00--      y     y  y     win,pb
              0x07 24  0  8  8  8  0 24  8  0 0 0x00--      y     y  y     win,pb
              0x08 24  0  8  8  8  0 24  8  4 1 0x00--      y     y  y     win,pb
              MESA: info: Unloaded libpvr_dri_support.so
              
              
              ---
              QT Settings:
              
              export QT_QPA_PLATFORM="wayland-egl"
              export QT_WAYLAND_SHELL_INTEGRATION="xdg-shell"
              export QT_WIDGETS_RHI=1
              export QT_WIDGETS_RHI_BACKEND=opengl
              
              
              

              Results (all taken with the PVRTune server running, so that I could look at the GPU data):

              • Running rhiwindow (Qt 6.8.3):
                rhiwindow on the Weston compositor: ~3.5 fps
                rhiwindow on the Qt Fancy compositor: ~3 fps
                rhiwindow on the sway compositor: 0.5 fps (~2.3 fps when PVRTune is not running)
                rhiwindow without a compositor, using EGLFS: 35 fps
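
To put these rhiwindow numbers in proportion, the following Python sketch computes how many times slower each compositor setup is than the EGLFS run. It uses only the approximate frame rates listed above, so the ratios are rough; the function names are illustrative.

```python
# rhiwindow frame rates as reported above (Qt 6.8.3, PVRTune server running).
fps = {
    "EGLFS (no compositor)": 35.0,
    "Weston": 3.5,
    "Qt Fancy compositor": 3.0,
    "sway": 0.5,
}


def slowdown_vs(fps, baseline="EGLFS (no compositor)"):
    """Return how many times slower each setup is than the baseline."""
    base = fps[baseline]
    return {name: round(base / value, 1) for name, value in fps.items()}


def frame_time_ms(rate):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / rate


if __name__ == "__main__":
    factors = slowdown_vs(fps)
    for name, rate in fps.items():
        print(f"{name}: {rate} fps, {frame_time_ms(rate):.0f} ms/frame, "
              f"{factors[name]}x slower than EGLFS")
```

On these figures, every compositor path is at least ten times slower than EGLFS.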

              Note: Looks like GLES2 doesn't connect fully in Sway:

              • 00:00:01.840 [wlr] [render/gles2/renderer.c:704] Failed to create GLES2 renderer

              calculator (maximized):

              • Weston: ~1/4 second delay in reaction to touch. Quickly pressing a number 5 times takes 12 seconds to resolve all 5 presses (counting from the end of the last touch).

              • Qt Fancy compositor: ~1/4 second delay in reaction; 5 presses take 5 seconds to resolve. Note that maximizing the calculator fails here, so the window is not as large as in the Weston test. Error: Can't configure xdg_toplevel with an invalid size QSize(-1, -1)

              • Sway: several seconds between press and response.

              • Sway, run WITHOUT any RHI components (unset the QT_RHI... variables):

                • ~1/4 second delay in response, ~1 second to resolve all presses. Not utilizing the GPU at all.
                • When I turn off PVRTune, there is no noticeable delay in either case.

              I think this indicates that the problem is in my EGL setup with Qt rather than in the compositor, because in the pure-CPU case (sway without RHI) we are very fast. However, it should be noted that the weston-simple-egl application gets around 30 fps when fullscreened and 60 fps at about 1/2 size, and it does utilize the GPU. I will post this information in the PVR forum, as it could be a problem with my PowerVR EGL connection... but it's odd to me that the simple-egl test application in Weston works perfectly well, and that Qt with EGLFS (no compositor) is substantially faster: I get ~35 FPS when running rhiwindow and see good GPU utilization.

              So it's not Qt->EGL, and it's not Weston->EGL; it's Qt-><any compositor>->EGL that has the slowdown, and in simple applications like the calculator it ends up even slower than pure-CPU rendering.

              EDIT:
              I wrote up a post to debug the EGL/PowerVR side here: https://forums.imgtec.com/t/qt-slow-to-connect-to-pvr-using-weston/4167/2

              What I noticed while writing it was that there is a big difference in performance depending on whether Qt has to change the displayed calculator number or not. (Pressing "clear" 5x was at least twice as fast as pressing a number 5x.)
