
Unicode/UTF-8 encoding while getting data from site via QWebElement

    slesher
    wrote on 11 Aug 2015, 09:13 last edited by slesher 8 Nov 2015, 10:24

    Hello,
    I am fetching a list of films from a website; each film title sits inside a <div class="header">...</div> element.
    In previous versions of Qt (I now have Qt 5.5) the received text was displayed correctly, just as on the site.
    Now only Latin characters are shown directly; the others (Cyrillic) appear in their escaped representation.
    Reading other topics and posts about similar problems, I found that this is related to Unicode: the web page is UTF-8 encoded, and some people suggested parsing the text with a JSON parser.
    My code:

    Seeker::Seeker(QWidget *parent) :
        QDialog(parent),
        ui(new Ui::Seeker)
    {
        ui->setupUi(this);

        // Load the page; parse() runs once loading has finished.
        webView = new QWebView();
        webView->setUrl(QUrl("http://kino-butterfly.com.ua/cinema.php?sTheater=cosmopolite"));

        connect(webView, SIGNAL(loadFinished(bool)), this, SLOT(parse()));
    }

    void Seeker::parse()
    {
        qDebug() << "loading finished";
        qDebug() << webView->title();

        // Collect the text of every <div class="header"> element.
        // The list lives on the stack, so nothing is leaked.
        QStringList filmList;
        QWebFrame *frame = webView->page()->mainFrame();
        QWebElement document = frame->documentElement();
        QWebElementCollection elements = document.findAll("div.header");

        foreach (QWebElement element, elements)
            filmList.append(element.toPlainText());

        // Non-Latin titles show up as \uXXXX escapes in this output.
        foreach (const QString &film, filmList)
            qDebug() << film;
    }

    For example, I get "\u041B\u044E\u0434\u0438\u043D\u0430-\u043C\u0443\u0440\u0430\u0445\u0430 3D" instead of Людина-Мураха 3D. How can I get the actual text?

    UPD: I found that these \uXXXX values are the Unicode code points of the Ukrainian letters, exactly as listed in the Unicode code charts.
    Can anyone tell me whether QString has a function or method that gives me the plain letters?
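
A note on the question above, offered as a sketch rather than a confirmed diagnosis. QDebug normally prints a QString in quotes and shows some characters as \uXXXX escapes, so the string itself may already hold the correct letters and only the debug output is escaped. If that is the case here, `qDebug().noquote() << film` or `qDebug() << qUtf8Printable(film)` (both available since Qt 5.4) should print the text unescaped. If the text really does contain literal \uXXXX sequences, they can be decoded by hand. The standard C++ helper below, `decodeUnicodeEscapes`, is a hypothetical name and not part of Qt; it assumes four hex digits per escape and does not combine surrogate pairs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one code point (U+0000..U+FFFF) to `out` as UTF-8.
static void appendUtf8(std::string &out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Return the value of a hex digit, or -1 if `c` is not one.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replace each literal "\uXXXX" in `in` with the
// UTF-8 bytes of that code point; all other characters are copied as-is.
// Assumption: surrogate pairs are not combined into one code point.
std::string decodeUnicodeEscapes(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        // An escape needs six characters: '\', 'u' and four hex digits.
        bool isEscape = i + 6 <= in.size() && in[i] == '\\' && in[i + 1] == 'u';
        std::uint32_t cp = 0;
        for (std::size_t k = 2; isEscape && k < 6; ++k) {
            const int v = hexValue(in[i + k]);
            if (v < 0)
                isEscape = false;  // malformed escape: copy it verbatim
            else
                cp = (cp << 4) | static_cast<std::uint32_t>(v);
        }
        if (isEscape) {
            appendUtf8(out, cp);
            i += 6;
        } else {
            out += in[i];
            ++i;
        }
    }
    return out;
}
```

Passing the escaped title from the post to `decodeUnicodeEscapes` yields the UTF-8 bytes for "Людина-мураха 3D" (note that \u043C is the lowercase letter). The result can then be handed to Qt with `QString::fromUtf8(result.c_str())`.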
