
Problem about obtain text node with QWebElement

    billconan
    wrote on last edited by
    #1

    Hello there,

    I am trying to port a JavaScript library to C++ with QtWebKit. Most of it works, but there is one specific task I cannot do with QtWebKit.

    The task: when I find a <div> tag in the current document tree, I need to wrap each text node directly inside that <div> in <p></p>.

    For example, if I see a piece of HTML like this:

    @<div>hello there! how are you?<a href="xxx">xxx</a> I'm ok. </div>@

    I should output this:

    @<div><p>hello there! how are you?</p><a href="xxx">xxx</a><p> I'm ok.</p> </div>@

    But because QWebElement doesn't treat text nodes as web elements, I cannot see them in the QtWebKit document tree, so I cannot add "<p></p>" around them.

    I also tried a more general XML parser, QDomDocument with QDomNode. It works sometimes, but since an XML parser is stricter than the HTML parser, it fails when the inner HTML of the <div> tag contains <br>, <img>, or anything else it considers a tag mismatch.

    Now I don't know how to do this.

    Thank you.

      jichi
      wrote on last edited by
      #2

      I am experiencing exactly the same problem. I am writing an HTML parser and I wanted to capture text like this:

      <p>seg1<span>seg2</span>seg3</p>

      I could get either "seg2" or "seg1<span>seg2</span>seg3"; but what should I do to capture text between QWebElement siblings such as "seg1" and "seg3"?
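One way to get at "seg1" and "seg3" without a text-node API is to take the string returned by QWebElement::toInnerXml() and walk it by hand, keeping only the text that sits outside any child tag. The sketch below uses plain std::string so it runs without Qt. The function name topLevelText and the short list of void tags are assumptions made for illustration, not part of Qt, and the scan assumes no attribute value contains a '>' character.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Returns the text runs that sit directly at the top level of `inner`,
// i.e. outside any child element. Hypothetical helper, not a Qt API.
std::vector<std::string> topLevelText(const std::string &inner)
{
    // Elements that never take a closing tag (partial list, for illustration).
    static const std::set<std::string> voidTags = {"br", "img", "hr", "input", "meta", "link"};

    std::vector<std::string> segments;
    std::string current;
    int depth = 0;
    std::size_t i = 0;
    while (i < inner.size()) {
        if (inner[i] != '<') {
            if (depth == 0)
                current += inner[i];  // text outside every child element
            ++i;
            continue;
        }
        // A tag starts here: flush the text collected so far.
        if (!current.empty()) {
            segments.push_back(current);
            current.clear();
        }
        const std::size_t end = inner.find('>', i);
        if (end == std::string::npos)
            break;  // malformed input: stop rather than guess
        const std::string tag = inner.substr(i + 1, end - i - 1);
        i = end + 1;
        if (tag.empty() || tag[0] == '!')
            continue;  // empty brackets or a comment/doctype: depth unchanged
        if (tag[0] == '/') {
            if (depth > 0)
                --depth;  // closing tag
            continue;
        }
        // Extract the lower-cased element name.
        std::string name;
        for (char c : tag) {
            if (std::isspace(static_cast<unsigned char>(c)) || c == '/')
                break;
            name += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        const bool selfClosing = tag.back() == '/';
        if (!selfClosing && voidTags.count(name) == 0)
            ++depth;  // opening tag of a child element
    }
    if (!current.empty())
        segments.push_back(current);
    return segments;
}
```

Called on "seg1<span>seg2</span>seg3", this yields the two segments "seg1" and "seg3"; "seg2" is skipped because it is nested one level down.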

        Pawelitel
        wrote on last edited by
        #3

        @billconan
        My solution is to edit the HTML as a QString and then hand the result back to the parser:
        @// Wraps the text around each tag inside the first child of <body> in <p></p>
        // by rewriting that element's inner markup as a string.
        QString pTag(const QString &html)
        {
            QWebPage page;
            page.mainFrame()->setHtml(html);
            QWebElement div = page.mainFrame()->documentElement().findFirst("body").firstChild();
            QString innerHtml = div.toInnerXml();
            // Pass 1: double every bracket, so the original brackets can be
            // told apart from the ones inserted in pass 2.
            innerHtml.replace(QString("<"), QString("<<"));
            innerHtml.replace(QString(">"), QString(">>"));
            // Pass 2: close a paragraph before each tag and open one after it.
            innerHtml.replace(QString("<<"), QString("</p><"));
            innerHtml.replace(QString(">>"), QString("><p>"));
            innerHtml = "<p>" + innerHtml + "</p>";
            div.setInnerXml(innerHtml);
            return div.toOuterXml();
        }@
        Note: the literals must be "<p>" and "</p>" with no spaces inside. The forum's post parser would only display them with extra spaces.
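To see what the string rewriting in this post produces without setting up a QWebPage, here is a minimal sketch of the same two-pass replacement on std::string. replaceAll and wrapTextInParagraphs are hypothetical names introduced for this example; replaceAll only stands in for QString::replace, and the input is assumed to be what toInnerXml() would return.

```cpp
#include <string>

// Replaces every non-overlapping occurrence of `from` in `s` with `to`.
// Hypothetical helper standing in for QString::replace.
std::string replaceAll(std::string s, const std::string &from, const std::string &to)
{
    if (from.empty())
        return s;
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return s;
}

// Same steps as the QString version, minus the QtWebKit calls.
std::string wrapTextInParagraphs(std::string inner)
{
    // Pass 1: mark the original brackets by doubling them.
    inner = replaceAll(inner, "<", "<<");
    inner = replaceAll(inner, ">", ">>");
    // Pass 2: close a paragraph before each tag and open one after it.
    inner = replaceAll(inner, "<<", "</p><");
    inner = replaceAll(inner, ">>", "><p>");
    return "<p>" + inner + "</p>";
}
```

For the inner markup of the <div> in the first post, this returns `<p>hello there! how are you?</p><a href="xxx"><p>xxx</p></a><p> I'm ok. </p>`. The text inside the <a> is wrapped as well, and two adjacent tags get an empty <p></p> between them, so the result may need a cleanup step depending on what output is wanted.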

        live long and prosper

          Pawelitel
          wrote on last edited by
          #4

          @billconan
          Another solution would be to use CSS pseudo-elements: http://www.w3.org/TR/CSS2/selector.html#pseudo-element-selectors

          live long and prosper

            jichi
            wrote on last edited by
            #5

            @Pawelitel Why not directly replace "<" with "</p><"?

            Anyway, thanks a lot for your hints! I wish Nokia would extend the current API in the near future.

              jichi
              wrote on last edited by
              #6

              I see why. What a clever trick! Thanks again! >_<
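For anyone else wondering why the direct replacement does not work: once "<" has been replaced with "</p><", the string contains new ">" characters from the inserted "</p>", and the second replacement hits those too. A small sketch on std::string makes this visible. replaceAll and naiveWrap are hypothetical names used only for this illustration.

```cpp
#include <string>

// Replaces every non-overlapping occurrence of `from` in `s` with `to`.
// Hypothetical helper standing in for QString::replace.
std::string replaceAll(std::string s, const std::string &from, const std::string &to)
{
    if (from.empty())
        return s;
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return s;
}

// The direct approach asked about above: no doubling pass first.
std::string naiveWrap(std::string inner)
{
    inner = replaceAll(inner, "<", "</p><");
    // This also matches the '>' of every "</p>" inserted one line earlier.
    inner = replaceAll(inner, ">", "><p>");
    return "<p>" + inner + "</p>";
}
```

With "hello<br>world" as input, naiveWrap returns `<p>hello</p><p><br><p>world</p>`, with a stray <p> in front of the <br>. Doubling the brackets first means the second pass only touches brackets that were in the original markup.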

                neuxb
                wrote on last edited by
                #7

                Hi, you can read the href attribute of a QWebElement with:
                element.attribute("href")
                which returns, for example, "http://www.domain.com".
