Use only * wildcard in QString or QRegularExpression to match?
-
Hi, I am trying to find a way to use just the * wildcard for matching in QString or QRegularExpression. For example, it should behave something like this:

    QString string{"This is a string"};
    string.contains("th*str", Qt::CaseInsensitive); // should return true
    string.contains("th*is", Qt::CaseInsensitive);  // should return true

I know I can use .* in QRegularExpression, but then it also interprets [, ], etc. as special characters. My input comes directly from the user and can contain characters like [, ], (, ), etc. Is it possible to use only * as a single wildcard that matches any number of characters when searching a string?
-
@CJha
I don't understand what you (think) you want. You need to use QRegularExpression with .* for that.
"My input is coming directly from the user and it can contain characters like [,], (,), etc."
You mean the string to search is coming from the user? Then it does not matter if it contains any "special" characters; they are only significant in the regular expression pattern.
Do you (really) mean that the user is not supplying the text to search but rather the pattern to use as the match? In that case, pre-process/escape the pattern before using it.
Please make clear whether the user is supplying the string(s) to search or the pattern to match.
-
@JonB Hi! The user is supplying a string to search, but the list of strings is too long so I want to provide just one wildcard (say '*') so that if the user remembers part of the string he/she can find it. For example, if the user is looking for a string "This a long string with many words" but does not remember what it says between 'long' and 'words' then the user can simply search for the string by using "long *words". The strings also contain square brackets, for example, "this is a [strong] string", and so if the user wants to search only the string that has square brackets in it then he/she can just search for "[ *]" (adding space so that it does not become italic).
-
@CJha
OK, so the user is supplying the pattern to search for, not the data to be searched.
If you want the user to type just * to represent "any number of characters", then pre-process the pattern string they type to change * to .* before submitting it as a regular expression search. And if what they type may contain other characters which are regular expression "special" characters but which you do not want treated like that, protect/escape them.
It looks to me that maybe
QRegularExpression QRegularExpression::fromWildcard(QStringView pattern, Qt::CaseSensitivity cs = Qt::CaseInsensitive, QRegularExpression::WildcardConversionOptions options = DefaultWildcardConversion)
from 6.0+ might do what you want: produce a regular expression from a wildcard string, which I would guess means * is "any number of characters" and ? is "any one character". You would have to check whether that is the case. Whether it will also treat [...] as "any of the enclosed characters" (like a Linux shell wildcard would) I do not know.
You're going to have to be very specific about just which characters you want treated as "special" and which you want only as "literal".
UPDATE
Use only * wildcard
If that is the only thing you want, and assuming you want to use a regular expression rather than doing it yourself in code, a "cheaty" way which might be adequate for you is:
- Take the search string from the user.
- Change any *s to some literal character/sequence which you are happy the user won't use and which is not special in a regular expression. Maybe ~ or # would do?
- Call QString QRegularExpression::escape(const QString &str) to escape any and all other characters to literals.
- Now replace the ~ or # with .*.
Yes, people do stuff like this to get a desired regular expression :)
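For anyone who wants to try the four steps above without a Qt project at hand, here is a minimal sketch in standard C++. It is not Qt code: std::regex stands in for QRegularExpression, and escapeForRegex() is a hand-written stand-in for QRegularExpression::escape() that assumes ASCII input. The helper names and the placeholder token are made up for illustration. The placeholder is built from letters and underscores only, on the assumption that the escaping step leaves those characters alone and that the user never types the token.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Stand-in for QRegularExpression::escape(): put a backslash in front of
// every character that is not an ASCII letter, a digit or '_'.
// Assumption: the input is plain ASCII.
std::string escapeForRegex(const std::string &text)
{
    std::string out;
    for (unsigned char c : text) {
        if (!std::isalnum(c) && c != '_')
            out += '\\';
        out += static_cast<char>(c);
    }
    return out;
}

// Replace every occurrence of `from` (must be non-empty) with `to`.
std::string replaceAll(std::string text, const std::string &from, const std::string &to)
{
    for (std::size_t pos = 0; (pos = text.find(from, pos)) != std::string::npos; pos += to.size())
        text.replace(pos, from.size(), to);
    return text;
}

// Hypothetical helper: convert user input in which only '*' is a wildcard
// into a regular expression pattern.
std::string starPatternViaPlaceholder(const std::string &userInput)
{
    // Assumption: the user never types this token.
    const std::string placeholder = "Wildcard_Placeholder_Token";
    std::string pattern = replaceAll(userInput, "*", placeholder); // hide the '*'s
    pattern = escapeForRegex(pattern);                             // make the rest literal
    return replaceAll(pattern, placeholder, ".*");                 // bring the wildcards back
}

// Case-insensitive "contains" using the converted pattern.
bool containsStar(const std::string &haystack, const std::string &userInput)
{
    const std::regex re(starPatternViaPlaceholder(userInput), std::regex::icase);
    return std::regex_search(haystack, re);
}
```

With this, containsStar("This is a string", "th*str") should be true, and a pattern like "[*]" should only match strings that really contain square brackets. In a Qt program the same steps would presumably be done with QString::replace() and QRegularExpression::escape(), with QRegularExpression::CaseInsensitiveOption set on the resulting expression.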
-
@JonB Thanks, this method is great! It works, but even ~ and # are treated as special characters and get escaped. I even tried a non-ASCII character, © (copyright sign, 0x00A9), but that was still treated as special and escaped. I finally tried a string: I replaced * with Replace_Character_Sequence, and this worked perfectly.
-
@CJha
That is great. I might be forgetting, but I don't think ~ or # are actually regular expression special characters. However, the escape() method may replace every punctuation character (including non-ASCII characters like the copyright sign) with \ followed by that character, because that never does any harm and is allowed on any punctuation character. Alternatively, it might be enough to let * go to \* and then revert any \*s to plain *s (actually .* in your case).
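The last suggestion (escape everything first, then turn each \* back into a wildcard) needs no placeholder at all. Below is a minimal sketch of it in standard C++, again with std::regex standing in for QRegularExpression and a hand-written escapeAllPunctuation() standing in for QRegularExpression::escape(). The function names are made up for illustration and ASCII input is assumed.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Stand-in for QRegularExpression::escape(): backslash-escape every
// character that is not an ASCII letter, a digit or '_'.
// Assumption: the input is plain ASCII.
std::string escapeAllPunctuation(const std::string &text)
{
    std::string out;
    for (unsigned char c : text) {
        if (!std::isalnum(c) && c != '_')
            out += '\\';
        out += static_cast<char>(c);
    }
    return out;
}

// Hypothetical helper: escape the whole input, then turn each escaped
// star ("\*") into ".*" so that '*' is the only wildcard left.
std::string starPatternViaEscape(const std::string &userInput)
{
    std::string pattern = escapeAllPunctuation(userInput); // every '*' is now "\*"
    const std::string escapedStar = "\\*";
    const std::string anyRun = ".*";
    for (std::size_t pos = 0;
         (pos = pattern.find(escapedStar, pos)) != std::string::npos;
         pos += anyRun.size())
        pattern.replace(pos, escapedStar.size(), anyRun);
    return pattern;
}

// Case-insensitive search with the converted pattern.
bool matchesStarPattern(const std::string &haystack, const std::string &userInput)
{
    const std::regex re(starPatternViaEscape(userInput), std::regex::icase);
    return std::regex_search(haystack, re);
}
```

Because the escaping step doubles every backslash the user typed, the backslash directly in front of a * in the escaped string is always the one the escaping step added, so a literal backslash in the input is not swallowed by the replacement.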