Qt 6.10 and text encodings
-
Qt 5 supports the following encodings on Windows and Mac:
ISO-8859-1 ,latin1 ,CP819 ,IBM819 ,iso-ir-100 ,csISOLatin1 ,ISO-8859-15 ,latin9 ,UTF-32LE ,UTF-32BE ,UTF-32 ,UTF-16LE ,UTF-16BE ,UTF-16 ,Big5-HKSCS ,Big5 ,Big5-ETen ,CP950 ,windows-949 ,CP949 ,EUC-KR ,Shift_JIS ,SJIS ,MS_Kanji ,ISO-2022-JP ,JIS7 ,EUC-JP ,GB2312 ,GBK ,CP936 ,MS936 ,windows-936 ,GB18030 ,hp-roman8 ,roman8 ,csHPRoman8 ,TIS-620 ,ISO 8859-11 ,WINSAMI2 ,WS2 ,macintosh ,Apple Roman ,MacRoman ,windows-1258 ,CP1258 ,windows-1257 ,CP1257 ,windows-1256 ,CP1256 ,windows-1255 ,CP1255 ,windows-1254 ,CP1254 ,windows-1253 ,CP1253 ,windows-1252 ,CP1252 ,windows-1251 ,CP1251 ,windows-1250 ,CP1250 ,IBM866 ,CP866 ,csIBM866 ,IBM874 ,CP874 ,IBM850 ,CP850 ,csPC850Multilingual ,ISO-8859-16 ,iso-ir-226 ,latin10 ,ISO-8859-14 ,iso-ir-199 ,latin8 ,iso-celtic ,ISO-8859-13 ,ISO-8859-10 ,iso-ir-157 ,latin6 ,ISO-8859-10:1992 ,csISOLatin6 ,ISO-8859-9 ,iso-ir-148 ,latin5 ,csISOLatin5 ,ISO-8859-8 ,ISO 8859-8-I ,iso-ir-138 ,hebrew ,csISOLatinHebrew ,ISO-8859-7 ,ECMA-118 ,greek ,iso-ir-126 ,csISOLatinGreek ,ISO-8859-6 ,ISO-8859-6-I ,ECMA-114 ,ASMO-708 ,arabic ,iso-ir-127 ,csISOLatinArabic ,ISO-8859-5 ,cyrillic ,iso-ir-144 ,csISOLatinCyrillic ,ISO-8859-4 ,latin4 ,iso-ir-110 ,csISOLatin4 ,ISO-8859-3 ,latin3 ,iso-ir-109 ,csISOLatin3 ,ISO-8859-2 ,latin2 ,iso-ir-101 ,csISOLatin2 ,KOI8-U ,KOI8-RU ,KOI8-R ,csKOI8R ,iscii-mlm ,iscii-knd ,iscii-tlg ,iscii-tml ,iscii-ori ,iscii-gjr ,iscii-pnj ,iscii-bng ,iscii-dev ,TSCII
Qt 6 currently supports just the following encodings on Windows and Mac:
UTF-16 ,UTF-16LE ,UTF-16BE ,UTF-32 ,UTF-32LE ,UTF-32BE ,ISO-8859-1
I understand that from Qt 6.10 a lot more encodings are going to be supported:
https://bugreports.qt.io/browse/QTBUG-132056
Can anyone tell me if all the original Qt 5 encodings are going to be supported in Qt 6.10 on both Windows and Mac?
-
The Qt 6.10 release candidate (MinGW_64) only seems to support these encodings in QStringConverter::availableCodecs():
UTF-16 ,UTF-16LE ,UTF-16BE ,UTF-32 ,UTF-32LE ,UTF-32BE ,ISO-8859-1
@AndyBrice that list doesn't seem to come from ICU at all. I think these are just the codecs that are built into Qt 6 itself.
Please try with MSVC, since MSVC comes with ICU from the Windows SDK.
I don't think MinGW comes with ICU bundled, and I'm not sure if Qt has packaged a version for MinGW as part of the provisioning. This is done for Linux, but for Windows I don't know.
Please note that there is also LLVM-MinGW, but I don't think that has ICU either.
If MSVC works, then I suggest opening a bug report at https://bugreports.qt.io/ regarding MinGW and LLVM MinGW.
-
@cristian-adam I tried the MinGW_64 version because I don't have Visual Studio 2022 installed. Has anyone got Qt 6.10 and Visual Studio 2022 installed? If so, please run this and let me know what the output is:
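#include <QCoreApplication>
#include <QStringConverter>
#include <QtDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    // Print every codec name this Qt build reports.
    const auto codecs = QStringConverter::availableCodecs();
    for (const QString &s : codecs)
        qDebug() << s;
    return 0; // nothing needs the event loop, so exit instead of calling a.exec()
}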
-
LLVM MinGW 17.0.6
C:\Projects\C++\TextEncodings\build\Desktop_Qt_6_10_0_llvm_mingw_64_bit-Debug\TextEncodings.exe... "UTF-8" "UTF-16" "UTF-16LE" "UTF-16BE" "UTF-32" "UTF-32LE" "UTF-32BE" "ISO-8859-1"
MinGW 13.1.0
C:\Projects\C++\TextEncodings\build\Desktop_Qt_6_10_0_MinGW_64_bit-Debug\TextEncodings.exe... "UTF-8" "UTF-16" "UTF-16LE" "UTF-16BE" "UTF-32" "UTF-32LE" "UTF-32BE" "ISO-8859-1"
MSVC 2022
C:\Projects\C++\TextEncodings\build\Desktop_Qt_6_10_0_MSVC2022_64bit-Debug\TextEncodings.exe... "Locale" "UTF-8" "UTF-16" "UTF-16BE" "UTF-16LE" "UTF-32" "UTF-32BE" "UTF-32LE" "UTF16_PlatformEndian" "UTF16_OppositeEndian" "UTF32_PlatformEndian" "UTF32_OppositeEndian" "UTF-16BE,version=1" "UTF-16LE,version=1" "UTF-16,version=1" "UTF-16,version=2" "UTF-7" "IMAP-mailbox-name" "SCSU" "BOCU-1" "CESU-8" "ISO-8859-1" "US-ASCII" "GB18030" "ISO-8859-2" "ISO-8859-3" "ISO-8859-4" "ISO-8859-5" "ISO-8859-6" "ISO-8859-7" "ibm-813_P100-1995" "ISO-8859-8" "ibm-916_P100-1995" "ISO-8859-9" "ISO-8859-10" "iso-8859_11-2001" "ISO-8859-13" "ISO-8859-14" "ISO-8859-15" "ibm-942_P12A-1999" "Shift_JIS" "ibm-943_P130-1999" "ibm-33722_P12A_P12A-2009_U2" "ibm-33722_P120-1999" "ibm-954_P101-2007" "EUC-JP" "ibm-1373_P100-2002" "Big5" "ibm-950_P110-1999" "Big5-HKSCS" "ibm-5471_P100-2006" "ibm-1386_P100-2001" "GBK" "GB2312" "GB_2312-80" "euc-tw-2014" "ibm-964_P110-1999" "ibm-949_P110-1999" "ibm-949_P11A-1999" "EUC-KR" "ibm-971_P100-1995" "cp1363" "ibm-1363_P110-1997" "KSC_5601" "windows-874-2000" "TIS-620" "ibm-1162_P100-1999" "IBM437" "ibm-720_P100-1997" "ibm-737_P100-1997" "IBM775" "IBM850" "cp851" "IBM852" "IBM855" "ibm-856_P100-1995" "IBM857" "IBM00858" "IBM860" "IBM861" "IBM862" "IBM863" "IBM864" "IBM865" "IBM866" "ibm-867_P100-1998" "IBM868" "IBM869" "KOI8-R" "ibm-901_P100-1999" "ibm-902_P100-1999" "ibm-922_P100-1999" "KOI8-U" "ibm-4909_P100-1999" "windows-1250" "windows-1251" "windows-1252" "windows-1253" "windows-1254" "windows-1255" "windows-1256" "windows-1257" "windows-1258" "ibm-1250_P100-1995" "ibm-1251_P100-1995" "ibm-1252_P100-2000" "ibm-1253_P100-1995" "ibm-1254_P100-1995" "ibm-1255_P100-1995" "ibm-5351_P100-1998" "ibm-1256_P110-1997" "ibm-5352_P100-1998" "ibm-1257_P100-1995" "ibm-5353_P100-1998" "ibm-1258_P100-1997" "macintosh" "x-mac-greek" "x-mac-cyrillic" "x-mac-centraleurroman" "x-mac-turkish" "hp-roman8" "Adobe-Standard-Encoding" "ibm-1006_P100-1995" "ibm-1098_P100-1995" 
"ibm-1124_P100-1996" "ibm-1125_P100-1997" "ibm-1129_P100-1997" "ibm-1131_P100-1997" "ibm-1133_P100-1997" "gsm-03.38-2009" "ISO-2022-JP" "ISO-2022-JP-1" "ISO-2022-JP-2" "ISO_2022,locale=ja,version=3" "ISO_2022,locale=ja,version=4" "ISO-2022-KR" "ISO_2022,locale=ko,version=1" "ISO-2022-CN" "ISO-2022-CN-EXT" "ISO_2022,locale=zh,version=2" "HZ-GB-2312" "x11-compound-text" "ISCII,version=0" "ISCII,version=1" "ISCII,version=2" "ISCII,version=3" "ISCII,version=4" "ISCII,version=5" "ISCII,version=6" "ISCII,version=7" "ISCII,version=8" "LMBCS-1" "IBM037" "IBM273" "IBM277" "IBM278" "IBM280" "IBM284" "IBM285" "IBM290" "IBM297" "IBM420" "IBM424" "IBM500" "ibm-803_P100-1999" "IBM-Thai" "IBM870" "IBM871" "ibm-875_P100-1995" "IBM918" "ibm-930_P120-1999" "ibm-933_P110-1995" "ibm-935_P110-1999" "ibm-937_P110-1999" "ibm-939_P120-1999" "ibm-1025_P100-1995" "IBM1026" "IBM1047" "ibm-1097_P100-1995" "ibm-1112_P100-1995" "ibm-1122_P100-1999" "ibm-1123_P100-1995" "ibm-1130_P100-1997" "ibm-1132_P100-1998" "ibm-1137_P100-1999" "ibm-4517_P100-2005" "IBM01140" "IBM01141" "IBM01142" "IBM01143" "IBM01144" "IBM01145" "IBM01146" "IBM01147" "IBM01148" "IBM01149" "ibm-1153_P100-1999" "ibm-1154_P100-1999" "ibm-1155_P100-1999" "ibm-1156_P100-1999" "ibm-1157_P100-1999" "ibm-1158_P100-1999" "ibm-1160_P100-1999" "ibm-1164_P100-1999" "ibm-1364_P110-2007" "ibm-1371_P100-1999" "ibm-1388_P103-2001" "ibm-1390_P110-2003" "ibm-1399_P110-2003" "ibm-5123_P100-1999" "ibm-8482_P100-1999" "ibm-16684_P110-2003" "ibm-4899_P100-1998" "ibm-4971_P100-1999" "ibm-9067_X100-2005" "ibm-12712_P100-1998" "ibm-16804_X110-1999" "ibm-37_P100-1995,swaplfnl" "ibm-1047_P100-1995,swaplfnl" "ibm-1140_P100-1997,swaplfnl" "ibm-1141_P100-1997,swaplfnl" "ibm-1142_P100-1997,swaplfnl" "ibm-1143_P100-1997,swaplfnl" "ibm-1144_P100-1997,swaplfnl" "ibm-1145_P100-1997,swaplfnl" "ibm-1146_P100-1997,swaplfnl" "ibm-1147_P100-1997,swaplfnl" "ibm-1148_P100-1997,swaplfnl" "ibm-1149_P100-1997,swaplfnl" "ibm-1153_P100-1999,swaplfnl" 
"ibm-12712_P100-1998,swaplfnl" "ibm-16804_X110-1999,swaplfnl" "ebcdic-xml-us"
As suspected. MinGW is treated as a second-class citizen ... 🤷🏻‍♂️
Edit: I've opened up https://bugreports.qt.io/browse/QTBUG-140865
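To answer the original question against the MSVC output, the two lists can be compared mechanically. Here is a rough sketch in plain C++ (no Qt needed) that reports which names from one list have no case-insensitive match in another. The function names normalized and missingNames are invented for this example. Note the assumption that matters: it only compares the literal names, so an alias such as latin1 will be reported as missing even if the encoding is available under its canonical name.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Lower-case copy of an ASCII encoding name, used as a comparison key.
std::string normalized(std::string name)
{
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// Names from `wanted` that have no case-insensitive match in `available`.
// The order of `wanted` is preserved in the result.
std::vector<std::string> missingNames(const std::vector<std::string> &wanted,
                                      const std::vector<std::string> &available)
{
    std::set<std::string> have;
    for (const auto &name : available)
        have.insert(normalized(name));

    std::vector<std::string> missing;
    for (const auto &name : wanted) {
        if (have.count(normalized(name)) == 0)
            missing.push_back(name);
    }
    return missing;
}
```

Feeding it the Qt 5 list as `wanted` and the MSVC 2022 output as `available` gives an upper bound on what is missing. Anything it reports still needs checking by name against the actual build, since alias resolution is not modelled here.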
-
I've updated https://wiki.qt.io/Compiling-ICU-with-MinGW - Qt 6 can be compiled with the ICU provided by MinGW.