UTF-8 in C++



#1
scicatur

What would be the best (most portable) way to handle UTF-8 characters in C++ code?

WinAPI defines a bunch of types such as WCHAR and TCHAR, but I don't want to use anything Microsoft-specific.

wchar_t is standard, but its width is implementation-defined (16 bits on Windows, typically 32 bits elsewhere), while UTF-8 is a variable-length encoding in which one character takes 1 to 4 bytes (the original design allowed up to 6).

It would also be nice to have something like the STL <string> class, but for UTF-8.

If you know a solution or have ideas, I would be most pleased to hear them.
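
One point worth illustrating here: because UTF-8 is just a sequence of bytes, a plain std::string can store it unchanged, and only operations that care about characters (such as counting them) need to look at the encoding. The following is a minimal sketch under that assumption, not a full UTF-8 string class. The function name utf8_length is hypothetical, the input is assumed to be valid UTF-8, and the range-based for loop needs C++11.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 encoded std::string.
// Assumes the input is valid UTF-8. Continuation bytes always have the bit
// pattern 10xxxxxx, so every byte that does NOT match it starts a code point.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte
        }
    }
    return count;
}
```

Note that s.size() still reports the number of bytes, so the two values differ whenever the text contains non-ASCII characters.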
#2
bdlt

I haven't used this, but it may be somewhat 'portable':
http://www.utilityco...tr/default.aspx

Unicode is probably more portable (no guarantee in C++, however).

Sales pitch: consider using Java on your next project if portability is a requirement.

#3
scicatur

Thanks for the answer. It looks good, except that it is commercial.

I think I thought up a kind of workaround for the problem. Because UTF-8 is fundamentally just bytes, I read the file into an 'unsigned char' array and then use a custom function to convert it into an STL <string>. I may write that UTF-8-to-ANSI conversion function myself, BUT it can never be perfect, because UTF-8 covers all of Unicode while ANSI does not. Which characters cannot be converted depends on the ANSI code page in use: Japanese, for example, converts only when the code page is a Japanese one (such as code page 932). Then of course I must write an ANSI-to-UTF-8 function as well.
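
The two conversion functions described above could look roughly like the sketch below. It is only an illustration under a loud assumption: "ANSI" is taken to mean ISO-8859-1 (Latin-1), chosen because its byte values map one-to-one onto U+0000..U+00FF. A real Windows code page such as 1252 or 932 would need a lookup table instead. The names latin1_to_utf8 and utf8_to_latin1 are hypothetical, and every character that does not fit is replaced with '?'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: ISO-8859-1 bytes to UTF-8 (assumes "ANSI" means Latin-1).
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);                  // ASCII is unchanged
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));    // lead byte 110000xx
            out += static_cast<char>(0x80 | (c & 0x3F));  // continuation 10xxxxxx
        }
    }
    return out;
}

// Hypothetical helper: UTF-8 to ISO-8859-1. Code points above U+00FF and
// malformed sequences are replaced by a single '?'.
std::string utf8_to_latin1(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {                                   // ASCII is unchanged
            out += static_cast<char>(c);
            ++i;
            continue;
        }
        // U+0080..U+00FF are encoded as lead byte 0xC2 or 0xC3 plus one continuation byte.
        if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size()) {
            unsigned char d = static_cast<unsigned char>(in[i + 1]);
            if ((d & 0xC0) == 0x80) {
                out += static_cast<char>(((c & 0x03) << 6) | (d & 0x3F));
                i += 2;
                continue;
            }
        }
        // Not representable in Latin-1: emit '?' and skip the rest of the sequence.
        out += '?';
        ++i;
        while (i < in.size() &&
               (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
            ++i;
        }
    }
    return out;
}
```

With these definitions, a Japanese character such as U+65E5 (bytes E6 97 A5) comes out as a single '?', which shows why the round trip is lossy for anything outside the target code page.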