SFML community forums

Help => System => Topic started by: Danetta on December 15, 2016, 03:52:07 am

Title: UTF-8 and stuff
Post by: Danetta on December 15, 2016, 03:52:07 am
Hello.

EDIT: It seems like I resolved almost all the issues.

EDIT2: How many bytes could I possibly need when converting from sf::String to std::string or std::wstring?
I mean the case where I have to resize an empty string so that the converted one fits. Probably an easy question, but I can't find an answer on Google.
Title: UTF-8 and stuff
Post by: eXpl0it3r on December 15, 2016, 11:06:22 am
Just use the conversion functions that SFML provides; see the documentation.
Title: Re: UTF-8 and stuff
Post by: Danetta on December 15, 2016, 04:01:51 pm
Almost all of them require three iterators: begin of the original string | end of the original string | begin of the output string.
The output string is empty by default, and most of the conversion functions listed there (http://www.sfml-dev.org/documentation/2.4.1/classsf_1_1Utf_3_018_01_4.php) give me
string iterator + offset out of range
if I do not set the size of output string manually like that:

std::string ANSI_to_UTF8(const sf::String& original)
{
        std::string ansi;
        ansi.resize(original.getSize() * 4);

        std::string::iterator last = sf::Utf<8>::fromAnsi(original.begin(), original.end(), ansi.begin(), std::locale("Russian"));
        ansi.resize(last - ansi.begin());
        return ansi;
}
Title: Re: UTF-8 and stuff
Post by: eXpl0it3r on December 15, 2016, 05:35:58 pm
Maybe describe your problem in detail, because I don't really understand the issue. Or maybe you just misunderstood the API?

If you have a sf::String that is UTF and want to convert to an ANSI string, all you have to do is call toAnsiString() (and toWideString() for wstring).

If you want to do it manually, you could also take a look how SFML does it internally (https://github.com/SFML/SFML/blob/master/src/SFML/System/String.cpp#L150) for toAnsi.
Title: Re: UTF-8 and stuff
Post by: dabbertorres on December 15, 2016, 05:50:09 pm
Quote
string iterator + offset out of range
if I do not set the size of output string manually like that
You want std::back_inserter (http://en.cppreference.com/w/cpp/iterator/back_inserter)
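For illustration, here is a minimal sketch of the std::back_inserter approach. It does not use SFML: encodeUtf8 and toUtf8String are hypothetical stand-ins that only mimic the (begin, end, output) shape of the sf::Utf functions, and the encoder does no validation of its input. The point is that the output string starts empty and grows as bytes are written, so no resize() is needed.

```cpp
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical stand-in with the same (begin, end, output) shape as the
// sf::Utf<8> functions: reads UTF-32 code points and writes UTF-8 bytes.
// No validation is done (surrogates and values above U+10FFFF are not rejected).
template <typename In, typename Out>
Out encodeUtf8(In begin, In end, Out out)
{
    for (; begin != end; ++begin)
    {
        const std::uint32_t cp = static_cast<std::uint32_t>(*begin);
        if (cp < 0x80)
        {
            // 1 byte: plain ASCII
            *out++ = static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            // 2 bytes
            *out++ = static_cast<char>(0xC0 | (cp >> 6));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            // 3 bytes
            *out++ = static_cast<char>(0xE0 | (cp >> 12));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            // 4 bytes: the most a single code point can take in UTF-8
            *out++ = static_cast<char>(0xF0 | (cp >> 18));
            *out++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: the output starts empty and std::back_inserter
// appends each byte with push_back, so the string grows on demand.
std::string toUtf8String(const std::u32string& text)
{
    std::string output;
    encodeUtf8(text.begin(), text.end(), std::back_inserter(output));
    return output;
}
```

With the real sf::Utf<8> functions, the same idea would presumably be to pass std::back_inserter(output) where the posted code passes ansi.begin().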
Title: Re: UTF-8 and stuff
Post by: Danetta on December 15, 2016, 08:20:16 pm
Quote
Maybe describe your problem in detail, because I don't really understand the issue. Or maybe you just misunderstood the API?

If you have a sf::String that is UTF and want to convert to an ANSI string, all you have to do is call toAnsiString() (and toWideString() for wstring).

If you want to do it manually, you could also take a look how SFML does it internally (https://github.com/SFML/SFML/blob/master/src/SFML/System/String.cpp#L150) for toAnsi.
What if my sf::String is widestring and I want to convert it to UTF-8?


Quote
string iterator + offset out of range
if I do not set the size of output string manually like that
You want std::back_inserter (http://en.cppreference.com/w/cpp/iterator/back_inserter)
That seems right, I will try to figure it out.
Title: Re: UTF-8 and stuff
Post by: Laurent on December 15, 2016, 08:28:42 pm
Quote
What if my sf::String is widestring and I want to convert it to UTF-8?
sf::String::toUtf8(). And you don't have to care about what is stored internally in sf::String (it's UTF-32, by the way).

You still have said nothing about your real problem. This conversation can last for days if you don't tell us what you really want to do.
Title: Re: UTF-8 and stuff
Post by: Danetta on December 15, 2016, 08:39:59 pm
I find sf::String::toUtf8 too hard to use because output is std::basic_string, while sf::Utf<8> allows me to get the result in almost a single line. I tried it and got what I need, but with some problems like having to resize declared strings manually so converted ones would fit into them.

Answering your question about what I am trying to do:
The client terminal displays strings in CP1251, the client GUI (labels and edit boxes) uses wide strings (it seems?), and the server processes database requests that take UTF-8 strings as arguments.

So, when I receive a wide string from the GUI, I need to handle all the encoding and decoding from client to server and back. (Or is it UTF-32? I am still not sure, but a wide string works well when I pass it as a parameter. I would rather not do that, but the TGUI documentation only says it's "sf::String". Also, UTF-8 doesn't display well with TGUI unless I convert it to a wide string... ugh.)


Edited.
Title: Re: UTF-8 and stuff
Post by: Laurent on December 15, 2016, 09:07:49 pm
Quote
I find sf::String::toUtf8 too hard to use because output is std::basic_string
?
std::string is just a typedef for std::basic_string<char>, so what you get from this function is basically a std::string with something other than char inside.
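
If a plain std::string is needed, one possible approach is to copy the bytes across with the iterator-range constructor. The sketch below is SFML-free so that it stands alone: bytesToString is a hypothetical helper name, and any container of byte-sized elements plays the role of the string returned by toUtf8().

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies any sequence of byte-sized elements into a
// std::string, converting each element to char along the way.
template <typename Bytes>
std::string bytesToString(const Bytes& bytes)
{
    return std::string(bytes.begin(), bytes.end());
}
```

Assuming an sf::String named text, the call would presumably look like bytesToString(text.toUtf8()); that line is not compiled or tested here.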

Quote
I tried it and got what I need, but with some problems like having to resize declared strings manually so converted ones would fit into them.
As already said, std::back_inserter is the solution. Look at sf::Utf source code for examples.
Title: Re: UTF-8 and stuff
Post by: Danetta on December 16, 2016, 03:20:59 am
std::string String::toAnsiString(const std::locale& locale) const
{
    // Prepare the output string
    std::string output;
    output.reserve(m_string.length() + 1);

    // Convert
    Utf32::toAnsi(m_string.begin(), m_string.end(), std::back_inserter(output), 0, locale);

    return output;
}

Why would you use string::reserve in this case? Isn't it unnecessary when you use std::back_inserter?
Title: UTF-8 and stuff
Post by: eXpl0it3r on December 16, 2016, 03:28:02 am
I'm not certain that this is the primary reason, but by reserving the needed memory up front, you get a single memory allocation, as opposed to the string incrementally allocating more and more space.
Title: Re: UTF-8 and stuff
Post by: Laurent on December 16, 2016, 08:40:22 am
Yes, it's just an optimization to avoid many small memory re-allocations. You can remove this line, the function will still work as expected.
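
A rough way to see the effect, using only the standard library: countReallocations is a hypothetical helper that appends through std::back_inserter and counts how often the string's capacity changes. Exact growth behaviour depends on the standard library implementation, so only "zero" versus "more than zero" is meaningful here.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: copies `input` through std::back_inserter and counts
// how many times the output's capacity changed (i.e. the buffer was regrown).
std::size_t countReallocations(const std::string& input, bool reserveFirst)
{
    std::string output;
    if (reserveFirst)
        output.reserve(input.size()); // one allocation up front

    std::size_t reallocations = 0;
    std::size_t capacity = output.capacity();
    auto out = std::back_inserter(output);
    for (char c : input)
    {
        *out++ = c; // calls output.push_back(c)
        if (output.capacity() != capacity)
        {
            capacity = output.capacity();
            ++reallocations;
        }
    }
    return reallocations;
}
```

With reserveFirst set to true the count stays at zero for any input; without it, a long input forces the string to grow several times, yet the result is the same either way.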