Topic: Utf-8 coding problem

Rastislav Kiss

  • Newbie
  • *
  • Posts: 2
Utf-8 coding problem
« on: February 02, 2018, 08:13:44 pm »
Hi all,
I am Rastislav Kiss, new to this forum. I would like to ask a question about a problem I have encountered in SFML.
I am trying to create a simple app that shows a window and captures all pressed keys as characters, to collect user input.
The window works fine, but the text input part is a problem. My event-handling code looks as follows:

      while (window.pollEvent(event))
      {
          if (event.type == sf::Event::TextEntered)
          {
              // event.text.unicode holds one UTF-32 code point
              textbuffer += event.text.unicode;
          }
      }

where textbuffer is an sf::String. I also have a method that retrieves the currently typed text:

std::string get_characters()
{
    // Converts using the current global locale, since no locale is passed
    std::string output = textbuffer.toAnsiString();
    textbuffer.clear();
    return output;
}

It works until I type some special characters of the Slovak language. After some tests I found out that the key is caught and represented properly, but the conversion to ANSI fails. Characters like š, č or ť are not present in the resulting string, although ý, for example, is represented correctly.
Is it a bug in SFML, or am I just doing something wrong? If so, can you please tell me how to do it correctly?

Thank you in advance.

Best regards

Rastislav

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 10846
Re: Utf-8 coding problem
« Reply #1 on: February 02, 2018, 08:47:41 pm »
I'm not quite sure why you're surprised. You're calling toAnsiString, which converts the UTF-32 to an ANSI string and ANSI simply doesn't contain the characters you're looking for. ;)
See here a table of all the characters contained in ANSI: http://ascii-table.com/ansi-codes.php

The real question is: why do you convert to an ANSI string?
Official FAQ: https://www.sfml-dev.org/faq.php
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32504
Re: Utf-8 coding problem
« Reply #2 on: February 02, 2018, 10:50:27 pm »
"ANSI string" doesn't really mean a thing. It's just a short way to say "locale-dependent string". So, as the documentation says, toAnsiString() converts to the encoding defined by the locale given as argument, which defaults to the current global locale if the argument is omitted.

How do you print the converted std::string?
Laurent Gomila - SFML developer
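
To illustrate what "locale-dependent" means in the reply above, here is a minimal standard-library sketch of narrowing a UTF-32 string through a std::locale. The function name narrow_with_locale and the '?' fallback are hypothetical, and whether SFML uses exactly this facet internally is an assumption; the point is only that the result depends on the locale passed in.

```cpp
#include <locale>
#include <string>

// Narrow a UTF-32 string into a locale-dependent std::string.
// Characters the locale's encoding cannot represent become `fallback`.
// Caveat: wchar_t is only 16 bits on Windows, so this sketch is limited
// to code points that fit in a wchar_t there.
std::string narrow_with_locale(const std::u32string& text,
                               const std::locale& loc,
                               char fallback = '?')
{
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t>>(loc);

    std::string out;
    out.reserve(text.size());
    for (char32_t cp : text)
        out += facet.narrow(static_cast<wchar_t>(cp), fallback);
    return out;
}
```

With the classic "C" locale (std::locale::classic()), š cannot be represented and is replaced by the fallback. A locale with a Central European encoding could keep it, but locale names are platform-specific and the locale has to be installed on the system.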

Rastislav Kiss

  • Newbie
  • *
  • Posts: 2
Re: Utf-8 coding problem
« Reply #3 on: February 03, 2018, 08:28:03 pm »
Hi,
thanks for the quick answer. :)
In the Slovak code page, the š character, for example, has code 154. It is a locale-dependent thing; in an English code page, for example, 154 is some other character.
I thought that the toAnsiString() method would extract from the UTF-32 string all characters supported by the local ANSI table and insert them into an std::string.
For example, if the sf::String contains character 353 (š in UTF-32), the result of toAnsiString() would be an std::string containing character 154.
Or is that not what toAnsiString() does? I need a conversion like this to work with the non-UTF modules of the program, like a TTS synthesizer, but also for saving to and loading from files used by other programs, like g++ for example.

Thank you in advance.

Best regards

Rastislav
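
The mapping described in this post (353 in UTF-32 becoming 154 in the single-byte string) can be sketched with an explicit lookup table, which avoids depending on the global locale at all. This is only an illustration: the function name to_cp1250_sketch is hypothetical, the target is assumed to be Windows-1250, and the table covers just the four characters mentioned in the thread. A real program would need the complete table or a conversion library.

```cpp
#include <string>
#include <unordered_map>

// Partial, hand-written UTF-32 -> Windows-1250 conversion (illustration only).
// Anything outside ASCII and the small table below becomes `fallback`.
std::string to_cp1250_sketch(const std::u32string& text, char fallback = '?')
{
    static const std::unordered_map<char32_t, unsigned char> table = {
        {0x0161, 0x9A}, // š (353 -> 154)
        {0x010D, 0xE8}, // č
        {0x0165, 0x9D}, // ť
        {0x00FD, 0xFD}, // ý
    };

    std::string out;
    out.reserve(text.size());
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out += static_cast<char>(cp); // ASCII maps to itself
        } else {
            auto it = table.find(cp);
            out += (it != table.end()) ? static_cast<char>(it->second)
                                       : fallback;
        }
    }
    return out;
}
```

Calling to_cp1250_sketch on a string holding only š yields a one-byte string whose byte value is 154, which is the behaviour the post expects.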

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32504
Re: Utf-8 coding problem
« Reply #4 on: February 04, 2018, 11:17:38 am »
You just need to use the correct locale, i.e. the one that your other libraries expect. It might be UTF-8, or something specific to your language, or really anything else. Relying on implicit defaults, especially without carefully reading the documentation, doesn't look like a reliable way to do things. Your current global locale might be the "C" locale, who knows... In the end, there's very little chance that the locale you use and the locale that is expected happen to match.
« Last Edit: February 04, 2018, 11:31:09 am by Laurent »
Laurent Gomila - SFML developer
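
If the other modules turn out to expect UTF-8, as the last reply suggests they might, the conversion does not involve a locale at all. Below is a minimal hand-written UTF-32 to UTF-8 encoder using only the standard library; the name utf32_to_utf8 is hypothetical. SFML 2.x's sf::String also offers toUtf8(), as far as I recall, which would be the more natural choice inside an SFML program; check it against your SFML version.

```cpp
#include <string>

// Encode UTF-32 code points as UTF-8 bytes.
// Invalid code points (surrogates, values above U+10FFFF) are replaced
// with U+FFFD, the Unicode replacement character.
std::string utf32_to_utf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;

        if (cp < 0x80) {                      // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {            // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                              // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Here š (U+0161) becomes the two bytes 0xC5 0xA1, so none of the Slovak characters are lost, but the receiving side has to interpret the bytes as UTF-8 for this to be useful.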