Author Topic: Unicode limited to Plane 0 (U+0000 to U+FFFF) in (C)SFML?  (Read 345 times)

kawe

  • Newbie
  • *
  • Posts: 4
Hi,

I started with CSFML a couple of weeks ago.
I am *not* using SFML but only the C binding, so questions and reports here may or may not also apply to SFML itself.
I just don't know, :-).

When I came to Unicode strings, I first had to clear some hurdles, as my program(s) should run in both a Windows and a Linux environment, and handling locales and wide characters is not always fun then.
In particular, I miss the UTF_X_toUTF_Y conversion functions in the C binding.
You might consider adding them in a future release ...
But of course, this can also be done manually.
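
For readers wondering what "done manually" could look like, here is a minimal sketch of a UTF-8 to UTF-32 decoder in plain C. The function name utf8ToUtf32 is hypothetical and not part of CSFML. It rejects surrogates and out-of-range values but, to stay short, it does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a NUL-terminated UTF-8 string into UTF-32 code points.
 * Writes at most outCap - 1 code points plus a terminating 0 into out.
 * Returns the number of code points written (excluding the terminator),
 * or (size_t)-1 on malformed input.
 * Hypothetical helper for illustration, not part of CSFML.
 */
size_t utf8ToUtf32(const char *in, uint32_t *out, size_t outCap)
{
        const unsigned char *s = (const unsigned char *)in;
        size_t n = 0;

        if (outCap == 0)
                return (size_t)-1;

        while (*s && n + 1 < outCap)
        {
                uint32_t cp;
                int extra;

                /* The lead byte tells how many continuation bytes follow */
                if (*s < 0x80)                { cp = *s;        extra = 0; }
                else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
                else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
                else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
                else
                        return (size_t)-1;      /* invalid lead byte */
                s++;

                for (; extra > 0; extra--, s++)
                {
                        if ((*s & 0xC0) != 0x80)
                                return (size_t)-1;      /* truncated sequence */
                        cp = (cp << 6) | (*s & 0x3F);
                }

                /* Reject surrogates and values beyond the Unicode range */
                if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
                        return (size_t)-1;

                out[n++] = cp;
        }
        out[n] = 0;
        return n;
}
```

The resulting zero-terminated array is the kind of buffer sfText_setUnicodeString() expects, assuming sfUint32 and uint32_t are the same type on your platform (otherwise a cast or copy is needed).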

But when I wanted to assign a Unicode string to an sfText ('sfText_setUnicodeString()'), I found that CSFML doesn't seem to accept Unicode characters in the higher planes beyond U+FFFF.

So I decided to fetch the glyph from the font directly (see example below) - same result.

A good font for testing is the Unicode font 'unifont' from https://www.unifoundry.com/unifont/index.html, which contains an image for (nearly) every single code point.

My 'test character' is U+1D11E ('Musical Symbol G Clef') but you'll find that no unicode character at all will work in (C)SFML beyond 'U+FFFF'.

Is this a bug or by design? But if it's by design - why use 4 byte long sfUint32 integers to handle them ...?

 

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <SFML/Graphics.h>


sfRenderWindow* window;

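/* Input thread: reads hexadecimal code points from stdin while the window is open */
void selectGlyph(void *data)
{
        sfUint32 *codePoint = data;
        char buffer[8];

        while(sfRenderWindow_isOpen(window))
        {
                printf("Enter hexadecimal code point: ");
                fflush(stdout);
                do
                {
                        /* Stop the thread if stdin is closed or fails */
                        if (!fgets(buffer, sizeof buffer, stdin))
                                return;
                        /* A full buffer without newline means the line is longer: keep reading */
                } while (strcspn(buffer, "\n") == sizeof buffer - 1);
                *codePoint = strtoul(buffer, NULL, 16);
        }
}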

int main(void)
{
        sfVideoMode mode = {160, 160, 32};
        sfTexture* texture = NULL;
        sfSprite* sprite = NULL;
        sfEvent event;
        sfColor bColor = sfRed;

        sfUint32 codePoint = 0x0000;
        sfUint32 oldCodePt = 0xFFFF;
        sfThread *thread = sfThread_create(selectGlyph, &codePoint);
        if (!thread)
                return EXIT_FAILURE;

        /* Create the main window */
        window = sfRenderWindow_create(mode, "CSFML Unicode Test", sfDefaultStyle, NULL);
        if (!window)
                return EXIT_FAILURE;
        sfRenderWindow_setVerticalSyncEnabled(window, sfTrue);
        sfRenderWindow_setPosition(window, (sfVector2i){50, 50});
        /* Load the font to take the glyphs from */
        sfFont *font = sfFont_createFromFile("unifont-14.0.04.ttf");
        if (!font)
                return EXIT_FAILURE;
        unsigned int size = 160;

        /* Start the input thread only after the window exists */
        sfThread_launch(thread);

        /* Start the game loop */
        while (sfRenderWindow_isOpen(window))
        {
                /* Process events */
                while (sfRenderWindow_pollEvent(window, &event))
                {
                        /* Close window : exit */
                        if (event.type == sfEvtClosed)
                                sfRenderWindow_close(window);
                }

                /* Read the shared value once (no locking; good enough for a test) */
                sfUint32 requested = codePoint;
                if(requested != oldCodePt)
                {
                        /* Rasterize the glyph into the font's texture atlas */
                        sfGlyph glyph = sfFont_getGlyph(font, requested, size, sfFalse, 0);
                        const sfTexture *bitmap = sfFont_getTexture(font, size);
                        if (!bitmap)
                                return EXIT_FAILURE;
                        if(glyph.bounds.width > 0 && glyph.bounds.height > 0)
                        {
                                /* Copy the glyph's rectangle out of the atlas */
                                sfImage *chr = sfTexture_copyToImage(bitmap);
                                sfImage *image = sfImage_create(glyph.bounds.width, glyph.bounds.height);
                                if (!chr || !image)
                                        return EXIT_FAILURE;
                                sfImage_copyImage(image, chr, 0, 0, glyph.textureRect, sfFalse);

                                /* Release the previous glyph before creating the new one */
                                if (sprite)
                                        sfSprite_destroy(sprite);
                                if (texture)
                                        sfTexture_destroy(texture);
                                sprite = NULL;
                                texture = sfTexture_createFromImage(image, NULL);
                                sfImage_destroy(image);
                                sfImage_destroy(chr);
                                if (!texture)
                                        return EXIT_FAILURE;
                                sprite = sfSprite_create();
                                if (!sprite)
                                        return EXIT_FAILURE;
                                sfSprite_setColor(sprite, sfBlue);
                                sfSprite_setTexture(sprite, texture, sfFalse);
                                oldCodePt = requested;
                                bColor = requested ? sfYellow : sfRed;
                        }
                        else
                                codePoint = 0;  /* no glyph: fall back to code point 0 */
                }
                sfRenderWindow_clear(window, bColor);
                /* Draw the sprite (if a glyph has been loaded yet) */
                if (sprite)
                        sfRenderWindow_drawSprite(window, sprite, NULL);
                /* Update the window */
                sfRenderWindow_display(window);
        }
        /* Cleanup resources */
        /* The input thread may still be blocked in fgets(), so stop it first */
        sfThread_terminate(thread);
        sfThread_destroy(thread);
        if (sprite)
                sfSprite_destroy(sprite);
        if (texture)
                sfTexture_destroy(texture);
        sfFont_destroy(font);
        sfRenderWindow_destroy(window);
        return EXIT_SUCCESS;
}
« Last Edit: July 12, 2022, 12:35:20 pm by kawe »

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 10234
Re: Unicode limited to Plane 0 (U+0000 to U+FFFF) in (C)SFML?
« Reply #1 on: July 12, 2022, 12:50:17 pm »
sfFont_getGlyph and similar functions take, as you said, a Uint32, meaning four bytes, thus you can't even pass anything beyond 0xFFFF (without overflow).

It seems the underlying freetype code takes an unsigned long, which can be 8 bytes long.
Not sure if this needs to be adjusted.

kawe

  • Newbie
  • *
  • Posts: 4
Re: Unicode limited to Plane 0 (U+0000 to U+FFFF) in (C)SFML?
« Reply #2 on: July 12, 2022, 01:48:04 pm »
... not sure what you mean. A single unsigned byte ('char') covers 0x00 ... 0xFF (8 bits), so an unsigned 4-byte value covers 0x00000000 ... 0xFFFFFFFF (32 bits). The Unicode standard (unlike the obsolete UCS-2 standard, see also https://en.wikipedia.org/wiki/Unicode) needs more than 2 bytes to represent all code points, but not more than 21 bits (less than 3 bytes!).

So a Uint*16* will overflow if it is used unencoded for the higher planes (that's an issue on Windows, which uses a 2-byte wchar_t and consequently UTF-16 encoding for anything beyond U+FFFF), but a Uint32 is more than big enough to hold all the code points of all planes in the current Unicode standard (0x000000 ... 0x10FFFF) without any encoding ...
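
To make the 16-bit versus 32-bit point concrete, here is a small sketch of how a code point above U+FFFF has to be split into a UTF-16 surrogate pair, while it fits into a single 32-bit value as-is. The function name encodeUtf16 is hypothetical and only for illustration; nothing like it exists in CSFML.

```c
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit units written to out (1 or 2),
 * or 0 if cp is a surrogate or lies beyond U+10FFFF.
 * Hypothetical helper for illustration only.
 */
int encodeUtf16(uint32_t cp, uint16_t out[2])
{
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
                return 0;
        if (cp < 0x10000)
        {
                /* Plane 0 fits into a single 16-bit unit */
                out[0] = (uint16_t)cp;
                return 1;
        }
        /* Higher planes: 20 bits split into a high and a low surrogate */
        cp -= 0x10000;
        out[0] = (uint16_t)(0xD800 | (cp >> 10));
        out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        return 2;
}
```

For the test character U+1D11E this yields the pair 0xD834 0xDD1E, whereas a Uint32 simply holds 0x0001D11E.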
« Last Edit: July 12, 2022, 01:49:51 pm by kawe »

kojack

  • Full Member
  • ***
  • Posts: 172
Re: Unicode limited to Plane 0 (U+0000 to U+FFFF) in (C)SFML?
« Reply #3 on: July 12, 2022, 02:21:55 pm »
A true type font can only hold 65535 glyphs. The number of glyphs field is 16 bit.

unifont-14.0.04.ttf is only the lower 0-FFFF glyphs. It sounds like (from the unifont page descriptions) you need unifont_upper-14.0.04.ttf to get the extra ones above that.
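
As a rough sketch of how a program could cope with this split, one could pick the font file by plane before loading the glyph. The file names are the ones mentioned in this thread. The assumption that everything above U+FFFF lives in the "upper" file is a simplification, and fontFileForCodePoint is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Choose a Unifont file for a code point.
 * Assumption: plane 0 is in the base file and everything above U+FFFF
 * is in the "upper" file; check the Unifont documentation for the
 * exact coverage. Returns NULL for values beyond U+10FFFF.
 */
const char *fontFileForCodePoint(uint32_t cp)
{
        if (cp > 0x10FFFF)
                return NULL;
        return (cp <= 0xFFFF) ? "unifont-14.0.04.ttf"
                              : "unifont_upper-14.0.04.ttf";
}
```

The returned name could then be passed to sfFont_createFromFile(), with one sfFont kept per file so that each font is loaded only once.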

kawe

  • Newbie
  • *
  • Posts: 4
Re: Unicode limited to Plane 0 (U+0000 to U+FFFF) in (C)SFML?
« Reply #4 on: July 12, 2022, 02:47:36 pm »
That's it, :-). I didn't know about the TrueType glyph count limit yet ...
And yes, using unifont_upper-xxx.ttf gives the glyphs of plane 1, :-).

Solved!

Thank you ...

« Last Edit: July 12, 2022, 02:56:39 pm by kawe »
