SFML community forums
Help => System => Topic started by: deadc0der on October 29, 2015, 12:17:57 am
-
Hi, I think I found a minor bug. When using sf::Utf16 to encode the Unicode character U+FFFF, it is incorrectly encoded as a surrogate pair, whereas it should be encoded as a single code unit. For confirmation, RFC 2781 (https://www.ietf.org/rfc/rfc2781.txt), which defines the UTF-16 encoding, states in section 2.1:
If U < 0x10000, encode U as a 16-bit unsigned integer and terminate.
U+FFFF (0xFFFF) fulfills this requirement, so it should be encoded as a single code unit. However, include/SFML/System/Utf.inl at line 325 (https://github.com/SFML/SFML/blob/master/include/SFML/System/Utf.inl#L325) encodes a code point into a single code unit only if it is strictly less than 0xFFFF, whereas the comparison should be less than or equal to 0xFFFF in order to account for the edge case that is U+FFFF.
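To illustrate the boundary described above, here is a minimal sketch of a UTF-16 encoder that follows the rule from RFC 2781 section 2.1. The function name `encodeUtf16` and the choice to return an empty vector for invalid input are assumptions made for this example; this is not SFML's actual code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not SFML's implementation).
// Returns the UTF-16 code units for one code point, or an empty vector
// if the input is not a valid Unicode scalar value (a choice of this sketch).
std::vector<std::uint16_t> encodeUtf16(std::uint32_t codepoint)
{
    std::vector<std::uint16_t> units;

    if (codepoint <= 0xFFFF) // note "<=": U+FFFF still fits in one code unit
    {
        // The surrogate range itself cannot be encoded as a character.
        if (codepoint < 0xD800 || codepoint > 0xDFFF)
            units.push_back(static_cast<std::uint16_t>(codepoint));
    }
    else if (codepoint <= 0x10FFFF)
    {
        // Surrogate pairs cover U+10000 through U+10FFFF.
        const std::uint32_t v = codepoint - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)));
    }

    return units;
}
```

With this sketch, `encodeUtf16(0xFFFF)` yields the single code unit 0xFFFF, while `encodeUtf16(0x10000)` yields the pair 0xD800 0xDC00.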
Now, since U+FFFF is barely ever used by applications, this is not going to end the world, but I thought it might be good to report it nonetheless.
NOTE: This is my first post on this forum, so I apologize if this is not the right place to post this.
-
Thanks for reporting it.
The mistake is even more obvious if we look at what happens when we encode the value as a surrogate pair: we start by subtracting 0x10000 from it. So it's clear that surrogate pairs should start at 0x10000 and not 0xFFFF.
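The arithmetic can be checked with a small sketch. It assumes the surrogate-pair formula is applied with unsigned 32-bit arithmetic and without any range check; the helper name `toSurrogatePair` is made up for this example.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: applies the surrogate-pair formula unconditionally,
// so it only produces a meaningful result for code points >= 0x10000.
std::pair<std::uint16_t, std::uint16_t> toSurrogatePair(std::uint32_t codepoint)
{
    // For codepoint < 0x10000 this unsigned subtraction wraps around.
    const std::uint32_t v = codepoint - 0x10000;

    const std::uint16_t high = static_cast<std::uint16_t>(0xD800 + (v >> 10));
    const std::uint16_t low  = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF));
    return {high, low};
}
```

For U+10000 the subtraction gives 0 and the result is 0xD800 0xDC00, the first valid pair. For U+FFFF the subtraction wraps to 0xFFFFFFFF and the result is 0xD7FF 0xDFFF, which is not a valid surrogate pair because 0xD7FF lies outside the high-surrogate range.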
Should we create a PR for this '=' to add? ;D
-
Should we create a PR for this '=' to add? ;D
I also miss the time when such things could be fixed directly, without bureaucracy and delay ;)
I don't see any point in a lengthy review of this, either.
-
I also miss the time when such things could be fixed directly, without bureaucracy and delay ;)
There's not really much bureaucracy for small changes, and when there is, it's rarely about getting something into master and more about styling, API or other discussions.
And the delay is not a real problem. Whether something gets into master today or within a week or so mostly doesn't matter.
Also, there's no more work to be done than if you pushed directly to master: update master, switch branch, apply changes, commit, push to branch, create PR.
-
That "bureaucracy" helps maintain a consistent workflow and minimizes errors. Whether it's a small or big change, errors can happen anywhere. Let's just avoid those "exceptions", otherwise we would have to decide where the line between a small/trivial change and a big one lies.
-
Should we create a PR for this '=' to add? ;D
That's probably overkill, a single commit push would do :)
-
https://github.com/SFML/SFML/pull/997
Actually it doesn't take much more time than a direct commit to master: for a simple task like that you can create a branch/commit/PR directly in your browser in less than 2 minutes. ;)
@deadcOder: `commit` was the right word :P