
Author Topic: PhysicsFS vs an SFML virutal filesystem  (Read 12381 times)


FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
PhysicsFS vs an SFML virutal filesystem
« on: January 17, 2019, 10:03:58 pm »
Notice: This topic has been split off from: https://en.sfml-dev.org/forums/index.php?topic=24998.msg166661#msg166661

Fair enough to everything I guess except for PhysicsFS.

How is it a complex library (to use) at all? It's like a handful of functions (literally like 10 or 15 total for common things) if you don't count all 'write/read big/little endian 8/16/32/64 int/uint' ones and don't want to make own archive or IO type or something.
« Last Edit: January 18, 2019, 08:52:45 am by eXpl0it3r »

DeathRay2K

  • Newbie
  • *
  • Posts: 24
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #1 on: January 17, 2019, 10:15:03 pm »
Granted, it's not the most complex library, but it is complex to use in a C++ codebase by virtue of being a C library with its own, non-standard data structures and conventions. It forces you at the least to create a layer of abstraction to handle that.

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #2 on: January 17, 2019, 10:26:09 pm »
What non-standard data structures and conventions (other than having to free something yourself) are there in common use cases? :o

Edit: this is a 100% serious question. I really just counted and there are 0 data structures and a total of like 11 or 12 functions for common use, all of them trivial.
« Last Edit: January 17, 2019, 10:31:48 pm by FRex »

DeathRay2K

  • Newbie
  • *
  • Posts: 24
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #3 on: January 17, 2019, 10:46:28 pm »
I feel like we're really straying from the topic at this point, but most (all?) PhysicsFS functions return their own types, which are typedefed or custom structs, C strings, enums, void*s, etc, that all have better, standardised analogues in C++. Some of that won't matter in many cases, but enough of it matters enough of the time to make interoperating with C++ container types, or C++17 filesystem functions a pain.
I would suggest anyone unfamiliar with PhysicsFS just check the header (https://icculus.org/physfs/docs/html/physfs_8h_source.html) to get a better idea of this.

This is what the docs list for data structures, by the way: https://icculus.org/physfs/docs/html/annotated.html
I would also say that endian-specific functions are worth including in the discussion: they are necessary for cross-platform data handling, and not at all intuitive in PhysicsFS. SFML is a cross platform library, and could do a much better job with this.
« Last Edit: January 17, 2019, 10:50:21 pm by DeathRay2K »

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #4 on: January 17, 2019, 11:22:11 pm »
Well obviously a library has its own enums for its specific errors. :o

A C string you can just assign to an std::string and then c_str it back to the library and either way it's simple and safe (you'll get a compile error if you forget c_str).

A char *, unsigned char * or void * is how every library handling memory reading/writing does it (SFML streams and C++ iostream included).

The typedefs are same as in SFML and all straightforward since there is only one of each of: 8, 16, 32 and 64 bit type of int and uint. It's a typedef, if you use your own typedef to type of same size it'll work too.

Out of all these structs the only remotely commonly used is maybe PHYSFS_Stat and (if for some reason you want to know what version and formats you have) PHYSFS_ArchiveInfo and PHYSFS_Version. All of them are pure structs you only ever read fields of. They couldn't be simpler. This is like complaining you need to use sf::Vector2u to get sf::Image size and that's confusing because it's a custom struct and you wanted an std::pair or something to 'interoperate' better.

PHYSFS_File you don't "use" in any way, you only get a pointer to it and pass it to read, write and close.

PHYSFS_Io, PHYSFS_Archiver and PHYSFS_Allocator are all very specialized things (own IO, own archive format and own memory alloc). You don't need to use them if all you want to do is read zips, 7zips, isos and normal files. The documentation even has a 'WARNING' telling you not to use them unless you have a very special need and know what you're doing.

The only other 'structured' thing is a file list but that's also trivial to iterate over in a for. PhysFS also doesn't ever deal (except for mounts) with OS specific paths or file permissions, that's its whole point. You're not supposed to use C++17 filesystem path for its files, most of them aren't even files on disk but in archives.

You're the one here saying we 'could use' a virtual filesystem in SFML while claiming it's complex to use PhysFS's 10-15 most common functions like 'open', 'close' and 'read bytes' or that there are too many (1? 2?) custom structs or that a void * pointer to a memory buffer is non-standard. :o

If SFML had a virtual filesystem it'd look exactly or almost exactly the same as PhysFS, with minor changes like RAII instead of user needing to do close/free, methods instead of a handle + functions that take it (literally same thing with file.func(...) instead of func(file, ...)) and std::string instead of const char * (saving you one c_str when passing a path to it).
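
A minimal sketch of the RAII point, assuming a made-up C-style handle API. DemoFile, demoOpen, demoClose and openDemoFile are hypothetical stand-ins for illustration, not PhysicsFS or SFML functions:

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a C-style handle API (not the real PhysicsFS API).
struct DemoFile { std::string path; };
int g_openDemoFiles = 0; // counts handles that are currently open

DemoFile* demoOpen(const char* path) {
    ++g_openDemoFiles;
    return new DemoFile{path};
}

void demoClose(DemoFile* f) {
    --g_openDemoFiles;
    delete f;
}

// RAII: the deleter calls the C-style close function automatically.
struct DemoCloser {
    void operator()(DemoFile* f) const { demoClose(f); }
};
using DemoFilePtr = std::unique_ptr<DemoFile, DemoCloser>;

// Takes a std::string so the caller no longer writes c_str() themselves.
DemoFilePtr openDemoFile(const std::string& path) {
    return DemoFilePtr(demoOpen(path.c_str()));
}
```

With something like this the handle is closed when the DemoFilePtr goes out of scope, which is roughly all the "C++ version" adds over the plain open/close pair.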

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #5 on: January 17, 2019, 11:28:45 pm »
Quote
I would also say that endian-specific functions are worth including in the discussion: they are necessary for cross-platform data handling, and not at all intuitive in PhysicsFS. SFML is a cross platform library, and could do a much better job with this.

I said not to count them because there's so many of them and they are all the same: returning true if it works, taking a file handle and pointer to right int type (so there is no chance you can put it in wrong type like you could with doing a = to a return value).

I seriously feel like you're trolling (or very wrong about something) at this point but okay. What does your good intuitive function for reading an unsigned 16 bit big endian value from a file look like? Or swapping from little to native endian on a 32 bit int. These are the 'unintuitive' versions from PhysFS for comparison:
int PHYSFS_readUBE16 ( PHYSFS_File * file, PHYSFS_uint16 * val);
PHYSFS_sint32 PHYSFS_swapSLE32(PHYSFS_sint32 val);
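
For reference, a rough standalone sketch of what functions with this shape do. It works on a plain byte buffer instead of a PHYSFS_File, and readU16BE and swapBytes32 are hypothetical names for illustration, not the library's implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Read an unsigned 16-bit big-endian value from a raw byte buffer.
// Returns false (and leaves *out untouched) if fewer than 2 bytes are available.
bool readU16BE(const unsigned char* data, std::size_t size, std::uint16_t* out) {
    if (data == nullptr || out == nullptr || size < 2) return false;
    *out = static_cast<std::uint16_t>((data[0] << 8) | data[1]);
    return true;
}

// Reverse the byte order of a 32-bit value (what a swap does when the
// stored order differs from the native one).
std::uint32_t swapBytes32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

The out-parameter plus success flag is the same pattern as the signatures above: the caller has to provide a variable of exactly the right width.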

Feel free to include any other example of your better idea because right now I really don't get it - we need an entire VFS in SFML so you can no longer write c_str or no longer call close on a file once you're done? If SFML had a VFS it'd have a Stat struct too, probably, and an ArchiveInfo one, all with same fields as PHYSFS has, just const char * would be std::string, and it'd still be same types in all the places, and all custom error/filetype enums, and void * for passing a buffer.
« Last Edit: January 17, 2019, 11:59:06 pm by FRex »

DeathRay2K

  • Newbie
  • *
  • Posts: 24
    • View Profile
    • Email
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #6 on: January 18, 2019, 12:05:06 am »
I'm really not trolling, I am just saying that PhysicsFS is a C library with all the differences in implementation there are between C libraries and object-oriented C++ libraries.
I would prefer having a file class that provides the requisite functions. Perhaps preset to read/write a particular endianness. Like:
class VirtFile
{
public:
    uint16_t ReadUInt16();
};
 
So using it might look something like
auto file = filesystem.GetFile("file/path/name.ext");
auto firstNum = file.ReadUInt16();
 

Endianness could be set for the filesystem and returned files using configuration options for the virtual filesystem, or better yet template parameters.

And while you bring up sf::Vector2u, there really are quite a few cases where the SFML types can and ought to be replaced with standards. Those have been discussed already elsewhere, but I'd say those are a pain point too. Not where they actually provide useful functionality, but where they simply duplicate the standards, or are actually less useful. But that's mostly down to C++ versions.
« Last Edit: January 18, 2019, 12:09:35 am by DeathRay2K »

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
    • View Profile
    • Email
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #7 on: January 18, 2019, 12:40:02 am »
Quote
I'm really not trolling, I am just saying that PhysicsFS is a C library with all the differences in implementation there are between C libraries and object-oriented C++ libraries.
I would prefer having a file class that provides the requisite functions.
The way PhysFS does File handle is literally OO with private members, virtual functions, etc., just done in C 'by hand' since C doesn't have OO features. The only difference is you go method(file, args) instead of file.method(args). It's a bit more work to know/remember and it works way worse with IntelliSense but PhysFS has a laughably small number of functions for file: close, file length, set internal cache size, read/write bytes, read/write object (deprecated). Plus tons of read/write different endianness of different ints.

Of other functions there's only a handful too like mount, set write dir, get user dir, open read/write file, etc. A C++ rewrite wouldn't affect these ones at all other than letting you not do .c_str on your std::string before passing a path to them. And PhysicsFS makes an effort to behave well, e.g. the string given in get user dir isn't allocated for you so you can just do std::string userdir = etc. and there is no leak. The only slightly awkward thing is get list dir but that's such a tiny piece too and it's like 5 lines to make an std::vector<std::string> returning version.
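
A sketch of that vector-returning helper, assuming the listing arrives as a NULL-terminated array of C strings (which, as far as I know, is how PHYSFS_enumerateFiles returns it, with PHYSFS_freeList to release it afterwards). toStringVector is a hypothetical name, and the sketch deliberately leaves out the library calls so it stands alone:

```cpp
#include <string>
#include <vector>

// Copy a NULL-terminated array of C strings into a std::vector<std::string>.
// The caller still owns (and frees) the original list.
std::vector<std::string> toStringVector(const char* const* list) {
    std::vector<std::string> result;
    if (list == nullptr) return result;
    for (const char* const* it = list; *it != nullptr; ++it)
        result.emplace_back(*it);
    return result;
}
```

A real wrapper would call the enumerate function, pass the result through this, then free the list before returning the vector.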

auto file = filesystem.GetFile("file/path/name.ext");
You can use auto in your code already. And what is filesystem? You want a non-global filesystem? That's fair enough but just before you complained about 'custom non-standard data structures' like Version (literally 3 x 8-bit ints named major, minor and patch) or ArchiveInfo that is one int (used as a boolean) saying if it supports symlinks + 4 const char * fields: extension, description, author, url.

auto firstNum = file.ReadUInt16();
What happens when ReadUInt16 fails? Gotta check some other function wasThereError? Exception? They're not that common or widespread in C++ at all (I kinda dislike the trend by now but don't claim that all C++ uses exceptions so it's such a pain you don't have them in C because that's untrue), SFML doesn't use them right now either. Many huge libraries don't either - even Qt doesn't. This also has the problem that someone can do int = file.ReadUInt64() very easily by accident out of habit or something while PHYSFS's version forces you to get a proper sized type var somewhere at least once. This is also a trivial one or two liner to write yourself, even as a template. You can then return a bool + int pair, throw exception, return 0, as you prefer.
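
A sketch of the "write it yourself as a template" idea, assuming a C-style reader that fills an out-parameter and returns nonzero on success. DemoSource, demoReadU16, readPair and readOrThrow are all hypothetical names for illustration:

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a C-style reader: fills *out, returns nonzero on success.
struct DemoSource { std::uint16_t value; bool ok; };

int demoReadU16(DemoSource* src, std::uint16_t* out) {
    if (!src->ok) return 0;
    *out = src->value;
    return 1;
}

// Wrap any "int reader(Source*, T*)" function so it returns {success, value}.
template <typename T, typename Source>
std::pair<bool, T> readPair(int (*reader)(Source*, T*), Source* src) {
    T value{};
    const bool ok = reader(src, &value) != 0;
    return {ok, value};
}

// Same idea, but reporting failure with an exception instead.
template <typename T, typename Source>
T readOrThrow(int (*reader)(Source*, T*), Source* src) {
    T value{};
    if (reader(src, &value) == 0) throw std::runtime_error("read failed");
    return value;
}
```

The value type T is deduced from the reader's out-parameter, so the width still comes from the function you pick and not from whatever variable the result lands in.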

Quote
Endianness could be set for the filesystem and returned files using configuration options for the virtual filesystem, or better yet template parameters.
How do you set endianness for a 'filesystem'? Which bytes need swapping depends on how big of an int you're reading and it's not that uncommon for some format to have little endian ints but another to have big endian ones. And why would you bother with a template anywhere in here? Where even? You want Filesystem<Little> and Filesystem<Big> and then everything that uses a Filesystem must also be a template now or have two versions of all functions? You want to go from (basically) readWhateverSizeEndianness(file, &variable) to having to think/remember what kind of endianness of a filesystem you got the file object from and that's a simplification or more convenient to use? Just keep calling the same endianness of a function if you want to always have one endian.

Quote
And while you bring up sf::Vector2u, there really are quite a few cases where the SFML types can and ought to be replaced with standards. Those have been discussed already elsewhere, but I'd say those are a pain point too. Not where they actually provide useful functionality, but where they simply duplicate the standards, or are actually less useful. But that's mostly down to C++ versions.
What functionality is duplicated? Is there a standard C++ 2d and 3d point or rect class in C++17 or C++20 or do you really want to replace it with std::pair or std::tuple and then claim it's simpler to have free functions instead of the overloaded operators and do .first and .second or std::get<0> and std::get<1> than having .x and .y?
« Last Edit: January 18, 2019, 12:43:57 am by FRex »

DeathRay2K

  • Newbie
  • *
  • Posts: 24
    • View Profile
    • Email
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #8 on: January 18, 2019, 01:08:28 am »
Come on, it's not OO, it's normal procedural C functions like any other.
Filesystem is just what I called my imaginary virtual filesystem in a proper C++ implementation. In PhysicsFS terms, this would be the "search path" where all chosen real filesystem paths are searched when you try to access a file.

As for what to do when a function fails, I'd say you should throw an exception: if a file is missing a value you expect it's surely corrupt and you'll have to deal with that. Or as you said yourself you could easily return an int bool pair, or you could return an optional. Plenty of ways to deal with erroneous calls.
The nice thing about this is that if the programmer screws something up, or does something silly like:
int = file.ReadUInt64()
they'll get a warning or an error, letting them know very quickly that they need to fix it.

You'd set the endianness for the filesystem because as the developer you know what endianness your data files are. So when the filesystem opens a file for reading, it can automatically convert from that to native, or vice versa for writing.
For example:
VirtualFilesystem<VFS_BIG_ENDIAN> filesystem();
filesystem.AddDirectory("search/this/path");
auto file = filesystem.GetFile("filename.ext");
auto num = file.ReadUInt16();
 
I would suggest that the VirtualFile class be templated using the filesystem template, and appropriate conversion functions used based on the template. This makes all handling of endianness automatic from the perspective of the programmer using this API. Or, for someone who's reading files native to the computer, they might use VFS_NATIVE_ENDIAN or something of the sort to avoid any conversion.

As far as duplicated functionality between SFML and standards, that's all been well-discussed here and in related topics: https://en.sfml-dev.org/forums/index.php?topic=21571.0

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
    • View Profile
    • Email
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #9 on: January 18, 2019, 01:39:57 am »
Quote
Come on, it's not OO, it's normal procedural C functions like any other.
Most everyone calls this an OO design in C:
https://lwn.net/Articles/444910/
https://en.wikipedia.org/wiki/GObject

Inside, PhysFS does (in C, by hand) literally the same things C++ does: a vtable of pointers for methods handling different IO types and archive types, members per concrete type, all hidden behind a common interface from physfs.h, etc.
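
A stripped-down sketch of that pattern, written in the C-compatible subset of C++. DemoIo, MemoryState, memoryRead and demoRead are hypothetical names; this only illustrates the technique and is not how PhysFS lays out its own structs:

```cpp
#include <cstddef>
#include <cstring>

// Hand-rolled "virtual" interface: a function pointer plus opaque state.
struct DemoIo {
    std::size_t (*read)(DemoIo* self, void* buffer, std::size_t len);
    void* state; // private data of the concrete implementation
};

// One concrete "subclass": reads from a block of memory.
struct MemoryState {
    const unsigned char* data;
    std::size_t size;
    std::size_t pos;
};

std::size_t memoryRead(DemoIo* self, void* buffer, std::size_t len) {
    MemoryState* s = static_cast<MemoryState*>(self->state);
    const std::size_t left = s->size - s->pos;
    const std::size_t n = len < left ? len : left;
    std::memcpy(buffer, s->data + s->pos, n);
    s->pos += n;
    return n; // number of bytes actually copied
}

// Callers only see the common interface and dispatch through the pointer.
std::size_t demoRead(DemoIo* io, void* buffer, std::size_t len) {
    return io->read(io, buffer, len);
}
```

Swapping in a different read function and state (a zip entry, a disk file) changes the behaviour without the caller of demoRead noticing, which is what a virtual call does.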

Quote
As for what to do when a function fails, I'd say you should throw an exception: if a file is missing a value you expect it's surely corrupt and you'll have to deal with that. Or as you said yourself you could easily return an int bool pair, or you could return an optional. Plenty of ways to deal with erroneous calls.
Sure. But that's opinionated and mismatches what SFML (and many others) does now and what many people prefer (error codes, some kind of 'optional' template type). :P

Quote
You'd set the endianness for the filesystem because as the developer you know what endianness your data files are. So when the filesystem opens a file for reading, it can automatically convert from that to native, or vice versa for writing.
But you can have files of different endianness. PNG and other 'portable' types have big, many Windows-originating file types have little. There are even file formats that have some fields in their headers or something that say whether to read the ints in that file as big or little endian and you don't know which one it is until you parse the headers so your idea totally dies with such a file.
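
TIFF is one such format: the first two bytes are "II" (little endian) or "MM" (big endian), followed by the number 42 stored in that order. A small sketch of parsing a header like that, where the byte order is only known at run time. ByteOrder, decodeU16 and parseHeader are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>

enum class ByteOrder { Little, Big };

// Decode a 16-bit unsigned value from two bytes in the requested byte order.
std::uint16_t decodeU16(const unsigned char* p, ByteOrder order) {
    return order == ByteOrder::Big
        ? static_cast<std::uint16_t>((p[0] << 8) | p[1])
        : static_cast<std::uint16_t>((p[1] << 8) | p[0]);
}

// Parse a TIFF-style header: "II" = little endian, "MM" = big endian,
// followed by the number 42 stored in that byte order.
bool parseHeader(const unsigned char* data, std::size_t size, ByteOrder* order) {
    if (data == nullptr || order == nullptr || size < 4) return false;
    if (data[0] == 'I' && data[1] == 'I') *order = ByteOrder::Little;
    else if (data[0] == 'M' && data[1] == 'M') *order = ByteOrder::Big;
    else return false;
    return decodeU16(data + 2, *order) == 42;
}
```

Here the byte order is a value discovered while reading, so it cannot be a compile-time property of the file or filesystem type.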

VirtualFilesystem<VFS_BIG_ENDIAN> filesystem();
filesystem.AddDirectory("search/this/path");
auto file = filesystem.GetFile("filename.ext");
auto num = file.ReadUInt16();
You need to remove () from after filesystem (most vexing parse is the name of this error/compiler mistake).
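
A small sketch of what goes wrong with the empty parentheses, using a hypothetical DemoFilesystem type. Compilers may warn about the first declaration, which is the point:

```cpp
#include <type_traits>

// Hypothetical stand-in type, just to have something to declare.
struct DemoFilesystem {
    int mountCount = 0;
    void addDirectory() { ++mountCount; }
};

// Returns the number of mounts after adding one directory.
int mountOne() {
    // With empty parentheses this declares a function returning
    // DemoFilesystem, not an object of that type.
    DemoFilesystem declaredAsFunction();
    static_assert(std::is_function<decltype(declaredAsFunction)>::value,
                  "with (), the name refers to a function");

    DemoFilesystem fs; // an actual object; DemoFilesystem fs{}; also works
    fs.addDirectory();
    return fs.mountCount;
}
```

Calling a member on declaredAsFunction would fail to compile, which is the error the quoted snippet would hit on its second line.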

Anyway - this is crazy. Not only can you not read in another endian within one FS, you need to keep in mind (hundreds of lines away in another function) what kind of FS has made your file, to know what kind of read (big or little) you just did, all so you can write readUInt16 instead of readUInt16BE. You also can't write a function taking VirtualFilesystem easily (forcing everyone to use templates everywhere is not 'easily') because endianness is baked into the FS type for 0 real reason.

And that's more convenient than 10 plain C functions... And the only other 'improvement' over C here is that methods are obj.method and not method(obj) or that you can use std::string too without c_str.

It makes literally no sense to have endianness set on filesystem or even file, when it can be (and is) specified in each function with just a few letters like BE, LE, Big, Little. That way you don't have to look around anywhere to file or filesystem constructor or something to find out what endianness you're working with.

If you think you need these features then you can easily wrap Phys yourself to try, but I think you'll just get burned when you have two files with different endianness (big endian is the 'portable' or 'network' one so it gets used in some formats, but little one is native to Windows/Intel so it's in lots of Windows file formats too) or start forgetting what endianness your program is using while in the middle of some file parsing or waste more time on this than you'll ever save by not writing func(obj) anymore but obj.method().

Quote
As far as duplicated functionality between SFML and standards, that's all been well-discussed here and in related topics: https://en.sfml-dev.org/forums/index.php?topic=21571.0
Nowhere in there did anyone suggest removing (???) sf::Vector2. We/they are debating removing the 'shortcut' overloads in some classes (there is setPosition(x, y) and setPosition(vector)) which are there to not force the user to keep doing someFunction(sf::Vector2f(x, y)) but with modern C++ they can do someFunction({x, y}) instead which will do the right thing (make a vector) so some people want the shortcut that takes two singular arguments removed (I don't but it's not a big deal and I write sf::Vector2f(x, y) all the time for clarity in cases like these). No one is removing sf::Vector2f itself, there is no 'replacement' for it in C++ or standard libraries of any version.
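
A sketch of the three call styles being discussed, using hypothetical stand-ins (Vec2, DemoSprite) so it compiles without SFML:

```cpp
// Hypothetical stand-in for sf::Vector2f.
struct Vec2 {
    float x;
    float y;
    Vec2(float x_, float y_) : x(x_), y(y_) {}
};

// Hypothetical stand-in for a transformable object with both overloads.
struct DemoSprite {
    Vec2 position{0.f, 0.f};
    void setPosition(const Vec2& p) { position = p; }               // vector overload
    void setPosition(float x, float y) { setPosition(Vec2(x, y)); } // "shortcut" overload
};
```

With these, sprite.setPosition(1.f, 2.f), sprite.setPosition(Vec2(1.f, 2.f)) and sprite.setPosition({1.f, 2.f}) all end up in the vector overload. The braced form works because the list can only match the one-argument overload and Vec2's constructor is not explicit.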

And the word duplicated doesn't appear in this thread even once. :P

I'm seriously done with this now, I feel like I took troll bait for even talking to someone who wants to force all files in an application to have the same endianness to not add BE or LE to a readUint16 function name, finds it hard to use (?) struct Version {int major; int minor; int patch;} or says that C style strings (that are not your responsibility to allocate or free) are hard to use along with std::string or that typedefs (that you can ignore and use std::intXX_t in your code) or a void * pointer for a memory buffer are weird or complex to use from C++. And to 'fix' these 'problems' we need a VFS in SFML that either wraps or reimplements PhysFS and then add this silly Filesystem<BIG> interface on top of it.

Or says we said we're gonna remove sf::Vector2  ;D
« Last Edit: January 18, 2019, 02:41:18 am by FRex »

eXpl0it3r

  • SFML Team
  • Hero Member
  • *****
  • Posts: 11030
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #10 on: January 18, 2019, 09:32:27 am »
I have split the thread as it was clearly off-topic.

Can I also ask to have a normal discussion here, without calling people trolls or being unable to accept that others can have a different viewpoint or a different personal preference, thanks. :)
Official FAQ: https://www.sfml-dev.org/faq.php
Official Discord Server: https://discord.gg/nr4X7Fh
——————————————————————
Dev Blog: https://duerrenberger.dev/blog/

FRex

  • Hero Member
  • *****
  • Posts: 1848
  • Back to C++ gamedev with SFML in May 2023
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #11 on: January 18, 2019, 02:22:07 pm »
It's hard not to get a trolling intent from someone when they say things like these with such conviction.

SFML uses own enums, typedefs, void * for memory buffers and structs too. The 'make filesystem only read one kind of endian ever' is a crazy idea.

And PHYS is well behaved with const char *. It never tells you to free a returned const char * so you can always assign a returned one to std::string without a leak and it never frees or keeps one you pass in so you can use c_str.

It's as interoperable as it could ever be without returning and taking std::string itself (which would let you not call c_str and that's it, and forgetting c_str now is a compiler error so it's not like you can make a mistake and have your program break in weird ways).
« Last Edit: January 18, 2019, 02:37:56 pm by FRex »

DeathRay2K

  • Newbie
  • *
  • Posts: 24
    • View Profile
    • Email
Re: PhysicsFS vs an SFML virtual filesystem
« Reply #12 on: January 18, 2019, 09:02:22 pm »
Thanks eXpl0it3r, it was definitely way off-topic. There's a typo in the title though  ;)

FRex, I would argue that SFML is in many ways an abstraction over several other C libraries, providing a cohesive interface into their disparate functions in a way that enables them to easily work together. I don't see the issue with even considering adding PhysicsFS, or a virtual filesystem more generally, to the collection of functionalities that SFML abstracts.

And I hadn't thought of reading different formats with endianness expectations. I was mainly considering serialization for your own data. In this use case, you only ever want or need to read and write one endianness, regardless of the endianness of the native system. So for serializing your own data, it's a huge benefit to set that in one place and have it automated for all reads and writes. Most other data formats you're reading and writing are already handled by SFML, so I would suggest there's fairly limited utility in broader endian support. A VirtualFile class could easily pass a real filesystem path to SFML's reading/writing utilities in order to handle those resources. Or they could be integrated in the VirtualFile class so you don't need to pass them from one class to another yourself. Such as:
auto file = filesystem.GetFile("file/path/here.ext");
sf::Texture texture = file.ReadTexture();
 

However, in the case of reading disparate formats that have endianness requirements, I agree that it would have to handle endianness on a file by file basis. In this case, I would suggest using functions in the VirtualFilesystem class, such as OpenFile (for native endianness), OpenFileBig, and OpenFileLittle.
I still think this would be a big improvement over PhysicsFS's functions.

And a virtual filesystem is really a very simple piece of code to write. I've written my own before because I didn't want to deal with PhysicsFS's particulars, and it's about as much work to wrap (usefully, not just a thin wrapper) it as it is to just write your own. So it would be relatively easy for SFML to provide one that's better integrated with other SFML components.

And just to take a moment to address some of the more personal attacks:
It might be helpful to reread what I've actually said, and notice that much of what you accuse me of did not come from what I wrote, but rather what you've inferred far beyond my own words.
For instance, I never said sf::Vector2 was being removed, I said that it's not out of the question to replace SFML types and utilities that are better handled with standards-based analogues, and pointed out that there has been discussion and agreement towards this end with other pieces of SFML.
There are other examples of words being put in my mouth in this way, but I really think it's beside the point of this discussion to go further into that.
« Last Edit: January 18, 2019, 09:26:59 pm by DeathRay2K »

Mario

  • SFML Team
  • Hero Member
  • *****
  • Posts: 879
Re: PhysicsFS vs an SFML virutal filesystem
« Reply #13 on: February 20, 2019, 10:54:10 am »
Probably worth noting that integration of PhysFS into SFML isn't that hard. Haven't used this code in ages, but I'd assume it should still work out of the box, integrating neatly: https://github.com/SFML/SFML/wiki/Source%3A-SFML-PhysFS

 
