Rewaz said:
But how do u handle chinese names, for example? Or the text localization in the files.
Adding a tiny bit to Alberth's excellent answer: be careful that you're not conflating different things, and do your best to build systems in a way that makes the difference irrelevant.
When you're programming, stop thinking of things like filenames or UI text as words and symbols. It is far better to think of them as generic blobs of data which happen to be meaningful to humans.
If you're in generic C++, there are now system-agnostic types like the std::filesystem::path data type, which works automatically with 8-bit, 16-bit, and 32-bit string formats. You don't need to be bothered with the underlying structure, nor with whether the system uses a forward slash, a backslash, or some other separator, nor with any other system-specific elements.
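To make the std::filesystem::path point concrete, here is a minimal sketch, assuming C++17 or later. The helper name make_save_path and the directory and file names are made up for illustration; the only real API used is std::filesystem::path and its operator/.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: joins a user-chosen directory and a file name.
// Both arguments are opaque "somewhere" blobs; the code never inspects
// individual characters or assumes a particular separator.
fs::path make_save_path(const fs::path& save_dir, const fs::path& file_name)
{
    // operator/ inserts the platform's preferred separator when one is needed.
    return save_dir / file_name;
}
```

Because fs::path can be constructed from char, char16_t, and char32_t strings, a caller can write make_save_path("saves", "slot1.dat") or make_save_path(u"\u5B58\u6863", U"slot1.dat") and the helper itself doesn't change.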
In Windows, that can also mean a generic 8-bit string for the “A” functions like CreateFileA(), FindFirstFileA(), CreateDirectoryA(), and related functions, or the 16-bit wchar_t string and the “W” functions like CreateFileW(), FindFirstFileW(), CreateDirectoryW(), and related functions. Windows has a bunch of functionality that automatically handles them, with Axxx and Wxxx variants, plus the TCHAR-style generic names that map to one or the other at build time.
In an Apple macOS environment, or a Linux environment, wchar_t is 32 bits and the blob of text for the file name is adjusted accordingly. Linux has used UTF-8 for file names almost from the very beginning, which is perfectly capable of encoding Unicode file names.
But ultimately, if you've done it right you don't care. Files are located “somewhere”. You can throw up a dialog box that lets the user point to “somewhere else”. Done well, you shouldn't care if they're located on a hard drive, a USB drive through a chain of 17 USB hubs, or a network location on a different continent. You don't care if the file name is in English, Spanish, Korean, Japanese, or Klingon; all of them are a blob representing “somewhere”. You pass the argument that says “somewhere”, and the functions know how to interpret “somewhere” and create a file pointer out of it.
When you are displaying blobs of text on the screen, allow for translators to do their thing, for artists to do their thing around fonts, and for Unicode rendering to do its thing. You do that by not manipulating strings for UI. Let them be driven by string tables, and pass along whatever blob of text is in the string table. This way you don't know or care what's actually displayed; all you care about is the tag for the data.
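As a rough illustration of the string table idea, here is a minimal sketch. The StringTable class and the tag names are hypothetical and not taken from any particular engine; a real system would load the entries from per-language data files.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical string table: maps a stable tag to whatever text the
// translators supplied for the active language.
class StringTable {
public:
    void set(std::string tag, std::string text) {
        entries_[std::move(tag)] = std::move(text);
    }

    // Returns the translated blob unchanged. Falls back to the tag itself so
    // a missing translation shows up on screen instead of a blank label.
    std::string lookup(const std::string& tag) const {
        auto it = entries_.find(tag);
        return it != entries_.end() ? it->second : tag;
    }

private:
    std::unordered_map<std::string, std::string> entries_;
};
```

UI code would only ever say something like table.lookup("menu.start"). Whether that comes back as "Start", "Empezar", or something in another script entirely is not the programmer's concern.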
In Unreal, they're all put into FText instances that can all be customized by translation. Functions like FText::Format() allow translators to reposition strings, change word forms based on plurality or gender, and more, but the programmer doesn't need to know or care about it; it's just an FText object that represents a blob of text meaningful to the user. As one of many examples in Unreal, following the simple rules lets translators do things like "You came {Place}{Place}|ordinal(one=st,two=nd,few=rd,other=th)!" to make "You came 1st!" or "You came 3rd!", or, for gendered languages, choose between a male warrior guerrero and a female warrior guerrera based on tags.
The point of all this is: Build your systems in a way that it doesn't matter.