nimforum mirror - Why does os.walkDir and consequently os.walkDirRec not use filesystem order?

ITwrx (original) [2023-05-26T16:09:09+02:00]

My dev environment is Fedora. walkDirRec walks directories in seemingly random order. I assumed it would traverse directories and files in the order that the OS filesystem uses. Why does it not do that? Where is this randomness/order coming from?

thanks

JohnAD (original) [2023-05-26T17:33:17+02:00]

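It depends on the filesystem. I would expect to see the same order you would see if you ran ls -U at the console, since walkDir simply yields entries in the order the OS returns them. I think ext4 uses some kind of hashed B-tree (HTree) for its directory index, so entries come back in hash order rather than by name or creation time.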

I would recommend sorting it if you plan on then displaying it to the user as a list. If you know it is proper ASCII, then a simple sort will do.

To mimic ls in the C locale (LC_ALL=C ls), you would do a byte-level sort on the entries. In other locales ls sorts using the locale's collation rules.

Also, a simple Rune-point sort is possible. (A < B < a < b)
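A minimal sketch of the "just sort it" advice, assuming you only want file paths and that byte order is acceptable. The proc name sortedEntries is made up for this example. Nim's built-in string comparison is byte-wise, and for valid UTF-8 that gives the same result as a code-point sort.

```nim
import std/[os, algorithm]

proc sortedEntries(dir: string): seq[string] =
  ## Hypothetical helper: collects every file path under `dir`
  ## (walkDirRec yields files only by default) and sorts them.
  for path in walkDirRec(dir):
    result.add path
  # The default comparison for strings is byte-wise, so the result
  # no longer depends on the order the filesystem returned.
  result.sort()

when isMainModule:
  for path in sortedEntries(getCurrentDir()):
    echo path
```

Note that this sorts full paths, so "/" takes part in the comparison; names containing characters that sort below "/" (such as "-" or ".") can end up ahead of a sibling directory's contents.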

Unfortunately, you cannot do an alphabetical sort on UTF8 yet, as the Unicode Collation Algorithm (which uses the DUCET table) has not yet been written for Nim. (A < a < B < b)
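For names that are known to be plain ASCII, the A < a < B < b order can be approximated without any Unicode collation support. This is only a sketch: cmpCaseFolded is a made-up name, and it is not a substitute for real locale-aware collation because non-ASCII bytes are compared as-is.

```nim
import std/[algorithm, strutils]

proc cmpCaseFolded(a, b: string): int =
  ## ASCII-only approximation of alphabetical order.
  ## This is NOT the Unicode Collation Algorithm.
  result = cmpIgnoreCase(a, b)
  if result == 0:
    # Tie-break byte-wise so that "A" sorts before "a".
    result = cmp(a, b)

when isMainModule:
  var names = @["b", "B", "a", "A"]
  names.sort(cmpCaseFolded)
  echo names  # @["A", "a", "B", "b"]
```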

Araq (original) [2023-05-26T18:15:42+02:00]

> Unfortunately, you cannot do an alphabetical sort on UTF8 yet

True, but how would you know the filesystem's "language" anyway? You cannot really sort without knowing the language.

ITwrx (original) [2023-05-26T18:49:37+02:00]

> It depends on the filesystem.

whoops! ext4

> I would expect to see the same order you would see if you ran ls -U at the console.

I didn't remember/realize that ls and the GUI were sorting for me. lol.

> I would recommend sorting it if you plan on then displaying it to the user as a list.

I'm trying to make a website generator and need to determine the section/page hierarchy from the directory structure.

A simple sort should be fine. Thanks for the help!
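For the site-generator case, one possible approach is to sort the entries of each directory separately while recursing, so that siblings stay grouped under their parent. The sketch below assumes byte-order sorting of names is good enough; Node, buildTree and print are hypothetical names for this example, and symlinks are skipped.

```nim
import std/[os, algorithm, strutils]

type
  Node = object            # hypothetical type for this sketch
    name: string
    isDir: bool
    children: seq[Node]

proc buildTree(dir: string): Node =
  ## Builds a tree whose children are sorted by name (byte order),
  ## independent of the order the filesystem returns them in.
  result = Node(name: lastPathPart(dir), isDir: true)
  var entries: seq[tuple[kind: PathComponent, path: string]]
  for entry in walkDir(dir, relative = true):
    entries.add entry
  for entry in entries.sortedByIt(it.path):
    if entry.kind == pcDir:
      result.children.add buildTree(dir / entry.path)
    elif entry.kind == pcFile:
      result.children.add Node(name: entry.path, isDir: false)
    # pcLinkToDir / pcLinkToFile are ignored in this sketch.

proc print(node: Node, depth = 0) =
  ## Prints the tree with two spaces of indentation per level.
  echo repeat("  ", depth), node.name, (if node.isDir: "/" else: "")
  for child in node.children:
    print(child, depth + 1)

when isMainModule:
  print(buildTree(getCurrentDir()))
```

Prefixing directory and file names with numbers (01-about, 02-blog) is one way to control the resulting section order with a plain byte sort.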

JohnAD (original) [2023-05-29T01:36:29+02:00]

> True, but how would you know the filesystem's "language" anyway? You cannot really sort without knowing the language.

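Not sure about other environments, but on Linux the locale can be taken from LANGUAGE, LC_ALL, LC_xxx, LANG (in that order). LANGUAGE is a GNU extension that only affects message translation, so for sort order the relevant variables are LC_ALL, LC_COLLATE and LANG. Pop!_OS uses LANG. Mine is set to LANG=en_US.UTF-8.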
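A small sketch of how that lookup might be done from Nim, assuming a POSIX-style environment. collationLocale is a made-up name. It checks LC_ALL, then LC_COLLATE, then LANG, which is the precedence POSIX uses for the collation category; LANGUAGE is left out because GNU gettext uses it for message translation only.

```nim
import std/os

proc collationLocale(): string =
  ## Hypothetical helper: returns the locale name that would govern
  ## sort order, or "C" if none of the variables is set.
  for name in ["LC_ALL", "LC_COLLATE", "LANG"]:
    let value = getEnv(name)
    if value.len > 0:
      return value
  result = "C"

when isMainModule:
  echo collationLocale()
```

This only reports the locale name; actually sorting by that locale's rules still needs a collation implementation.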

The Unicode Collation Algorithm also has an "international default" ordering (the DUCET table itself) for when the language is not known, but you are right, it is better to use the specific language of the person who is viewing it. (The author's language of the filesystem entry need not be known. So, the same list of files may appear in a different order to different users based on locale.)

JohnAD (original) [2023-05-29T01:45:23+02:00]

For the curious: https://github.com/nitely/nim-unicodedb . The author is currently adding the raw DUCET data.

(Now that I think of it, I don't think Nim or any Nim library has an enum with the official list of locales in it.)

Mirror of forum.nim-lang.org

10233 :: Why does os.walkDir and consequently os.walkDirRec not use filesystem order?