Hello,
I've recently started to play around with Nim and am really enjoying it. The developers have done a superb job creating a practical and performant language and I'm sure that it will become very popular after it hits 1.0! I have been trying to write a reader for a specific type of data file that is used in my field called a LAS file, which is used to store LiDAR data. I've run into a problem with trying to read the header of these data and have finally narrowed it down to what looks to me like an offset in the byte sequence of the header data structure when read compared with what is stored on the disk. Ultimately, there seems to be a disparity between the size of the header data structure and what Nim sees as the size of it. I've created a simple example to illustrate this, given below, and wonder if this is perhaps a bug with Nim or if perhaps I am seeing something incorrectly.
type
  headerData = tuple[fileSignature: array[4, char],
                     fileSourceID: uint16,
                     globalEncoding: uint16,
                     projectID1: uint32,
                     projectID2: uint16,
                     projectID3: uint16,
                     projectID4: uint64,
                     majorVersion: uint8,
                     minorVersion: uint8,
                     systemIdentifier: array[32, char],
                     generatingSoftware: array[32, char],
                     fileCreationDay: uint16,
                     fileCreationYear: uint16,
                     headerSize: uint16,
                     offsetToPoints: uint32,
                     numberOfVLRs: uint32,
                     pointFormatID: uint8,
                     pointRecordLength: uint16,
                     numberOfPoints: uint32]

# This is the size of headerData in bytes; should be 111
var totalSize = 4+2+2+4+2+2+8+1+1+32+32+2+2+2+4+4+1+2+4
echo "Size of the header tuple = ", totalSize

proc main() =
  var h: headerData
  h.fileSignature = ['L', 'A', 'S', 'F']
  h.fileSourceID = 1'u16
  h.globalEncoding = 0'u16
  h.projectID1 = 1'u32
  h.projectID2 = 0'u16
  h.projectID3 = 0'u16
  h.projectID4 = 0'u64
  h.majorVersion = 1'u8
  h.minorVersion = 2'u8
  h.systemIdentifier = ['O', 'T', 'H', 'E', 'R', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
  h.generatingSoftware = ['L', 'P', '3', '6', '0', ' ', 'f', 'r', 'o', 'm', ' ', 'Q', 'C', 'o', 'h', 'e', 'r', 'e', 'n', 't', ' ', 'S', 'o', 'f', 't', 'w', 'a', 'r', 'e', ' ', ' ', ' ']
  h.fileCreationDay = 34'u16
  h.fileCreationYear = 2016'u16
  h.headerSize = 227'u16
  h.offsetToPoints = 1487'u32
  h.numberOfVLRs = 3'u32
  h.pointFormatID = 0'u8
  h.pointRecordLength = 20'u16
  h.numberOfPoints = 1_000_000'u32
  var sizeOfH = sizeof(h)
  echo "Size of 'h' = ", sizeOfH # will print 112, even though it should be 111
  echo sizeOfH == totalSize # prints false

main()
I've tested this with other arbitrary tuples and it works as expected. It just seems to be this particular structure and I'm starting to wonder if I'm going crazy. I appreciate your advice.
Regards,
jlindsay
It is the pointFormatID which takes 16 Bit. Afaik you can't make assumptions about the memory layout here!
You can use the {.packed.} pragma, which is available for objects (and only for objects, it seems), like this:
type
  headerData {.packed.} = object
    fileSignature: array[4, char]
    fileSourceID: uint16
    globalEncoding: uint16
    projectID1: uint32
    projectID2: uint16
    projectID3: uint16
    projectID4: uint64
    majorVersion: uint8
    minorVersion: uint8
    systemIdentifier: array[32, char]
    generatingSoftware: array[32, char]
    fileCreationDay: uint16
    fileCreationYear: uint16
    headerSize: uint16
    offsetToPoints: uint32
    numberOfVLRs: uint32
    pointFormatID: uint8
    pointRecordLength: uint16
    numberOfPoints: uint32

# This is the size of headerData in bytes; should be 111
var totalSize = 4+2+2+4+2+2+8+1+1+32+32+2+2+2+4+4+1+2+4
echo "Size of the header object = ", totalSize

proc main() =
  var h: headerData
  h.fileSignature = ['L', 'A', 'S', 'F']
  h.fileSourceID = 1'u16
  h.globalEncoding = 0'u16
  h.projectID1 = 1'u32
  h.projectID2 = 0'u16
  h.projectID3 = 0'u16
  h.projectID4 = 0'u64
  h.majorVersion = 1'u8
  h.minorVersion = 2'u8
  h.systemIdentifier = ['O', 'T', 'H', 'E', 'R', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
  h.generatingSoftware = ['L', 'P', '3', '6', '0', ' ', 'f', 'r', 'o', 'm', ' ', 'Q', 'C', 'o', 'h', 'e', 'r', 'e', 'n', 't', ' ', 'S', 'o', 'f', 't', 'w', 'a', 'r', 'e', ' ', ' ', ' ']
  h.fileCreationDay = 34'u16
  h.fileCreationYear = 2016'u16
  h.headerSize = 227'u16
  h.offsetToPoints = 1487'u32
  h.numberOfVLRs = 3'u32
  h.pointFormatID = 0'u8
  h.pointRecordLength = 20'u16
  h.numberOfPoints = 1_000_000'u32
  var sizeOfH = sizeof(h)
  echo "Size of 'h' = ", sizeOfH # prints 111 now that the object is packed
  echo sizeOfH == totalSize # prints true

main()
Which shows 111 for both.
See: http://nim-lang.org/docs/manual.html#foreign-function-interface-packed-pragma
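To see where the extra byte comes from, here is a minimal sketch using two made-up three-field types (AlignedRec and PackedRec are hypothetical names, not part of the LAS header). It assumes a Nim version recent enough to provide offsetOf in the system module, and a mainstream platform where a uint16 is aligned to 2 bytes and a uint32 to 4.

```nim
# Hypothetical three-field types, used only to make the padding visible.
type
  AlignedRec = object
    tag: uint8        # 1 byte
    length: uint16    # 2 bytes, normally placed at an even offset
    count: uint32     # 4 bytes, normally placed at a multiple of 4

  PackedRec {.packed.} = object
    tag: uint8
    length: uint16    # follows tag directly, no padding byte
    count: uint32

echo "AlignedRec: size = ", sizeof(AlignedRec),
     ", offset of length = ", offsetOf(AlignedRec, length)
echo "PackedRec:  size = ", sizeof(PackedRec),
     ", offset of length = ", offsetOf(PackedRec, length)
```

On a typical x86-64 build I would expect sizes of 8 and 7, with length at offset 2 in the unpacked type and offset 1 in the packed one. The same thing happens in the full header: pointFormatID ends on an odd offset, so one padding byte is placed before pointRecordLength, which gives 112 instead of 111.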
Hello OderWat,
Thank you very much for your quick reply. I'm a little confused when you say 'It is the pointFormatID which takes 16 Bit' because pointFormatID is a uint8 in the structure. At this point I'm sure that I'm missing something obvious and am going to feel foolish. Can you explain? In any event, it appears that your 'packed' pragma suggestion is exactly what I needed. I've spent the last two days trying to solve this problem so I am immensely grateful to you.
Regards,
jlindsay
Thank you for explaining that, Def. I thought that it might be something along those lines but have never before encountered a situation quite like this. I'm concerned about the warning in the packed pragma description that OderWat linked to, which states, "Combining packed pragma with inheritance is not defined, and it should not be used with GC'ed memory". Do you think this will cause problems?
I must say, I've been looking at things on this forum for the last week or so but finally decided to take the dive today in posting this. I am really pleased with how friendly the community is to newcomers. I'm so thankful that I've discovered Nim.
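On the GC question, a rough sketch of how the packed header might be filled straight from disk. As I read the manual's warning, it concerns packed objects that contain GC'ed references (ref, string, seq); the header here holds only fixed-size integers and char arrays and lives on the stack. MiniHeader, readMiniHeader and the file name mini_header.bin are made up for illustration, only the first three header fields are included, and the code assumes a little-endian host (LAS files are little-endian).

```nim
# Hypothetical cut-down header: only the first three LAS header fields.
type
  MiniHeader {.packed.} = object
    fileSignature: array[4, char]
    fileSourceID: uint16
    globalEncoding: uint16

proc readMiniHeader(path: string): MiniHeader =
  ## Reads sizeof(MiniHeader) raw bytes from the start of the file
  ## directly into the packed object.
  var f = open(path, fmRead)
  defer: f.close()
  if f.readBuffer(addr result, sizeof(result)) != sizeof(result):
    raise newException(IOError, "file too short for header")

# Write an 8-byte sample file so the sketch can run without a real LAS file.
writeFile("mini_header.bin", "LASF\x01\x00\x00\x00")

let h = readMiniHeader("mini_header.bin")
echo "signature = ", h.fileSignature
echo "fileSourceID = ", h.fileSourceID
echo "globalEncoding = ", h.globalEncoding
```

Because the object has no ref, string or seq fields, nothing in it is managed by the garbage collector, so overwriting its bytes with readBuffer should be safe. On a big-endian host the multi-byte fields would need byte swapping after the read.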