C Language, C++

Structure padding and memory alignment


Memory alignment is a topic I have found confusing many times. Understanding it is very important for a software engineer who works close to memory, because ignoring alignment can cause serious issues in a program.

I have read a lot of articles and blog posts about memory alignment, but I always came away with some confusion and open questions. So I finally decided to dig through the internet to understand memory alignment and padding properly.

While reading, I found that people are often confused about the cost of misalignment. According to some reputable blogs, memory alignment affects CPU performance because the processor needs extra fetches to access unaligned memory.

So I set out to answer this question. I found that on current processors alignment is mostly a micro-optimization: modern CPUs know how to handle unaligned memory, but in some bad situations they take extra cycles to fetch it. So it is good for programmers to care about alignment when they write a program.

A real-world processor does not read or write memory byte by byte. For performance reasons, it accesses memory in chunks of 2, 4, 8, 16 or 32 bytes at a time.

On a 32-bit processor the word size is 4 bytes. If a data item lies entirely within one 4-byte word, it fits the memory alignment perfectly, but if it crosses a 4-byte boundary, the processor has to take extra cycles to fetch the data from that unaligned location.

[Image: aligned data in memory on a 32-bit system]

When memory is aligned, the processor fetches the data easily. In the image below, cases 1 and 2 show that the processor takes one cycle to access aligned data.

[Image: memory alignment]

When memory is not aligned, the processor takes some extra ticks to access it. The image below shows a 4-byte value being read from an unaligned address.

[Image: memory not aligned]

The image below shows how the processor accesses unaligned memory. When the processor is given an unaligned address, it takes the following steps to access the data.

  1. The CPU selects the unaligned data, which is shown by the dark black border.
  2. The CPU reads the two whole 4-byte words that the bordered region spans, the one above and the one below.
  3. It shifts out one byte from the upper word and three bytes from the lower word.
  4. It combines the remaining bytes of both words to get the actual data.
[Image: steps taken by the processor to access unaligned memory]

Many RISC processors raise an exception when they encounter an unaligned access, although some, such as MIPS, have special instructions to handle the situation. Unaligned memory is not a big issue for Intel x86 processors: they handle it in hardware, but sometimes take extra ticks to fetch the unaligned data.

In a program, there are mainly two properties attached to a variable: the first is its value and the second is its address. On the Intel x86 architecture the address of a variable is a multiple of 1, 2, 4 or 8. In other words, the address of a variable should be a multiple of a power of 2.

In general, the compiler handles alignment for us and places each variable on its boundary, so we usually don't need to worry about it. On the 32-bit x86 architecture the alignment of a data type is generally equal to its size.

The table below lists the alignment of some primitive data types that are frequently used in programs.

Data type    32-bit (bytes)    64-bit (bytes)
char         1                 1
short        2                 2
int          4                 4
float        4                 4
double       8                 8
pointer      4                 8

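Note: The alignment of data types is mandated by the processor architecture (and the platform ABI the compiler follows), not by the language. For example, compilers for 32-bit x86 Linux by default align a double inside a structure to 4 bytes rather than 8.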

In a structure or union, the compiler inserts some extra bytes between the members (and after the last member) for alignment. These extra unused bytes are called padding bytes, and the technique is called padding.

Padding improves the performance of the processor at the cost of memory.
Each member is aligned to its own boundary, and the structure or union as a whole is aligned to its most strictly aligned member (usually the largest one) to avoid the performance penalty.

Here are some examples to make the concept of alignment and padding clear.

Example 1:

[Image: memory layout of structure InfoData]

In the declaration above, the largest member of the structure is the integer (4 bytes), so the compiler inserts padding bytes to keep the members aligned and avoid the performance penalty.
As a result, the size of InfoData is 12 bytes.

Note: In a structure or union, we can reduce the wasted memory by arranging the members in order from largest to smallest.

Example 2:

[Image: memory layout of Element after rearranging its members]

In the above example, the size of Element is 8 bytes, which includes 2 tail padding bytes inserted by the compiler for data alignment.

Example 3:

[Image: memory layout of structure InfoData]

The largest member is a double (8 bytes), so the compiler aligns the structure on an 8-byte boundary. Here the compiler adds 6 padding bytes for alignment, so the size of InfoData is 16 bytes.

Note: We can change the alignment of a structure, union or class using the “pack” pragma directive, but doing so can become a source of compatibility issues in your program. So it is usually better to use the default packing of the compiler.

Sample programs showing how to control the packing using the pragma directive:

Example Code 1

Output:
[Image: program output]

In the above code, I have set the structure alignment to 2 bytes using the pragma pack directive.

Example Code 2

Output:
[Image: program output]

In the above code, I have set the structure alignment to 4 bytes using the pragma pack directive.

Example Code 3

Output:
[Image: program output]

Conclusion

Finally, I understand that memory alignment improves the performance of the processor, and that we have to take care of alignment for our programs to perform well. The CPU performs better with aligned data than with unaligned data because some processors take extra cycles to access unaligned data. So when we create a structure, union or class, we should arrange the members carefully for better performance.

Your opinion matters

I have tried to cover a lot of points about alignment and structure padding here, but I would like to know your opinion on the topic, so please don't forget to write a comment in the comment box.

 




2 Comments

  1. Don Carr

    First, I would say that you need to be aware of alignment and byte padding in structures. But, you should always try to let the compiler and system calls take care of it for you and never try to do it by hand unless absolutely necessary. For instance, you need to know that a call to malloc always returns a pointer to memory that is aligned for all data types. If you want to put multiple things in part of memory you allocate, define a structure for that instead of manually deciding where things will go and calculating offsets – it will in the end come out very badly. Also any shared memory segment you get is always aligned for any data type. Never ever ever send structures over sockets, or write them to files.

    Also, it is not just a matter of speed, if you try to operate on incorrectly aligned double floating point numbers, it might silently give incorrect results, or, even core dump, depending on the processor architecture and OS. The reason for byte padding in structures is so that arrays of structures will all be correctly aligned if the first one is aligned. If you did not have the byte padding to maintain alignment, and you passed a pointer to a floating point in one of the structures in the array, it might not be aligned, and, whatever you did with that pointer is undefined and bad things can happen.

    • Amlendra

      Yes, Don Carr, nowadays compilers are very smart and capable of aligning memory for us. I have checked on Keil IDE, and malloc also returns a pointer to memory that is aligned for all data types. So if malloc is used in the program, there is no need to align the allocated memory manually.
      Thanks for pointing this out regarding unaligned memory. Unaligned memory not only affects speed but can also sometimes cause a bus error, and you can face a BSOD (blue screen of death) or other issues depending on the operating system and processor architecture.
