From: Kürşat on
Hi,

As far as I know a 32-bit CPU can access 4 bytes of memory at once. If this
is true why should we 2-byte align a short variable? For example, when a
32-bit CPU reads a dummy address 0 it gets 0-1-2-3 at once. Alignment rule
says "you should 2-byte align a short so you can put a short only into
address-0 or address-2", but what is wrong with address-1 while the CPU can
still read a short at address-1 at once?

Thanks in advance.


From: Doug Harrison [MVP] on
On Mon, 28 Apr 2008 01:27:11 +0300, "Kürşat" <xx(a)yy.com> wrote:

>Hi,
>
>As far as I know a 32-bit CPU can access 4 bytes of memory at once. If this
>is true why should we 2-byte align a short variable? For example, when a
>32-bit CPU reads a dummy address 0 it gets 0-1-2-3 at once. Alignment rule
>says "you should 2-byte align a short so you can put a short only into
>address-0 or address-2", but what is wrong with address-1 while the CPU can
>still read a short at address-1 at once?

Two reasons:

1. Speed. Even if a CPU supports two-byte integers at odd addresses,
accessing them may be slower than if they were aligned at even addresses.

2. Correctness. Some CPUs trap when accessing misaligned data.
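
To make the two reasons above concrete: a compiler that follows the
platform's alignment rules places a short at a suitably aligned offset on
its own, inserting padding where needed. A minimal C11 sketch follows;
struct sample and the helper names are made up for illustration, and the
exact offset is implementation-defined (2 is what typical targets give).

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical struct for illustration: a char followed by a short. */
struct sample {
    char  tag;    /* occupies offset 0 */
    short value;  /* the compiler pads so this lands on an aligned offset */
};

/* Byte offset the compiler chose for `value` inside struct sample. */
size_t value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Alignment requirement of short on this target (C11 _Alignof). */
size_t short_alignment(void)
{
    return _Alignof(short);
}

/* Prints the layout so the padding can be seen on a given compiler. */
void print_layout(void)
{
    printf("offset of value: %zu, alignment of short: %zu, sizeof struct: %zu\n",
           value_offset(), short_alignment(), sizeof(struct sample));
}
```

Calling print_layout() from main shows what a particular compiler does;
the only thing the language guarantees is that the offset is a multiple
of the alignment of short.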

--
Doug Harrison
Visual C++ MVP


From: Kürsat on
Obviously, but why? Doesn't a 32-bit CPU have 32 address lines, so it can
locate any byte in 4GB of RAM?


From: abudovski on
On Apr 28, 5:13 pm, "Kürsat" <x...(a)yy.com> wrote:
> Obviously, but why? Doesn't a 32-bit CPU have 32 address lines, so it can
> locate any byte in 4GB of RAM?

Some CPUs operate with soft alignment (SA), others with hard alignment
(HA). The Intel x86 family operates in soft alignment by default;
alignment checking is only enforced when the AC flag in EFLAGS is set
(and the OS has enabled the AM bit in CR0). Some other processors only
operate in HA.

While the CPU can address any byte, for HA, it cannot read a DWORD
operand that is misaligned (that is, a DWORD at an address not
divisible by 4). If you try, the processor raises an exception, which
the OS usually uses to terminate your application.
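
The "divisible by 4" rule above can be written down directly. The sketch
below also shows one common way C code avoids the trap, which is to copy
the bytes with memcpy instead of dereferencing a misaligned pointer. The
function names are hypothetical, and this is only an illustration, not
something specific to any one CPU.

```c
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Returns 1 if addr is a valid address for an aligned DWORD
   (a multiple of 4), 0 otherwise. */
int is_dword_aligned(uintptr_t addr)
{
    return (addr % 4u) == 0;
}

/* Reads a 32-bit value from any address, aligned or not.
   memcpy copies byte by byte as far as the language is concerned,
   so no misaligned DWORD access is required of the hardware. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

On targets that tolerate misaligned loads, compilers commonly turn the
memcpy into a single load; on strict targets they emit byte accesses.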

In SA, if you try to read a DWORD at address 2, the CPU must
perform 2 aligned reads, bytes 0-3 and 4-7, and fix up the result to
form a single DWORD.

This obviously incurs a performance penalty, and hence is undesirable.
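
The two aligned reads plus fixup can be modelled in software. This is a
rough sketch, assuming a little-endian machine and a memory that can
only be read one aligned 32-bit word at a time; read_aligned and
read_dword are made-up names, and real hardware does this in the bus
interface, not in code.

```c
#include <stdint.h>

/* Simulated memory access: only whole, aligned 32-bit words can be read.
   addr must be a multiple of 4; mem[i] holds bytes 4*i .. 4*i+3,
   with the lowest-addressed byte in the low 8 bits (little-endian). */
uint32_t read_aligned(const uint32_t *mem, uint32_t addr)
{
    return mem[addr / 4];
}

/* Reads a DWORD at any byte address using only aligned reads. */
uint32_t read_dword(const uint32_t *mem, uint32_t addr)
{
    uint32_t base  = addr - addr % 4;   /* start of the containing word */
    uint32_t shift = (addr % 4) * 8;    /* bits to discard from that word */
    uint32_t lo    = read_aligned(mem, base);

    if (shift == 0)
        return lo;                      /* aligned: one read is enough */

    /* Misaligned: a second aligned read, then merge the two halves. */
    uint32_t hi = read_aligned(mem, base + 4);
    return (lo >> shift) | (hi << (32 - shift));
}
```

With bytes 0-7 holding 00 11 22 33 44 55 66 77, read_dword at address 2
reads words 0-3 and 4-7 and returns 0x55443322, at the cost of the extra
read and the shifting.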


From: Kürsat on
OK, if I understand correctly, a 32-bit CPU can generate any 32-bit address
using its address lines, but it cannot use every address to read data from
RAM. Excellent, but I still can't grasp the reason. The CPU can put any
address on the address bus, and the memory controller should read bytes
starting from that address and put them on the data bus. Some limitation of
the CPU or the memory controller must be causing these alignment issues.