You probably already know that a floating point number is represented in
binary as sign * significand * radix^exponent. Thus you can represent
the number 20.23 with radix 10 (i.e. base 10) as 1 * .2023 * 10^2 or
as 1 * .02023 * 10^3.
Since one number can be represented in different ways, we define the
normalised version as the one that satisfies
``1/radix <= significand < 1``. You can read that as saying "the
leftmost digit in the significand should not be zero".
So when we convert into binary (base 2) rather than base 10, saying
that the "leftmost digit should not be zero" means it can only
be one. In fact, the IEEE standard "hides" the 1 because it is implied
by a normalised number, giving you an extra bit for more precision in
the significand.
So to normalise a floating point number you have to shift the significand
left until the first digit is a one, adjusting the exponent to compensate
for each shift. This is
something that the hardware can probably do very fast, since it has to
do it a lot. Combine this with an architecture like IA64 which has a 64
bit significand, and you've just found a way to do a really cool
implementation of "find the first bit that is not zero in a 64 bit
value", a common operation when working with bitfields (it was
David Mosberger who originally came up with that idea in the kernel).
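
#include <stdio.h>

/* Copy the exponent field of a floating point register into a general register. */
#define ia64_getf_exp(x) \
({ \
    long ia64_intri_res; \
    \
    asm ("getf.exp %0=%1" : "=r"(ia64_intri_res) : "f"(x)); \
    \
    ia64_intri_res; \
})

int main(void)
{
    long double d = 0x1UL;  /* converting to floating point normalises the value */
    long exp;

    exp = ia64_getf_exp(d);
    /* subtract the exponent bias to recover the bit position */
    printf("The first non-zero bit is bit %ld\n", exp - 65535);
    return 0;
}
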
Note the processor is using 82 bit floating point registers,
with a 17 bit exponent field. The exponent is stored with a bias of
0xFFFF (65535) so it can represent positive and negative exponents (i.e.
zero is represented by 65535, 1 by 65536 and -1 by 65534) without an
explicit sign bit of its own.
IA64 uses the floating point registers in other interesting ways too.
For example, the clear_page() implementation in the kernel spills
zeroed floating point registers into memory because that provides you
with the maximum memory bandwidth. The libc bzero() implementation
does a similar thing.