Processor Cores - Heterogeneous Multicore Processor Technologies for Embedded Systems


The rounding rule of conventional floating-point operations is strictly defined by the ANSI/IEEE 754 floating-point standard. The rule is that each operation must produce the result of rounding the exact value. However, every instruction performs its own rounding, and the accumulated rounding error sometimes becomes very serious. Therefore, where necessary, a program must avoid such errors itself without relying on the hardware. The sequence of one FMUL and three FMACs can also cause a serious rounding error. For example, the following formula evaluates to zero if we add the terms in the order written using FADD instructions:

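$$1.0 \times 2^{127} + 1.\mathrm{FFFFFE} \times 2^{102} + 1.\mathrm{FFFFFE} \times 2^{102} - 1.0 \times 2^{127}$$

However, the exact value is $1.\mathrm{FFFFFE} \times 2^{103}$, and the error is $1.\mathrm{FFFFFE} \times 2^{-24}$ times the maximum term.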
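The effect of the operation order can be checked with a small Python sketch. It is only a model, not the hardware: the helper names `f32`, `fadd`, and `sum_in_order` are hypothetical, and a single-precision add is assumed to behave like a double-precision add followed by rounding to binary32. That assumption is safe here because every intermediate sum is exact in double precision.

```python
import struct
from fractions import Fraction

def f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fadd(a: float, b: float) -> float:
    """Model of one single-precision add: one rounding per instruction."""
    return f32(a + b)

def sum_in_order(terms):
    """Add the terms left to right, rounding after every addition."""
    total = terms[0]
    for t in terms[1:]:
        total = fadd(total, t)
    return total

BIG = float.fromhex("0x1.0p127")         # 1.0 x 2^127
SMALL = float.fromhex("0x1.FFFFFEp102")  # 1.FFFFFE x 2^102

# Order of the formula: each small term is rounded away against 2^127.
in_formula_order = sum_in_order([BIG, SMALL, SMALL, -BIG])

# Reordered: cancel the large terms first, then add the small ones.
reordered = sum_in_order([BIG, -BIG, SMALL, SMALL])

# Exact reference computed with rational arithmetic.
exact = Fraction(BIG) + Fraction(SMALL) + Fraction(SMALL) - Fraction(BIG)

print(in_formula_order)              # 0.0
print(Fraction(reordered) == exact)  # True
print(exact / Fraction(BIG) == Fraction(float.fromhex("0x1.FFFFFEp-24")))  # True
```

Both evaluation orders use only individually rounded additions, yet only the reordered sum recovers the exact value.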

We can get the exact value if we change the order of the operations appropriately. The floating-point standard defines the rule for each operation but does not define the result of a whole formula, so either result conforms to the standard. Since the FIPR operation is not defined by the standard, we defined its maximum error as "… for the formula, which causes the worst error of $2^{E-\ldots}$ + rounding error of result" to make it better than or equal to the average and worst-case errors of the equivalent sequence that conforms to the standard, where E is the maximum exponent of the four products.

A length-4 vector transformation was also a popular operation in 3D graphics, so a floating-point transform vector instruction (FTRV) was defined. In a modification-type definition it would require 20 registers to specify the operands: 16 for the matrix and 4 for the vector. Therefore, the defining formula is as follows, using a four-by-four matrix held in all the back-bank registers, XMTRX, and one of the four front-bank vector registers, FV0, FV4, FV8, and FV12:

$$\mathrm{FV}n = \mathrm{XMTRX} \cdot \mathrm{FV}n \quad (n: 0, 4, 8, 12).$$

Since a 3D object consists of many polygons expressed by length-4 vectors, and the same XMTRX is applied to many of the vectors of a 3D object, the XMTRX is changed infrequently and is well suited to the back bank.
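The defining formula can be illustrated with a short Python sketch. This is a behavioral model under stated assumptions, not the instruction's implementation: the name `ftrv`, the use of Python lists for the register banks, and the example translation matrix are all hypothetical, and Python's double-precision floats stand in for the single-precision registers, so the instruction's rounding and error bound are not modeled. Row i of XMTRX is taken as (XF[i], XF[i+4], XF[i+8], XF[i+12]), following the grouping of the XF registers used in the text.

```python
def ftrv(xf, fv):
    """Model of FVn = XMTRX . FVn: transform the 4-vector fv in place."""
    assert len(xf) == 16 and len(fv) == 4
    # The right-hand side is fully evaluated before fv is overwritten,
    # so every row sees the original input vector.
    fv[:] = [sum(xf[i + 4 * k] * fv[k] for k in range(4)) for i in range(4)]

# Hypothetical XMTRX: identity plus a translation by (10, 20, 30).
xmtrx = [0.0] * 16
for d in (0, 5, 10, 15):
    xmtrx[d] = 1.0
xmtrx[12], xmtrx[13], xmtrx[14] = 10.0, 20.0, 30.0

# The same matrix is loaded once and reused for every vector of the object.
vertices = [[1.0, 2.0, 3.0, 1.0], [0.0, 0.0, 0.0, 1.0]]
for v in vertices:
    ftrv(xmtrx, v)

print(vertices)  # [[11.0, 22.0, 33.0, 1.0], [10.0, 20.0, 30.0, 1.0]]
```

Only the vector operand changes from call to call, which is why keeping the matrix in a separate, rarely updated bank fits this usage pattern.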

The FTRV operation was implemented as four inner-product operations by dividing the XMTRX into four vectors appropriately, and its maximum error is the same as that of the FIPR. It could be replaced by four inner-product instructions if we made the input and output registers different, so that the input value is preserved during the transformation. The formula would then become as follows:

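$$\begin{aligned}
\mathrm{FR}[n] &= (\mathrm{XF0}\;\;\mathrm{XF4}\;\;\mathrm{XF8}\;\;\mathrm{XF12}) \cdot \mathrm{FV}m \\
\mathrm{FR}[n+1] &= (\mathrm{XF1}\;\;\mathrm{XF5}\;\;\mathrm{XF9}\;\;\mathrm{XF13}) \cdot \mathrm{FV}m \\
\mathrm{FR}[n+2] &= (\mathrm{XF2}\;\;\mathrm{XF6}\;\;\mathrm{XF10}\;\;\mathrm{XF14}) \cdot \mathrm{FV}m \\
\mathrm{FR}[n+3] &= (\mathrm{XF3}\;\;\mathrm{XF7}\;\;\mathrm{XF11}\;\;\mathrm{XF15}) \cdot \mathrm{FV}m
\end{aligned}$$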
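The reason the input and output registers must differ can be seen in a small Python sketch. The function names, the list-based register model, and the example matrix are hypothetical and chosen only for illustration; Python floats stand in for the single-precision registers, and rounding is not modeled.

```python
def inner(xf, row, v):
    """Inner product of row `row` of XMTRX, (XF[row], XF[row+4], ...), with v."""
    return sum(xf[row + 4 * k] * v[k] for k in range(4))

def transform_separate(xf, fv):
    """Four inner products written to a separate output vector."""
    return [inner(xf, i, fv) for i in range(4)]

def transform_overwriting(xf, fv):
    """Four inner products that overwrite their own input one element at a time."""
    for i in range(4):
        fv[i] = inner(xf, i, fv)  # later rows now read a modified input
    return fv

def pack_column_major(rows):
    """Store a 4x4 matrix given as rows so that element (i, k) lands in XF[i + 4k]."""
    return [rows[i][k] for k in range(4) for i in range(4)]

# Hypothetical matrix that swaps the first two components.
swap_xy = pack_column_major([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

correct = transform_separate(swap_xy, [1.0, 2.0, 3.0, 4.0])
corrupted = transform_overwriting(swap_xy, [1.0, 2.0, 3.0, 4.0])

print(correct)    # [2.0, 1.0, 3.0, 4.0]
print(corrupted)  # [2.0, 2.0, 3.0, 4.0]
```

In the overwriting version the second inner product reads the already replaced first element, so the original input is lost. A single instruction that reads the whole vector before writing any result does not have this problem.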

The above inner-product operations were different from that of the FIPR in the
