Hardware Reference
In-Depth Information
Table 1-4. A Table of Integers
A    B
1    1
2    2
3    3
4    4
Take, for example, a nonvector processor adding each number in range A to the matching number in range B: it would need to loop
through this table five times, once per pair. A vector processor loads all the values from ranges A and B and adds them in one loop, as shown
in Figure 1-4.
Simple addition function without vector processing
1. Read and decode
2. Fetch number A1
3. Fetch number B1
4. Add number A1 and number B1
5. Push result
Simple addition function with vector processing
1. Read and decode
2. Fetch number A1 to A5
3. Fetch number B1 to B5
4. Add ranges
5. Push result
Figure 1-4. Flow chart showing the difference between vector and nonvector function execution
So, given Table 1-4 and Figure 1-4, we can see that the nonvector processor would take five loops of steps 1 to 5
to add all the numbers from each range. The vector-enabled processor can do the same work in a single loop
of steps 1 to 5.
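The difference in loop counts can be modeled with a short Python sketch. This is only an illustration: Python does not issue vector instructions itself, the function names are made up for this example, and the five values per range are assumed so that they line up with A1 to A5 and B1 to B5 in Figure 1-4.

```python
# Illustrative model only: it counts how many times the five-step cycle
# from Figure 1-4 would run, it does not use real vector hardware.

A = [1, 2, 3, 4, 5]  # assumed values for A1 to A5
B = [1, 2, 3, 4, 5]  # assumed values for B1 to B5


def add_nonvector(a, b):
    """Add one pair per pass: each pass stands for steps 1 to 5."""
    results = []
    passes = 0
    for x, y in zip(a, b):
        results.append(x + y)  # fetch one A, fetch one B, add, push
        passes += 1
    return results, passes


def add_vector(a, b):
    """Model a vector add: both ranges are fetched and added in one pass."""
    results = [x + y for x, y in zip(a, b)]  # stands in for "add ranges"
    return results, 1


scalar_sums, scalar_passes = add_nonvector(A, B)
vector_sums, vector_passes = add_vector(A, B)
print(scalar_sums, scalar_passes)  # [2, 4, 6, 8, 10] 5
print(vector_sums, vector_passes)  # [2, 4, 6, 8, 10] 1
```

Both functions return the same sums; only the number of passes through the five steps differs.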
In the Raspberry Pi world, you will see this referred to as hard float or soft float. Hard float makes use of the
coprocessor; soft float does the same operation in software. Wherever possible, you should try to
get hard float enabled to help speed up your applications and give your Raspberry Pi's CPU more free cycles.
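One rough way to see whether the coprocessor is present is to look for the vfp flag on the Features line that ARM Linux kernels report in /proc/cpuinfo. The sketch below parses a sample of that text; the helper name and the sample contents are assumptions for illustration, not output captured from a particular board. Note that it only shows the hardware is there. Whether your software actually uses hard float depends on how the operating system and its packages were built.

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags listed on the 'Features' line(s)."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


# Hypothetical sample in the style of /proc/cpuinfo on an ARM11 board.
SAMPLE_CPUINFO = """\
Processor : ARMv6-compatible processor rev 7 (v6l)
Features  : swp half thumb fastmult vfp edsp java tls
"""

has_vfp = "vfp" in cpu_features(SAMPLE_CPUINFO)
print(has_vfp)  # True
```

On a running Raspberry Pi you could pass in the real file contents instead, for example `cpu_features(open("/proc/cpuinfo").read())`.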
Caches
As you can see from Figure 1-2, the ARM11 provides separate data and instruction caches (a cache is an area of very fast
memory that can be directly accessed by the main CPU). On the Raspberry Pi these are sometimes known as “D-cache”
and “I-cache.” In the case of the ARM1176JZF-S, each cache is 16 KB in size and four-way set associative, and each of the
four ways can be locked independently. The I-cache and D-cache make up what is commonly called the L1 cache.
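To make the 16 KB, four-way figures more concrete, the short calculation below works out how such a cache divides up. The 32-byte line length is an assumption (it is the usual ARM11 value but is not stated in the text above), and the function name is made up for this sketch.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (total lines, sets, bytes per way) for a set-associative cache."""
    lines = size_bytes // line_bytes  # how many cache lines fit in total
    sets = lines // ways              # each set holds one line per way
    way_bytes = size_bytes // ways    # capacity of a single way
    return lines, sets, way_bytes


L1_SIZE = 16 * 1024  # 16 KB per cache, from the text
WAYS = 4             # four-way associative, from the text
LINE_BYTES = 32      # assumed cache line length

lines, sets, way_bytes = cache_geometry(L1_SIZE, WAYS, LINE_BYTES)
print(lines, sets, way_bytes)  # 512 128 4096
```

Under that assumption each cache holds 512 lines arranged as 128 sets of four, so locking one way would set aside 4 KB of the 16 KB.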
Hmm, 16 KB seems pretty small. Well, it's not so bad for an L1 cache when you compare it with the Intel Core i7.
The Intel L1 data and instruction caches are only 32 KB each per core, so with 16 KB plus 16 KB the Raspberry Pi is not doing too badly for itself.
Next up you would see figures for the level 2 cache, or L2 cache. Staying with the same Intel Core i7 example, the
i7 has 256 KB of L2 cache per core; the 8 MB to 16 MB usually quoted for it is the shared level 3 (L3) cache.
 
 