Optimization
If you still want to do more with your tool chain, or want to try to make your code run faster, this is the section
that gives you the details. Be warned that these optimizations may break some applications and may not work on
all distributions. This section also requires you to read and understand the architecture white papers and the GCC
manuals. I will provide some examples, but you really should read at least the ARM1176 processor architecture white
paper and the GCC ARM options page for your version of GCC. The first page I open whenever I look at
optimizations is the GCC manual. In particular, look at section 3.17.2, “ARM Options.”
It's best to get the manual for the exact version of GCC you are using. The link you need is
http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/ARM-Options.html#ARM-Options . The good people who work
on GCC do a very good job on documentation. You should be happy that there are walls of text for you to read,
unlike some other larger OSS projects that have only around four lines of text in their manuals. The GCC manual,
while very good, is only half of the documentation that you need. All the options in GCC will be pretty useless if you don't
know what your CPU can support. ARM is also quite good about supplying decent levels of documentation. Take
a look at the following web page: http://arm.com/products/processors/classic/arm11/arm1176.php .
The main page gives a good brief overview of the ARM1176 processor family but, as you can guess by now, you're
going to need more detail. Select the “Resources” tab and find the link called “ARM1176JZF Development Chip Technical
Reference Manual”; this will bring you to an online version of the manual. With the ARM and GCC manuals you can now
start to work out what you can do with GCC. It's a good idea to read the whole thing. Now let's get some optimization
happening. Change back into your crosstool-NG working directory (in my case this was /home/brendan/ct/rpi ).
Ensure that you update your path setting and then launch the ct-ng menu:
ct-ng menuconfig
Select the “Target Options” menu. The first thing you should set is the “Architecture level” option, which is the
equivalent of the GCC -march option. There are many choices for this but the best choice is armv6j . This sets the
architecture type to ARMv6J; this is the closest fit to the ARM1176JZF-S CPU.
Now that you have set the family of the CPU you should also set the subfamily type. For the “Tune for CPU”
option, enter arm1176jzf-s as that's what the Raspberry Pi uses.
Next up is the type of floating point unit. You should know by now that the ARM1176JZF-S uses a vector
floating point (VFP) unit. Set this option to vfp . The next option you will see is the one everyone is talking about: the magical
hard or soft float. For the best performance you should select “hardware (FPU)”; if your distribution is still using
soft float, it's best to use the “software” option here.
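As a rough guide to what these menu entries mean at the compiler level, the following sketch prints the GCC options I would expect them to correspond to. The helper name rpi_gcc_flags is hypothetical and not part of crosstool-NG, and the mapping of “hardware (FPU)” to -mfloat-abi=hard and “software” to -mfloat-abi=soft is an assumption on my part; check the ARM options page for your GCC version before relying on it.

```shell
# Hypothetical helper (not part of crosstool-NG): print the GCC options
# that roughly correspond to the "Target Options" values chosen above.
rpi_gcc_flags() {
    # $1 is the float ABI: "hard" for a hard-float distribution,
    # "soft" for a soft-float one. Assumed mapping; defaults to hard.
    float_abi="${1:-hard}"
    case "$float_abi" in
        hard|soft) ;;
        *) echo "unknown float ABI: $float_abi" >&2; return 1 ;;
    esac
    echo "-march=armv6j -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=$float_abi"
}

# Show the options for a hard-float distribution.
rpi_gcc_flags hard
```

Seeing the options side by side makes it easier to look each one up in the GCC manual, which is the point of the exercise; the toolchain itself picks these values up from the crosstool-NG configuration.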
The last option for this section is “CFLAGS.” A lot of people get carried away with adding as many CFLAGS as they
can find in the GCC manual but this is not a good idea. The thing with CFLAGS is that, depending on your workload
type, they may make your code run much more slowly. On the flip side, if you can see why a certain CFLAG may help
your workload, then by all means do use it. Once again you need to read the manuals and really understand your
workload to make CFLAGS of any use. Please don't just add random CFLAGS. With that warning in mind I will heed
my own advice and not add any. In Figure 6-21 you can see my final options. Once you're done, exit and save because
there is no need to edit anything else in crosstool-NG.
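If you want to double-check what was saved, a quick look at the .config file in your working directory can help. The sketch below is only an illustration: show_target_options is a hypothetical helper, and the CT_ARCH_* and CT_TARGET_CFLAGS variable names are what I would expect from crosstool-NG 1.x, so treat them as an assumption and adjust the pattern if your version names them differently.

```shell
# Hypothetical helper: list the target-related settings saved by ct-ng.
# The CT_ARCH_* and CT_TARGET_CFLAGS names are assumed from crosstool-NG 1.x
# and may differ in your version.
show_target_options() {
    config_file="${1:-.config}"
    grep -E '^CT_(ARCH_(ARCH|TUNE|FPU|FLOAT_HW|FLOAT_SW)|TARGET_CFLAGS)=' "$config_file"
}

# Run from the crosstool-NG working directory after saving the menu.
if [ -f .config ]; then
    show_target_options .config || echo "no target options found in .config"
fi
```

If the values shown do not match what you picked in the menu, go back into the menu and save again before building.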
 
 