the CPU's load. When run queues are initialized, their cpu_load values are set to
zero and updated periodically afterward. The number of runnable tasks on each run
queue is held in the nr_running variable. The current run queue's cpu_load
variable is set to roughly the average of the current load and the previous
load using the statement shown below:
$$\text{cpu\_load} = (\text{cpu\_load} + \text{nr\_running} \times 128) / 2$$
The constant 128 increases the resolution of the load calculation by turning the
task count into a fixed-point number. Because each update halves the weight of
the previous value, the cpu_load variable accumulates the recent load history.
Load balancing is performed at appropriate times. The load balancer looks for the
busiest CPU. If the busiest CPU is the current CPU, the balancer does nothing,
because the current CPU has no spare capacity to take on more tasks. If the load
of the current CPU is less than the average, and the difference between the loads
of the two CPUs exceeds a certain threshold, the current CPU pulls a number of
tasks from the busiest CPU. The number of tasks pulled is the smaller of two
quantities: the difference between the busiest load and the average load of the
four CPUs, and the difference between the average load of the four CPUs and the
current load [11].
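
The update rule and the pull calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the kernel's code: the function names, the threshold value, and the reading of "the two CPUs" as the busiest and the current CPU are assumptions made for the example. Loads are kept in the same fixed-point units as cpu_load (one runnable task = 128).

```python
SCALE = 128  # fixed-point scale factor quoted in the text


def update_cpu_load(cpu_load, nr_running):
    """Average the previous load with the current scaled task count."""
    return (cpu_load + nr_running * SCALE) // 2


def load_to_pull(loads, this_cpu, threshold):
    """Return the load (in fixed-point units) the current CPU would pull.

    `threshold` is a hypothetical tuning value; the text only says that
    "a certain threshold" must be exceeded.
    """
    busiest = max(range(len(loads)), key=lambda i: loads[i])
    if busiest == this_cpu:
        return 0  # the current CPU is itself the busiest: nothing to do
    average = sum(loads) // len(loads)
    current = loads[this_cpu]
    if current >= average:
        return 0  # only a CPU below the average pulls work
    if loads[busiest] - current <= threshold:
        return 0  # imbalance too small to be worth moving tasks
    # Smaller of (busiest - average) and (average - current).
    return min(loads[busiest] - average, average - current)


if __name__ == "__main__":
    load = 0
    for _ in range(4):
        load = update_cpu_load(load, 2)
        print(load)  # 128, 192, 224, 240: approaches 2 * 128
```

With two runnable tasks the value climbs toward 256 without jumping there at once, which is the history-accumulating behavior the text describes. Dividing the result of `load_to_pull` by 128 gives a task count under the same assumptions.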
The purpose of the first application program is to visualize the load balancing
mechanism of Linux. The application program shows that the number of processes
is evened out across the four CPU cores on the RP-1 chip.
6.3.1.2 Design and Implementation
When the application creates several processes, they will be distributed to the four
CPU cores according to the load balancing mechanism of the Linux kernel. This
mechanism should work effectively when the number of processes is both increasing
and decreasing.
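
One way a user-space program on Linux might check where a process was last scheduled is the processor field (field 39) of /proc/<pid>/stat. The sketch below is an assumption made for illustration; the text does not say how the RP-1 application determines the CPU of each process, and the helper names are hypothetical.

```python
def last_cpu_from_stat(stat_line):
    """Extract the 'processor' field (field 39) from a /proc/<pid>/stat line.

    The command name (field 2) is wrapped in parentheses and may contain
    spaces, so the line is split after the last ')'.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 is rest[36].
    return int(rest[36])


def last_cpu(pid="self"):
    """Read the CPU number a process last ran on (Linux only)."""
    with open("/proc/{}/stat".format(pid)) as stat_file:
        return last_cpu_from_stat(stat_file.read())
```

Polling `last_cpu(pid)` for each child process would show the processes spreading over CPU #0 to CPU #3 as the balancer migrates them.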
A system diagram of the RP-1 application is shown in Fig. 6.21, and the software
architecture of the RP-1 application is shown in Fig. 6.22. The display unit (“DU”
hereafter) on the RP-1 chip is used for visualization. The DU converts the contents
of a frame buffer located in the main memory into a video signal. The size of the
display is fixed to VGA, 640 × 480 pixels. The display is divided into four sections,
which are assigned exclusively to CPU #0, CPU #1, CPU #2, and CPU #3, as shown in
Fig. 6.23. The frame buffer can be located at an arbitrary address. If the system
has a dedicated memory area for the frame buffer, the DU driver uses the virtual
address obtained by mapping that area with the ioremap() function of Linux. In this
system, the DU driver allocates the frame buffer in the main memory, DRAM, using
the dma_alloc_coherent() function of Linux. This function allocates one or more
physical pages that can be written or read by the processor or a device without
worrying about cache effects, and returns a virtual address. Finally, the frame
buffer of plane 0 of the DU can be accessed by a user program as a file, “/dev/fb0.”
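
To make the four-section layout concrete, the sketch below computes the rectangle for each CPU and fills it in a byte buffer standing in for the frame buffer. Several details are assumptions, since the text does not give them: the sections are taken to form a 2 × 2 grid (the actual layout is in Fig. 6.23), pixels are taken to be 16 bits wide, and rows are taken to be packed with no padding. The function names are hypothetical.

```python
WIDTH, HEIGHT = 640, 480   # VGA, as stated in the text
BYTES_PER_PIXEL = 2        # assumption: 16 bits per pixel


def section_rect(cpu):
    """Return (x0, y0, width, height) of the section for CPU 0..3.

    Assumes a 2 x 2 grid: CPU #0 top-left, #1 top-right,
    #2 bottom-left, #3 bottom-right.
    """
    if not 0 <= cpu <= 3:
        raise ValueError("cpu must be in 0..3")
    w, h = WIDTH // 2, HEIGHT // 2
    return ((cpu % 2) * w, (cpu // 2) * h, w, h)


def fill_section(fb, cpu, pixel):
    """Fill one CPU's section of the buffer `fb` with a single pixel value.

    `fb` is any writable byte buffer of WIDTH * HEIGHT * BYTES_PER_PIXEL
    bytes; `pixel` is a bytes object of length BYTES_PER_PIXEL.
    """
    x0, y0, w, h = section_rect(cpu)
    stride = WIDTH * BYTES_PER_PIXEL   # bytes per display row
    row = pixel * w                    # one row of the section
    for y in range(y0, y0 + h):
        start = y * stride + x0 * BYTES_PER_PIXEL
        fb[start:start + len(row)] = row


if __name__ == "__main__":
    fb = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)
    fill_section(fb, 3, b"\xff\xff")
    print(section_rect(3))
```

On a real target, `fb` could instead be an `mmap.mmap` object over “/dev/fb0”, provided the device's actual pixel format and row stride match the assumptions above.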
The application program creates some processes. One process shows a bitmap
image of a penguin on the display. When a penguin process is assigned to CPU #3,