Briefly explain the major enhancements in ARM processors.
Solution
ARM:
• ARM stands for Advanced RISC Machine; it is based on RISC design principles.
• It has high code density, low power consumption, and a small silicon area.
• It is a load-store architecture: data processing is performed on registers and never directly on memory. This gives a good ratio of speed to power consumption.
RISC features
Instructions:
RISC processors have fewer instructions than CISC processors. The compiler or programmer synthesizes complicated operations by combining several simple instructions. Each instruction has a fixed length, which allows the pipeline to fetch future instructions before decoding the current instruction.
Pipeline:
The processing of instructions is broken down into smaller stages that operate in parallel in a pipeline. Ideally the pipeline advances by one step on each cycle for maximum throughput. Instructions can be decoded in one pipeline stage. There is no need for an instruction to be executed by a mini-program called microcode, as on CISC processors.
Fixed number of instruction cycles: most instructions execute in a single cycle.
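The throughput benefit of pipelining can be illustrated with a small calculation. The following is a minimal sketch, assuming an ideal pipeline with no stalls or branches and with every stage taking exactly one cycle; the function names are hypothetical and chosen for illustration only.

```python
def ideal_pipeline_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed to finish a stall-free instruction stream on an ideal pipeline."""
    if num_instructions <= 0:
        return 0
    # The first instruction needs num_stages cycles to pass through the pipeline;
    # every later instruction completes one cycle after the previous one.
    return num_stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed if each instruction finishes all stages before the next starts."""
    if num_instructions <= 0:
        return 0
    return num_instructions * num_stages


if __name__ == "__main__":
    # Ten instructions on a three-stage (fetch, decode, execute) pipeline.
    print(ideal_pipeline_cycles(10, 3), unpipelined_cycles(10, 3))
```

As the instruction count grows, the pipelined figure approaches one instruction per cycle, which is the "one step on each cycle" ideal described above.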
Registers:
RISC processors have a large number of general-purpose registers, while CISC processors have special-purpose registers. In a RISC processor any register can contain either data or an address. Registers act as the fast local memory store for all data processing operations.
Load-store architecture: The processor operates on data held in registers. Separate load and store instructions transfer data between the register bank and external memory. Memory accesses are costly, so separating memory accesses from data processing provides an advantage: data items held in the register bank can be used multiple times without needing multiple memory accesses. In contrast, with a CISC design the data processing operations can act on memory directly.
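The load-store idea can be sketched with a toy register machine. This is an illustrative model only, not a description of any real ARM core: the class `LoadStoreMachine`, its method names, and the memory addresses are all assumptions made for the example. It computes x*x + x with one load and one store, reusing the loaded value from a register.

```python
class LoadStoreMachine:
    """Toy model: only ldr/str_ touch memory; arithmetic works on registers."""

    def __init__(self, memory):
        self.memory = dict(memory)   # address -> 32-bit word
        self.regs = [0] * 16         # r0..r15
        self.memory_accesses = 0

    def ldr(self, rd, addr):
        """Load a word from memory into register rd."""
        self.regs[rd] = self.memory[addr]
        self.memory_accesses += 1

    def str_(self, rs, addr):
        """Store register rs to memory."""
        self.memory[addr] = self.regs[rs]
        self.memory_accesses += 1

    def add(self, rd, rn, rm):
        self.regs[rd] = (self.regs[rn] + self.regs[rm]) & 0xFFFFFFFF

    def mul(self, rd, rn, rm):
        self.regs[rd] = (self.regs[rn] * self.regs[rm]) & 0xFFFFFFFF


def square_plus_value(x):
    """Compute x*x + x, returning (result, number of memory accesses)."""
    m = LoadStoreMachine({0x100: x, 0x104: 0})
    m.ldr(0, 0x100)     # r0 = x (the only load)
    m.mul(1, 0, 0)      # r1 = r0 * r0, x reused from the register
    m.add(1, 1, 0)      # r1 = r1 + r0, x reused again
    m.str_(1, 0x104)    # write the result back (the only store)
    return m.memory[0x104], m.memory_accesses
```

The value x is read from memory once and then used three times from the register bank, which is the saving the paragraph above describes.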
ARM feature improvements over RISC
Variable cycle execution for certain instructions: Not every ARM instruction executes in a single cycle. For example, load-store-multiple instructions vary in the number of execution cycles depending upon the number of registers being transferred. The transfer can occur on sequential memory addresses, which increases performance since sequential memory accesses are often faster than random accesses.
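The sequential-address behaviour of a load-multiple can be modelled in a few lines. The sketch below imitates the effect of `LDMIA` (load multiple, increment after) on a word-addressed dictionary; the function name `ldmia` and the memory layout are assumptions for illustration, and cycle timing is deliberately not modelled because it depends on the core.

```python
def ldmia(memory, base, reg_list):
    """Model LDMIA base!, {reg_list}.

    Registers are loaded from consecutive word addresses starting at base;
    the lowest-numbered register receives the word at the lowest address.
    Returns (loaded registers, written-back base address).
    """
    loaded = {}
    addr = base
    for reg in sorted(reg_list):
        loaded[reg] = memory[addr]
        addr += 4                     # next sequential 32-bit word
    return loaded, addr


if __name__ == "__main__":
    mem = {0x8000: 1, 0x8004: 2, 0x8008: 3}
    regs, new_base = ldmia(mem, 0x8000, [3, 1, 2])
    print(regs, hex(new_base))
```

The number of memory transfers equals the number of registers in the list, which is why the instruction cannot have a fixed cycle count.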
Inline barrel shifter leading to more complex instructions: The inline barrel shifter is a hardware component that preprocesses one of the input registers before it is used by an instruction. This expands the capability of many instructions, improving core performance and code density.
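To make the barrel shifter concrete, the following sketch models the four basic shift operations on 32-bit values and shows how a shifted operand folds a multiply into a single add, as in `ADD r0, r1, r1, LSL #2` (r0 = r1 * 5). It is a simplified model that assumes immediate shift amounts from 0 to 31; the special encodings of the real instruction set are not covered, and the helper `add_shifted` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right (zero fill)."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right (sign fill)."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32              # reinterpret as a negative number
    return (value >> amount) & MASK32


def ror(value, amount):
    """Rotate right."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def add_shifted(rn, rm, shift, amount):
    """Model ADD rd, rn, rm, <shift> #amount: rm passes through the shifter first."""
    return (rn + shift(rm, amount)) & MASK32


if __name__ == "__main__":
    # ADD r0, r1, r1, LSL #2 with r1 = 7 computes 7 + (7 << 2).
    print(add_shifted(7, 7, lsl, 2))
```

Because the shift happens on the operand path of the same instruction, no separate shift or multiply instruction is needed.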
Thumb 16-bit instruction set: ARM enhanced the processor core by adding a second, 16-bit instruction set called Thumb. With Thumb, the ARM core can execute either 16-bit or 32-bit instructions. The 16-bit instructions improve code density by about 30 percent compared with the fixed-length 32-bit instructions.
Conditional execution: An instruction is executed only when a specific condition has been satisfied. This feature improves performance and code density by reducing the number of branch instructions.
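A common illustration of conditional execution is the subtraction form of Euclid's GCD algorithm, which needs only one branch per loop iteration when written as `CMP`, `SUBGT`, `SUBLT`, `BNE`. The sketch below models the condition flags and a handful of condition codes in Python; it assumes positive 32-bit inputs, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, rm):
    """Return the (N, Z, C, V) flags set by CMP rn, rm, which computes rn - rm."""
    rn &= MASK32
    rm &= MASK32
    result = (rn - rm) & MASK32
    n = result >> 31                              # sign bit of the result
    z = int(result == 0)
    c = int(rn >= rm)                             # carry set means no borrow
    v = ((rn ^ rm) & (rn ^ result)) >> 31         # signed overflow
    return n, z, c, v


def condition_passed(cond, flags):
    """Evaluate a small subset of the ARM condition codes."""
    n, z, c, v = flags
    return {
        "EQ": z == 1,
        "NE": z == 0,
        "GT": z == 0 and n == v,
        "LT": n != v,
        "AL": True,
    }[cond]


def gcd_arm_style(r0, r1):
    """loop: CMP r0, r1 / SUBGT r0, r0, r1 / SUBLT r1, r1, r0 / BNE loop"""
    while True:
        flags = cmp_flags(r0, r1)                 # CMP   r0, r1
        if condition_passed("GT", flags):
            r0 = (r0 - r1) & MASK32               # SUBGT r0, r0, r1
        if condition_passed("LT", flags):
            r1 = (r1 - r0) & MASK32               # SUBLT r1, r1, r0
        if not condition_passed("NE", flags):     # BNE   loop
            return r0
```

Both subtract instructions sit in the instruction stream on every iteration, but only the one whose condition matches the flags from `CMP` changes a register, so no branch is needed to choose between them.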
Enhanced instructions: The enhanced digital signal processor (DSP) instructions were added to the standard ARM instruction set to support fast 16 x 16-bit multiply operations and saturation. In some cases these instructions allow a faster ARM processor to replace the traditional combination of a processor plus a DSP. The simplified design of ARM processors also enables more efficient multi-core processing and easier coding for developers.
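The two ideas named above, saturation and 16 x 16-bit multiplication, can be sketched as follows. The functions model the arithmetic result of the ARMv5TE instructions `QADD` (saturating signed add) and `SMULBB` (signed multiply of the bottom halfwords); this is a simplified model that ignores the sticky Q flag, and the helper `to_signed16` is a name introduced only for this example.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def qadd(a, b):
    """Saturating signed 32-bit add: clamp instead of wrapping around."""
    total = a + b
    return max(INT32_MIN, min(INT32_MAX, total))


def to_signed16(value):
    """Interpret the low 16 bits of value as a signed halfword."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def smulbb(rm, rs):
    """Signed multiply of the bottom 16 bits of rm and rs (fits in 32 bits)."""
    return to_signed16(rm) * to_signed16(rs)


if __name__ == "__main__":
    print(qadd(INT32_MAX, 1))            # clamps at the maximum
    print(smulbb(0x0001FFFF, 0x0003))    # bottom halfword 0xFFFF is -1
```

Saturation matters in signal processing because a clipped sample is usually far less harmful than one that wraps from a large positive value to a large negative one.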
ARM defines three architecture profiles: the A profile, the M profile, and the R profile. The features of several ARM cores are shown in the following table.
| CPU Core | MMU/MPU | Cache | Jazelle | Thumb | ISA | E extension |
| --- | --- | --- | --- | --- | --- | --- |
| ARM7TDMI | none | none | no | yes | v4T | no |
| ARM7EJ-S | none | none | yes | yes | v5TEJ | yes |
| ARM720T | MMU | unified, 8K cache | no | yes | v4T | no |
| ARM920T | MMU | separate, 16K/16K D + I cache | no | yes | v4T | no |
| ARM922T | MMU | separate, 8K/8K D + I cache | no | yes | v4T | no |
| ARM926EJ-S | MMU | separate, cache and TCMs configurable | yes | yes | v5TEJ | yes |
| ARM940T | MPU | separate, 4K/4K D + I cache | no | yes | v4T | no |
| ARM946E-S | MPU | separate, cache and TCMs configurable | no | yes | v5TE | yes |
| ARM966E-S | none | separate, TCMs configurable | no | yes | v5TE | yes |
| ARM1020E | MMU | separate, 32K/32K D + I cache | no | yes | v5TE | yes |
| ARM1022E | MMU | separate, 16K/16K D + I cache | no | yes | v5TE | yes |
| ARM1026EJ-S | MMU and MPU | separate, cache and TCMs configurable | yes | yes | v5TEJ | yes |
| ARM1136J-S | MMU | separate, cache and TCMs configurable | yes | yes | v6 | yes |
| ARM1136JF-S | MMU | separate, cache and TCMs configurable | yes | yes | v6 | yes |




