102 Cards in this Set
- Front
- Back
combinational datapath elements |
datapath elements that operate on data values; their outputs depend only on the current inputs |
|
state elements |
datapath elements that contain state, meaning they have internal storage; a state element has at least two inputs (the data value to be written and the clock) and one output, which provides the value written in an earlier clock cycle |
|
edge-triggered clocking |
a clocking scheme in which all state changes occur on a clock edge |
|
clocking methodology |
the approach used to determine when data is valid and stable relative to the clock |
|
control signal |
a signal used for multiplexor selection or for directing the operation of a functional unit |
|
datapath element |
a unit used to operate on or hold data within a processor - in the MIPS implementation, the datapath elements include the instruction and data memories, the register file, the ALU, and adders |
|
register file |
a state element that consists of a set of registers that can be read and written by supplying a register number to be accessed |
|
single cycle CPU |
an implementation in which every instruction executes in one clock cycle on a single shared datapath; the clock cycle must be long enough to accommodate the slowest instruction |
|
pipelining |
an implementation technique in which multiple instructions are overlapped in execution - running tasks in parallel to execute multiple tasks faster (speed up is ~ equal to the number of pipe stages) |
|
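The speedup claim above can be made concrete with a toy timing model; this is a sketch under idealized assumptions (equal stage latencies, no hazards), and both helper names are made up for illustration:

```python
# Sketch: ideal non-pipelined vs. pipelined execution time, in abstract
# time units where every pipeline stage takes exactly one unit.

def nonpipelined_time(num_instructions, num_stages):
    # Each instruction must finish all stages before the next one starts.
    return num_instructions * num_stages

def pipelined_time(num_instructions, num_stages):
    # The first instruction fills the pipe; each later one finishes
    # one time unit after its predecessor.
    return num_stages + (num_instructions - 1)

n, stages = 1000, 5
speedup = nonpipelined_time(n, stages) / pipelined_time(n, stages)
# speedup is ~4.98 here and approaches 5 (the stage count) as n grows
```

The fill/drain overhead is why the speedup only approaches, and never quite reaches, the number of pipe stages.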
pipeline hazards |
situations when the next instruction cannot execute in the following clock cycle |
|
structural hazard |
when a planned instruction cannot execute in the proper clock cycle because the hardware does not support the combination of instructions that are set to execute |
|
data hazards |
when a planned instruction cannot execute in the proper clock cycle because data that is needed to execute the instruction is not yet available |
|
data forwarding/bypassing |
a method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer visible registers or memory |
|
load-use data hazard |
a specific form of data hazard in which the data being loaded by a load instruction has not yet become available when it is needed by another instruction |
|
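A load-use hazard can be detected mechanically by comparing register numbers; a minimal sketch, assuming a made-up tuple encoding (op, dest, src1, src2) rather than any real instruction format:

```python
# Sketch: detect a load-use hazard, where an instruction reads a register
# that the immediately preceding load instruction writes.

def load_use_hazard(prev, curr):
    op, dest, _, _ = prev
    # Hazard only if the previous instruction is a load and its
    # destination feeds either source of the current instruction.
    return op == "lw" and dest in (curr[2], curr[3])

lw  = ("lw",  "t0", "s0", None)   # t0 = MEM[s0 + offset]
add = ("add", "t1", "t0", "t2")   # uses t0 right away -> needs a stall
sub = ("sub", "t3", "t4", "t5")   # independent of t0 -> no hazard

a = load_use_hazard(lw, add)      # True
b = load_use_hazard(lw, sub)      # False
```

Even with forwarding, the loaded value is not available until after MEM, which is why this particular hazard still costs one stall cycle.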
pipeline stall |
a stall initiated in order to resolve a hazard |
|
control hazard/branch hazard |
when the proper instruction cannot execute in the proper pipeline clock cycle because the instruction that was fetched is not the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected |
|
branch prediction |
a method of resolving a branch hazard that assumes a given outcome for the branch and proceeds from that assumption rather than waiting to ascertain the actual outcome |
|
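One common concrete predictor (an illustrative choice, not necessarily the one this deck's source covers) is the 2-bit saturating counter; a sketch:

```python
# Sketch: a 2-bit saturating-counter branch predictor. Two strengthening
# steps are needed to flip the prediction, so a single anomalous outcome
# in a loop does not change a strongly held prediction.

class TwoBitPredictor:
    def __init__(self):
        self.counter = 0          # 0-1 predict not taken, 2-3 predict taken

    def predict(self):
        return self.counter >= 2  # True means "predict taken"

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

p = TwoBitPredictor()
outcomes = [True, True, True, True, True, False, True, True]  # actual results
hits = 0
for actual in outcomes:
    hits += (p.predict() == actual)
    p.update(actual)
# hits == 5: the single not-taken outcome costs one miss but does not
# flip the predictor out of its "taken" state for long
```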
latency (pipeline) |
the number of stages in a pipeline or the number of stages between two instructions during execution |
|
5 sections of pipelined datapath |
IF, ID, EX, MEM, WB |
|
dual pipeline CPU |
a statically multiple-issue CPU with two pipelines, able to fetch, decode, and execute up to two instructions per clock cycle |
|
branch acceleration/data forwarding |
techniques for reducing pipeline stalls: deciding branches earlier in the pipeline shrinks the branch penalty, while forwarding supplies ALU results directly to dependent instructions instead of waiting for write-back |
|
exception handling |
on an exception, the processor saves the address of the offending instruction (in MIPS, in the EPC), records the cause of the exception, and transfers control to the operating system handler at a predefined address |
|
instruction level parallelism |
the parallelism among instructions; exploited by pipelining and by multiple issue to begin more than one instruction per clock cycle |
|
memory hierarchy |
memory closest to the CPU is smallest, fastest, and most expensive (SRAM) while the furthest is largest, slowest, and least expensive (DRAM then magnetic disk or flash memory) - hierarchy is meant to hide memory access times through the use of spatial and temporal locality |
|
cache |
a small, fast memory that holds copies of recently accessed data from a larger, slower memory, exploiting temporal and spatial locality to reduce average access time |
|
virtual memory |
using more memory than is physically available by means of the paging technique |
|
cache control (finite state machine) |
a simple cache controller can be implemented as a finite state machine with four states - Idle, Compare Tag, Write-Back, and Allocate - described in the cache control cards below |
|
parallel processor cache |
in a shared-memory multiprocessor, each processor's private cache must be kept coherent with the others, typically via a snooping or directory-based cache coherence protocol |
|
multithreading |
executing multiple threads on a processor; hardware multithreading raises utilization by switching to another thread when the current one stalls |
|
cache hit/miss |
a cache hit occurs when the valid bit is set and the tag of the address equals the tag stored in the cache at that index; a miss occurs when the valid bit is not set or the tags do not match |
|
direct-mapped cache (one-way set associative) |
a cache structure in which each memory location is mapped to exactly one location in the cache |
|
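The "exactly one location" mapping works by slicing the address into tag, index, and block-offset fields; a sketch with illustrative (assumed) cache dimensions:

```python
# Sketch: split a byte address into the tag / index / offset fields of a
# direct-mapped cache. The sizes below are illustrative assumptions.
BLOCK_BYTES = 16        # 4 words per block -> 4 offset bits
NUM_SETS    = 64        # 64 cache lines    -> 6 index bits

def split_address(addr):
    offset = addr % BLOCK_BYTES                    # byte within the block
    index  = (addr // BLOCK_BYTES) % NUM_SETS      # which cache line
    tag    = addr // (BLOCK_BYTES * NUM_SETS)      # identifies the block
    return tag, index, offset

fields = split_address(0x12345678)
```

On an access, the index selects the one possible line and the stored tag is compared against the address tag to decide hit or miss.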
set-associative |
a cache that has a fixed number of locations (at least two) where each block can be placed |
|
fully associative |
a cache structure in which a block can be placed in any location in the cache |
|
benefits of virtual memory |
freeing applications from having to manage a shared memory space, increased security due to memory isolation, and being able to conceptually use more memory than might be physically available |
|
translation lookaside buffer |
a special address translation cache that keeps track of recently used translations for use in the near future, exploiting the temporal and spatial locality of the words on a page - improves access performance by relying on locality of reference to the page table |
|
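The TLB's role can be sketched as a small cache sitting in front of the page table; the page-table contents and page size below are toy assumptions:

```python
# Sketch: a TLB as a dict caching recent virtual-page -> physical-page
# translations, falling back to a (toy) page table on a TLB miss.
PAGE_SIZE = 4096

page_table = {0: 7, 1: 3, 2: 9}   # virtual page -> physical page (toy data)
tlb = {}

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                 # TLB hit: no page-table access needed
        ppn = tlb[vpn]
    else:                          # TLB miss: consult page table, cache it
        ppn = page_table[vpn]
        tlb[vpn] = ppn
    return ppn * PAGE_SIZE + offset

pa1 = translate(0x1234)   # miss: walks the page table, fills the TLB
pa2 = translate(0x1238)   # same page: TLB hit
```

Because consecutive words usually fall on the same page, one cached translation serves many subsequent accesses, which is exactly the locality argument on this card.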
cache control idle state |
waits for a valid read or write request from the processor, which moves the FSM to the compare tag state |
|
cache control compare tag state |
tests to see if the read or write is a hit or miss - goes to idle state if hit and block is valid, goes to write back state if miss and dirty bit is 1, or goes to allocate state if miss and dirty bit is 0 |
|
cache control write back state |
writes the 128-bit block to memory using the address composed from the tag and cache index - remains here until it receives the ready signal from memory, then goes to the allocate state |
|
cache control allocate state |
fetches new block from memory and waits for ready signal from memory then goes to the compare tag state |
|
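The four cache-control states above can be summarized as a next-state function; a sketch (the signal names are assumptions standing in for the real control wires):

```python
# Sketch: next-state logic for the four-state cache controller FSM
# described in the preceding cards.

def next_state(state, request=False, hit=False, dirty=False, mem_ready=False):
    if state == "idle":
        # A valid read/write request moves the FSM to tag comparison.
        return "compare_tag" if request else "idle"
    if state == "compare_tag":
        if hit:
            return "idle"
        # Miss: a dirty victim must be written back before allocating.
        return "write_back" if dirty else "allocate"
    if state == "write_back":
        # Wait for memory, then fetch the new block.
        return "allocate" if mem_ready else "write_back"
    if state == "allocate":
        # Wait for memory, then re-check the tag with the new block in place.
        return "compare_tag" if mem_ready else "allocate"
    raise ValueError(f"unknown state: {state}")
```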
hardware multithreading |
allows multiple threads to share the functional units of a single processor in an overlapping fashion by duplicating the independent state of each thread |
|
coarse-grained multithreading |
switches threads only on costly stalls (like 2nd level cache misses) - less likely to slow down the execution of an individual thread, but limited in ability to overcome throughput losses - most useful for reducing the penalty of high-cost stalls |
|
fine-grained multithreading |
switches between threads on each instruction - results in interleaved execution of multiple threads - processor switches threads every clock cycle on a round-robin basis skipping any that are stalled at that time - can hide throughput losses that arise from short and long stalls but also slows down execution of individual threads |
|
simultaneous multithreading (SMT) |
variation of hardware multithreading that uses a multiple-issue, dynamically scheduled processor to exploit thread-level parallelism at the same time it exploits instruction-level parallelism - uses register renaming and dynamic scheduling to issue multiple instructions without regard to the dependences among them |
|
Amdahl's Law |
a rule stating that the performance enhancement possible with a given improvement is limited by the amount that the improved feature is used - quantitative version of diminishing marginal returns |
|
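Amdahl's Law is easy to make concrete; a sketch (the 80%/10x numbers are just an example):

```python
# Sketch: Amdahl's Law. Overall speedup is limited by the fraction of
# execution time the improvement actually affects.

def amdahl_speedup(fraction_enhanced, factor):
    return 1.0 / ((1 - fraction_enhanced) + fraction_enhanced / factor)

# Speeding up 80% of a program by 10x yields well under 10x overall:
s = amdahl_speedup(0.8, 10)        # 1 / 0.28, about 3.57
limit = amdahl_speedup(0.8, 1e9)   # approaches 1 / 0.2 = 5, the ceiling
```

The unimproved 20% caps the speedup at 5x no matter how large the improvement factor gets, which is the "diminishing returns" reading of the law.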
SISD |
single instruction single data stream - conventional uniprocessor |
|
MIMD |
multiple instruction multiple data stream - conventional multiprocessor |
|
SIMD |
single instruction, multiple data streams - the same instruction operates on multiple data items in parallel; all the parallel execution units are synchronized and respond to a single instruction that emanates from a single PC |
|
memory access latency |
the time it takes to access a specific type of memory - different types require different amounts of time |
|
memory hierarchy |
a structure that uses multiple levels of memories - as the distance from the processor increases, the size of the memories and access times both increase |
|
spatial locality |
the concept that the likelihood of referencing a resource is higher if a resource near it has just been referenced |
|
temporal locality |
the concept that a resource that is referenced at one point in time will be referenced again sometime in the near future |
|
hit time |
the time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or miss |
|
miss penalty |
the time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the other, insert it in the level that experienced the miss, and then pass the block to the requestor |
|
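Hit time and miss penalty combine into the standard average memory access time (AMAT) formula; the cycle counts below are illustrative assumptions:

```python
# Sketch: average memory access time for one level of the hierarchy.
# AMAT = hit time + miss rate * miss penalty.

def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# 1-cycle hit, 5% miss rate, 100-cycle miss penalty -> 6 cycles on average
t = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)
```

Even a small miss rate dominates the average when the penalty is large, which motivates both lower miss rates (associativity, bigger caches) and lower penalties (multilevel caches).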
direct-mapped cache |
a cache structure in which each memory location is mapped to exactly one location in the cache |
|
tag |
a field in a table used for a memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word |
|
valid bit |
a field in the tables of a memory hierarchy that indicates that the associated block in the hierarchy contains the valid data |
|
cache miss |
a request for data from the cache that cannot be filled because the data is not present in the cache |
|
write-through |
a scheme in which writes always update both the cache and the next lower level of the memory hierarchy, ensuring that data is always consistent between the two |
|
write buffer |
a queue that holds data while the data is waiting to be written to memory |
|
write-back |
a scheme that handles writes by updating values only to the block in the cache, then writing the modified block to the lower level of the hierarchy when the block is replaced |
|
LRU |
least recently used - a replacement scheme in which the block replaced is the one that has been unused for the longest time |
|
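LRU replacement can be sketched with an ordered dictionary that tracks recency; a toy fully associative cache (the block names are arbitrary):

```python
# Sketch: LRU replacement for a small fully associative cache. The
# OrderedDict keeps the least recently used block at the front.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)   # hit: now most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        self.blocks[block] = True            # bring the block in
        return False

c = LRUCache(2)
hits = [c.access(b) for b in ["A", "B", "A", "C", "B"]]
# A miss, B miss, A hit, C miss (evicts B, the LRU block), B miss
```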
multilevel cache |
a memory hierarchy with multiple levels of caches, rather than just a cache and main memory |
|
global miss rate |
the fraction of references that miss in all levels of a multilevel cache |
|
local miss rate |
the fraction of references to one level of a cache that miss - used in multilevel hierarchies |
|
virtual memory |
a technique that uses main memory as a "cache" for secondary storage |
|
physical address |
an address in main memory |
|
page fault |
an event that occurs when an accessed page is not present in main memory |
|
virtual address |
an address that corresponds to a location in virtual space and is translated by address mapping to a physical address when memory is accessed |
|
address translation/address mapping |
the process by which a virtual address is mapped to an address used to access memory |
|
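Address mapping splits the virtual address into a virtual page number and a page offset; a sketch with a toy page table (a missing entry stands in for a page fault):

```python
# Sketch: virtual-to-physical address translation with a toy page table.
# The page size and table contents are illustrative assumptions.
PAGE_SIZE = 4096

page_table = {0: 5, 2: 1}  # virtual page number -> physical page number

def va_to_pa(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # In real hardware this would trap to the OS page fault handler.
        raise KeyError(f"page fault on virtual page {vpn}")
    return page_table[vpn] * PAGE_SIZE + offset

pa = va_to_pa(0x2010)   # vpn 2 maps to ppn 1; the offset passes through
```

The offset is never translated: only the page number changes, which is what makes page-sized mapping granularity work.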
segmentation |
a variable-size address mapping scheme in which an address consists of 2 parts: a segment number, which is mapped to a physical address, and a segment offset |
|
page table |
the table containing the virtual-to-physical address translations in a virtual memory system - it is stored in memory and typically indexed by the virtual page number - each entry contains the physical page number for the virtual page if the page is currently in memory |
|
reference bit/use bit |
a field that is set whenever a page is accessed and that is used to implement LRU or other replacement schemes |
|
translation-lookaside buffer (TLB) |
a cache that keeps track of recently used address mappings to try to avoid an access to the page table |
|
virtually addressed cache |
a cache that is accessed with a virtual address rather than a physical address |
|
aliasing |
a situation in which the same object is accessed by two addresses - can occur in virtual memory when there are two virtual addresses for the same physical page |
|
physically addressed cache |
a cache that is addressed by a physical address |
|
supervisor mode/kernel mode |
a mode indicating that a running process is an operating system process |
|
system call |
a special instruction that transfers control from user mode to a dedicated location in supervisor code space, invoking the exception mechanism in the process |
|
context switch |
a changing of the internal state of the processor to allow a different process to use the processor that includes saving the state needed to return to the currently executing process |
|
exception enable/interrupt enable |
a signal or action that controls whether the processor responds to an exception or not - necessary for preventing the occurrence of exceptions during intervals before the processor has safely saved the state needed to restart |
|
restartable instruction |
an instruction that can resume execution after an exception is resolved without the exception's affecting the result of the instruction |
|
handler |
name of a software routine invoked to handle an exception or interrupt |
|
unmapped |
a portion of the address space that cannot have page faults |
|
3 C's model |
a cache model in which all cache misses are classified into one of 3 categories: compulsory misses, capacity misses, and conflict misses |
|
compulsory miss/cold-start miss |
a cache miss caused by the first access to a block that has never been in the cache |
|
capacity miss |
a cache miss that occurs because the cache, even with full associativity, cannot contain all the blocks needed to satisfy the request |
|
conflict miss/collision miss |
a cache miss that occurs in a set-associative or direct-mapped cache when multiple blocks compete for the same set and that would be eliminated in a fully associative cache of the same size |
|
hardware multithreading |
increasing utilization of a processor by switching to another thread when one thread is stalled |
|
fine-grained multithreading |
a version of hardware multithreading that suggests switching between threads after every instruction |
|
coarse-grained multithreading |
a version of hardware multithreading that suggests switching between threads only after significant events, such as a cache miss |
|
simultaneous multithreading (SMT) |
a version of multithreading that lowers the cost of multithreading by utilizing the resources needed for a multiple-issue, dynamically scheduled microarchitecture |
|
fully associative |
a cache structure in which a block can be placed in any location in the cache |
|
set associative |
a cache that has a fixed number of locations (at least two) where each block can be placed |
|
Amdahl's law |
a rule stating that the performance enhancement possible with a given improvement is limited by the amount that the improved feature is used - it is a quantitative version of the law of diminishing returns |
|
base/displacement addressing |
an addressing mode in which the operand is at the memory location whose address is the sum of a register (the base) and a constant (the displacement) given in the instruction |
|
instruction set architecture |
an abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to write a machine-language program that will run correctly |
|
CPI for multicycle CPU |
the average clock cycles per instruction, computed as a sum over instruction classes weighted by their frequency, since each class (e.g., loads vs. ALU operations) takes a different number of cycles |
|
CPI for pipeline CPU |
ideally 1, since one instruction completes every clock cycle once the pipeline is full; the actual CPI is 1 plus the average number of stall cycles per instruction |
|
loop unrolling |
a technique to get more performance from loops that access arrays, in which multiple copies of the loop body are made and instructions from different iterations are scheduled together |
|
register renaming |
the renaming of registers by the compiler or hardware to remove antidependences (name dependences) between instructions |
|
dual pipeline |
two parallel pipelines that allow up to two instructions to be issued and executed per clock cycle (static dual issue) |
|
message passing multi-processor |
a multiprocessor in which processors have private memories and communicate by explicitly sending and receiving messages rather than through shared memory |