• Shuffle
    Toggle On
    Toggle Off
  • Alphabetize
    Toggle On
    Toggle Off
  • Front First
    Toggle On
    Toggle Off
  • Both Sides
    Toggle On
    Toggle Off
  • Read
    Toggle On
    Toggle Off
Reading...
Front

Card Range To Study

through

image

Play button

image

Play button

image

Progress

1/278

Click to flip

Use LEFT and RIGHT arrow keys to navigate between flashcards;

Use UP and DOWN arrow keys to flip the card;

H to show hint;

A reads text to speech;

278 Cards in this Set

  • Front
  • Back

Von Neumann Architecture

Key concepts:


- Memory holds both programs and data


- Memory is addressed linearly


- instructions executed sequentially (unless a branch or reset occurs)

Little Man Computer

- starts by looking at the counter for a mailbox number X


- the man increments the counter by 1


- the man goes to mailbox X and reads what is written on the slip of paper in the mailbox


- the man takes the appropriate actions depending on those digits


- the man then starts again

Op code 5

LOAD


Go to the mailbox address specified, read the digit number at that address, then go to the calculator and enter that number in

Op code 3

STORE


Go to the calculator and note the 3 digit number displayed, then go to the mailbox address specified and enter that number on a new slip of paper

Op code 1

ADD


Go to the mailbox address specified , read the 3-digit number at the address, then go to the calculator and add the number to the number already on the calculator

Op code 2

SUBTRACT


go to the mailbox address specified, read the 3-digit number at that address, then go to the calculator and subtract the number from the number already on the calculator

Op code 9

INPUT/OUTPUT


INPUT - Op code 9 address 01


Go to the IN tray, read the three digit number there, then go to the calculator and enter the that number in


OUTPUT - Op code 9 address 02


Go to the calculator, read the 3-digit number there, then go to the OUT tray and leave a slip of paper with that number on it

Op code 0

BREAK

Op code 7

BRANCH on ZERO


go to the calculator and read the 3-digit number. If it is zero, set the counter to the 2-digits specified in the address, and start fetch of instruction, otherwise continue with the next instruction as normal

Op code 8

BRANCH on POSITIVE


Go to the calculator and read the number. If it is positive, set the counter to the 2-digits specified in the address, and start fetch if instruction, otherwise continue with the next instruction as normal

Fetch

The man goes to the counter and read the address there. The man goes to the mailbox at the address and reads the 3-digit number in the mailbox.

Execute

In which the man performs the work specified in the instruction

Components of a CPU

-memory


-registers


-ALU


-Buses


-Control Unit

Accumulator

Considered part of the ALU


General purpose register used for:


-holding data


- holding interim and final results of arithmetic operations


- holding data between different memory locations


- holding data waiting to be transferred between I/O and main memory

The program Counter (PC)

Holds the address of the next instruction to the executed (part of the control unit)

Instruction register (IR)

Holds the actual instruction to be executed (part of the control unit)

Flags

1 bit registers used to keep track of special conditions eg arithmetic carry, overflow, power failure.


Internal computer flags are grouped together in one or more status registers (SR) (part of the control unit)

Memory address register (MAR)

Holds the address of a memory location to be accessed. it always written to - never from

Memory data register (MDR)

Holds the value that is being stored to or retrieved from the memory location currently addressed in the MAR. May be written to or from

Buses

The physical connection that makes it possible to transfer data from one location in a system to another is called a bus. Buses are used to:


- transfer data between the different points on the CPU


- transfer data between the CPU and the main memory


Transfer data between computer peripherals and the CPU

Point-to-point bus

Bus carries the signal from a specific source to a specific destination eg home PC to printer

Broadcast Bus

Bus used to carry signals to many different devices

Bus interface bridges

Allow communications between the different buses

Signed magnitude representation

Add a single-bit flag : 0 for positive or 1 for negative


- have two values for 0: 10000000 and 00000000


-makes binary arithmetic messy

Ones-complement

- the negative number is represented by flipping each bit


- the higher order bit still indicates the sign of the number


- still has who representations for zero: 00000000 and 11111111


Makes binary addition a bit simpler

Twos-complement

-A negative number is obtained by flipping each bit and adding


- the higher order still in dictates the sign of the number


-one representation for zero


-makes binary arithmetic much simpler


Add a bias

- for a k-bit number as a bias of 2k-1-1, then store in normal binary.


-can store numbers between -(2k-1-1) and 2k-1 (-127 and 128)


- the higher bit does not indicate the sign of the number

Floating point representation

Has three fields:


- the sign bit


- the exponent e


- the mantissa M (also called the significand)

The sign bit

0 indicates positive 1 indicates negative

The exponent e

Value in the range -126 to 127 stored with a bias: 127 is added giving a number between 1 and 254


0 and 255 have special meanings:


- exponent field with 0 with mantissa 0 gives the number 0


- exponent field 0 with non-zero mantissa : "subnormal numbers"


- exponent field 255 with mantissa 0 gives + or - infinity


- exponent field 255 with non zero mantissa: not a number

The mantissa M

Always scaled so that the radix point is after the leading 1. Hence we need not store the leading 1.

Transistors

A transistor is just an electronically controlled switch.


2 ports (s and d) are connected depending on the voltage of a third(g)


D - drain


S - Source


G - gate

MOSFET

Metal-Oxide-Semiconductor Field Effect Transistor


Silicon is called a semiconductor because it's conductivity changes over many orders of magnitude depending on small changes in levels of dopants

P-type

Electron holes - overall positive charge, called p-type


The dopant of p-type silicon is boron. Is short an electron so leaves an 'electron hole' that can move around the lattice

N-Type

Extra electrons - overall negative charge, called n-type


The dopant of n-type silicon is arsenic. This is an extra electron no involved with the lattice bonds

A Diode

A junction between a p-type and n-type. Current can only flow from p-type to n-type. Near the junction the holes in the p-type and electrons in the n-type cancel each other leaving a 'depletion' region where there are no mobile charge carriers. If a positive voltage is applied to the p-type, and a negative to the n-type, then the holes in the p-type are attracted towards the n-type and the electrons in the n-type are attracted to the p-type and so the depletion region is squeezed out of existence and current flows. If a negative voltage is applied to the p-type, and a positive to the n-type, then the holes in the p-type are repelled by the n-type and so the depletion region is expanded and creates a barrier to current flow.

Capacitor

A capacitor is two pieces of conductive material separated by an insulator.


If a positive voltage is applied to one side it accumulates a charge Q and the other side accumulates the opposite charge -Q. It takes time and energy to charge and discharge a capacitor

Functionally Complete Sets

Every circuit can be constructed from these alone:


- AND, OR and NOT


- A single NAND

Digital Design Principals

Digital design us all about managing the complexity of huge numbers of interacting elements. Some principals help humans do this:


- discipline: restricting design choices to make things easier to model, design and combine


- abstraction : hiding details when they aren't important


The three -y's:


- Hierarchy: dividing a system into modules and submodules


- Modularity: well-defined functions and interfaces for modules


- Regularity: encouraging uniformity to modules to be swapped and reused

Circuits

A circuit has:


- one or more discrete valued input terminals


- one or more discrete valued output terminals


- a specification of the relationship between inputs and outputs


- a specification of the delay between inputs changing and outputs responding

Elements and Nodes

The circuit is made up of elements and Nodes:


- An element is itself a circuit with inputs, outputs and specs


-A node is a wire joining elements, whose voltage conveys a discrete valued variable

Combinational Logic

Combinational logic rules:


- Individual gates are combinational circuits


- every circuit element must be a combinational circuit


-every node is either an input to the circuit or connecting to exactly one output of a circuit element


- the circuit has no cyclic paths - every path through the circuit visits any node at most once

Minterm

A minterm is a product involving all the inputs to a function

Maxterm

A maxterm is a sum involving all the inputs to a function

"Sum of products" form

Every Boolean expression can be written as minterms ORed together

"Product of sums" form

Every Boolean expression can be written as maxterns ANDed together

Decoder

This is what a decoder does:


- N inputs. 2^n outputs


- only one output is high at a time


- which one depends on the input

Multiplexer

A multiplexer has k+2^k inputs and 1 output. The first k inputs (selector) represent a binary number. The output takes the value of the remaining input indexed by this binary number. Consider many memory locations connected to the inputs. Using the selector we can select which is loaded into a register connected to the output

Propagation delay : tpd

The max delay before the output is stable

Contamination delay : tcd

The min delay before the output changes

Reasons why tpd and tcd may be different

- different rising and falling delays


- a circuit might have multiple inputs and outputs, some of which are faster than other


- circuits slow down when hot and speed up when cold

Critical path

In a circuit the critical path is the path determining the propagation delay of the circuit - the longest path in the circuit

Short path

The short path is the path determining the contamination delay of the circuit - the shortest path in the circuit

Glitches

Sometimes the output can temporarily move to an incorrect value before stabilising. This is called a glitch.

Carry lookahead adder

Generate: G(A,B) = 1 if A and B would cause cout even if cin = 0


Propagate: P(A,B) = 1 if A and B would cause cout if cin = 1


G(A,B) = A and B P(A,B) = A or B


Carry occurs if it is either generated or there is Carry in and it is propagated


Cout = G(A,B) + P(A,B).Cin

Combinational circuits

Output depends only on current input (after propagation delay).

Sequential Circuits

– Output depends on history as well as current input – I.e. the circuit has memory.


– Can be modelled as finite-state machines.


– In general hard to analyse.


–Fundamental components: latches and flip-flops, each store 1-bit.

Synchronous sequential circuit

A synchronous sequential circuit consists of interconnected elements such that:


• Every circuit element is either a register or a combinational circuit


• At least one circuit element is a register


• All registers receive the same clock signal


• Every cyclic path contains at least one register. A synchronous sequential circuit has:


• A discrete set of states {S0,…,Sk-1}


• A clock input, whose rising edge indicates when a state change occurs


• A functional specification which details the next state and all outputs for each possible current state and set of inputs.

SR-Latch (NOR latch)

• Both inputs ‘usually’ set to 0.


• If input S (set)has a pulse of 1, the output becomes 1.


• The output remains 1 even when the pulse is over.


• If input R (reset) has a pulse of 1, the output becomes 0.


• The output remains 0 even when the pulse is over.


• If the output is already 1, a pulse on S will not change it. If it is already 0, a pulse on R will not change it.


• Latches can be used to store a single bit of memory. This circuit has two stable states for a given input – it is called bistable.

SR –Latch (NAND LATCH)

• Latches can also be built from NAND gates.


• In this case the ‘usual’ state for the inputs is 1, so the inputs are denoted with bars.

D-latch

In a latch a pulse on set or reset indicates what the newstate should be, and when it should change.


D – data input. Defines what the new value should be.


CLK – clock input. Define when the new value should arise.

D Flip-flop

In the D-latch the output can change whenever the clock is high. Even better is if we can make it change only at the moment the clock goes high. D flip-flop copies D to Q on the rising edge of the clock, and remembers its state at all other times.

Register (flip-flops)

A bank of flip-flops driven by the same clock

Enabled Flip-flop

Incorporates an additional input (enable) to control whether the data is loaded into the register or not.


EN can control the input with a multiplexor, or can control the clock. This is called a gated clock. Gated clocks can cause timing errors and glitches on the clock.

Flip-flop design

How many transistors are needed to build the D flip-flop?


– A NAND or NOR gate uses four transistors.


– A NOT gate uses two transistors.


– An AND gate is built from a NAND and a NOT, so it uses six transistors.


– The SR latch uses two NOR gates, or eight transistors.


– The D latch uses an SR latch, two AND gates, and a NOT gate, or 22 transistors.


– The D flip-flop uses two D latches and a NOT gate, or 46 transistors.

Problem circuits

The problems are caused by outputs being fed back into inputs: the circuits contain loops or cyclic paths. To avoid this we insert registers into cyclic paths:


• the registers contain the ‘state’ of the circuit


• they break the paths


• they only update on a clock edge


• say they are synchronised to the clock If the clock is sufficiently slow, so that all inputs to all the registers have settled before the next clock edge, then race conditions cannot arise.

tsetup

Time before rising edge during which inputs must be stable

thold

Time after rising edge during which inputs must be stable

tccq

time until output starts to change

tpcq

Time by which output has stabilized

Overclocking

Setting the clock cycle fast than the manufacturers recommend.

Time between ticks (Tc)

Must be at least tpcq+tpd+tsetup

Sequencing overhead

(tpcq+ tsetup)

Metastable states

Suppose the input D to a flip-flop is connected to a button. If D is unpressed or pressed when the clock rises, Q will be set to a stable state. If D is in the process of being pressed just when the clock rises, Q may be set to a metastable state. The state will be driven to 1 or 0 eventually, but it may take time.

2-Dimensional Array

• 2-dimensional array of bit cells: each cell stores one bit


• N address bits and M data bits:


– 2N rows and M columns


– Depth: number of rows (number of words)


– Width: number of columns (size of word)


– Array size: depth × width = 2N × M

Wordline

– single row in memory array to be read/written – only one wordline (or address line) HIGH at once

On memory read:

• If the wordline is active (high), the stored bit should drive the bitline value.


• If the wordline is not active (low), the stored bit should not affect the bitline (leave it floating).

Read/ Write line

Determines whether the data will be transferred to the MDR (write) or from the MDR (read) the cell

Read and Write

Read occurs when:


Address line = 1 AND R/W line =1 Bitlines are left floating at the MDR, so take their values (weakly) from the active memory cells in the address line.


Write occurs when:


Address line = 1 AND R/W line = 0 Bitlines are strongly driven by the MDR data,and overpower the weakly held bits in the active memory cells, overwriting the contents

DRAM

Each bit is one transistor and one capacitor. The capacitor is charged or discharged to represent a bit. Higher density, but slower. Capacitors discharge with time so need refreshing (reading and rewriting) every 64ms.

SRAM

6 (or more) transistor flip-flop based memory. Low density but stable and fast. Doesn’t require refreshing but loses memory when powered off. Can be easily incorporated into CPU manufacture

Basic ROM

To read the bitline is weakly set to high The wordline is then activated If the cell contains a 0, the transistor connects the bitline to ground and the bitline goes low. If the bitline contains a 1, the bitline retains the weak high it was initialised with.

Programmable ROM (PROM):

All bit cells have a transistor. By applying sufficient voltage a fuse can be blown to disconnect the transistors from ground in cells meant to contain 1s. FLASH memory uses floating gate transistors instead, which can be electrically activated or deactivated.

Multi-ported memory

Memory that can access more than one address line at once.

MAR(number of bits)

The number of bits in the MAR determines how many different address locations can be decoded. For an MAR of width k bits, the number of possible memory addresses is M = 2^k Typical modern MAR is at least 32 bits wide ,giving a 4 gigabytes memory.

Methods to improve memory access

– Wide Path Memory Access


Retrieve multiple bytes instead of 1 byte at a time


– Memory Interleaving


Partition memory into subsections, each with its own address register and data register


– Cache Memory

Wide Path Memory Access

When calling or storing memory, don’t call up just 1 bit or byte. Most modern CPUs load 8 bytes (64 bits) at a time.


• Requires a wider memory-data bus and larger MDR


• The 8 bytes can be separated and used appropriately in the CPU


- using more circuitry


• Diminishing returns


–widening the path further would involve more complexity in the CPU and extra bytes are less likely to be used often.

Memory Interleaving

• Divide the memory into separate chunks that can be accessed simultaneously.


• Interleave, so consecutive addresses can be accessed simultaneously.


• Useful for parallelisation / pipelining

Cache memory

Idea: Preload some special fast memory with the data you are likely to use soon. What data are you likely to use soon?Typically, data near addresses you have recently accessed.

Temporal Locality

If data used recently, likely to use it again soon How to exploit: keep recently accessed data in higher levels of memory hierarchy

Spatial Locality

If data used recently, likely to use nearby data soon. How to exploit: when access data, bring nearby data into higher levels of memory hierarchy too.

Control Unit

Responsible for directing the flow of instructions and data within the CPU.

CPU Fetch Execute Cycle

1. CPU sends the address in the Program Counter (PC) to theMemory Address Register (MAR)


2. Increment the PC


3. Get the instruction identified by the MAR into the MemoryData Register (MDR)


4. Move the instruction from the MDR to the InstructionRegister


5. Move the instruction from the IR to the Control Unit for decoding – Send the operation to the Arithmetic Logic Unit (ALU) – Put the address of data to be operated upon in a register


6. Send the address for the data from the register to the MAR


7. Read the data from memory into the MDR


8. Move the data from the MDR to an accumulator in the ALU


9. Do the operation and store result in an accumulator in the ALU

Direct addressing

The address read in the instruction is the address of the actual data

Immediate addressing

The address read in the instruction is the data to be used.

Indirect addressing

The address read in the instruction is the address of a memory location containing the address of the actual data

Register Indirect addressing

The address read in the instruction is the address of a register containing the address of the actual data

Indexed addressing

The address read in the instruction should have an index value (contained in someregister) added to obtain the address of the data.

Absolute

The address read is the one the little minion should go to.

Relative addressing

The address read is an offset to the current instruction address.

Base offset addressing

The address read is offset by the current value in a special ‘base’ register.

Multiple Data Instructions

Single Instruction, Multiple Data (SIMD), used in multimedia to perform a single operation on a large amount of data, e.g. pixels in a picture.

MIPS

Microprocessor without Interlocked Pipeline Stages


32-bit RISC processor


• ∼80instructions in the instruction set


• 32 general purpose registers $r0 - $r31


• $r0 is special and always contains the value zero


• The MIPS processor has a super-pipelined architecture – each instruction is broken down into a sequence of ‘micro’ instructions


MIPS is a reduced instruction set computer(RISC), with a small number of simple instructions.

Design principles (MIPS)

Simplicity favors regularity


• Consistent instruction format: Same number of operands (two sources and one destination) is easier to encode and handle in hardware. Make the common case fast


• MIPS includes only simple, commonly used instructions


• Hardware to decode and execute instructions can be simple, small, and fast


• More complex instructions (that are less common) performed using multiple simple instructions


Smaller is Faster


• MIPS includes only a small number of registers


Good design demands good compromises


• Multiple instruction formats allow flexibility


• add, sub: use 3 register operands


• lw, sw: use 2 register operands and a constant


• Number of instruction formats kept small to adhere to design principles 1 and 3 (simplicity favors regularity and smaller is faster).


• Other formats appear in assembler, but are transformed in machine code to fit this format.

Instruction Formats (MIPS)

– R-Type: register operands


– I-Type: immediate operand


– J-Type:for jumping

R-Type Instructions

• 3 register operands:


– rs, rt: source registers


– rd: destination register


• Other fields:


– op: the operation code or opcode (0 for R-type instructions)


– funct: the function, with opcode, tells computer what operation to perform


– shamt: the shift amount for shift instructions, otherwise it’s 0

I-type Instructions

• 3 operands:


– rs, rt: register operands


– imm: 16-bit two’s complement immediate


• Other fields:


– op: the opcode, operation is completely determined by opcode

J-type Instructions

• 26-bit address operand: addr


• Used for jump instructions: j


• Rarely used in assembler


• Typically use the R-type instruction: jr name or jr $s0 (Jump to name, or jump to address contained in register)

How do we address the operands? (MIPS)

• Register Only add $s0, $s1, $s2


• Immediate (16-bit two’s complement integer) addi $s0,$s1, 5 ori $t3, $t7, 0xFF


• Base Addressing address of operand is given by base address + signed immediate lw $s0, 0($sp) sw $s0, -12($t0)


• PC-Relative jump so far from current position beq $t0,$0, else becomes beq $t0, $0, 3

Microarchitecture

How to implement an architecture in hardware


Made up of the:


- Datapath: functional blocks and registers


- Control : control signals

Pipelined Microarchitecture

- Temporal parallelism


- Divide single-cycle processor into 5 stages:


o Fetch


o Decode


o Execute


o Memory


o Writeback


- Add pipelined registers between stages

Further Registers (MIPS)

There are a further 32 registers in the Floating Point Unit


• Each can hold a single precision FP number (32 bits)


• Pairs can be used to store double precision numbers (64bits)


• mul.s$f0,$f1,$f2 single precisionmultiplication


• mul.d $f0,$f2,$f4 double precision – operands are doubles stored in$f2,$f3 and $f4,$f5, result is stored in $f0 and $f1.


There are also the exception and interrupthandling registers

MMIO

Memory Mapped Input/Output Devices connected to the same address bus as main memory Devices are each allocated address space. CPU does not see difference between memory and IO


• Input devices are allocated specific addresss space that can be read like any other memory.


• Output devices can be written to like any other memory.

Exception and Interrupt handling registers

• An exception is anything that causes an unscheduled function call in the middle of a user program.


• Caused by:


• Hardware, also called an interrupt, e.g., keyboard


• Software, also called traps, e.g., undefined instruction


• When exception occurs, the processor:


• Records the cause of the exception • Jumps to exception handler (at instruction address0x80000180)


• Returns to program

Operating System

An operating system is a:


Resource Allocator: Responsible for the management of the computer system resources.


Control Program: Controls the execution of user programs and operation of the input / output (I/O) devices.


Kernel: The one program that runs all the time.

Process

A process is a unit of execution; an abstraction that is used to support the discussion and study of operating systems. Resources needed by the process include: CPU time, memory, files, andI/O devices.The resources may be allocated at the start of a process or as it executes.

A process includes

Code: text section


Current activity, represented by the program counter and the content of the CPU’s registers Data stack: Temporary data such as local variables


Data section: Global variables


Heap:Memory allocated while the process is running

Process Control Block (PCB)

Information about each process is represented by a Process Control Block (PCB), including:


- Unique identifier e.g. PID


- State (∼status)


- CPU utilization


- CPU schedulinginformation e.g. priority


- Memory usage


- Other information: owner, description.

Process State

As a process executes, it changes state (∼status):


New – the process is being created.


Running –instructions are being executed.


Waiting –the process is waiting for some event to occur.


Ready – the process is ready to be dispatched to the CPU.


Terminated – the process has completed its execution, or some other event causing termination.

Process Creation

A new process, as a parent process, can create a number of child processes,which, in turn create other processes, forming a tree of processes.

Resource sharing: three possible cases:

i. The parent and child processes share all resources.


ii. The child process shares a subset of parent’s resources.


- The parent and child processes share no resources.

Execution:two possible cases:

i. The parent and child execute concurrently.


ii. The parent waits until child terminates


The process executes its last statement and asks the operating system to delete it:


- Outputs data from the child’s process to parent


- The child process’s resources are de-allocated by operating system

The parent process may terminate execution of child processes if:

- The child process has exceeded its allocated resources.


- The task assigned to child is no longer required.


- The parent is itself terminating (cascadetermination).

The Kernel

The kernel consists of:


- The first-level interrupt handler (FLIH): to manage interrupts.


- The dispatcher: to switch the CPU between processes.


- Intra operating system communications e.g. via the system bus.

Interrupt

An interrupt is a signal from either hardware or software of an event that will cause a change of process, for example:


- Hardware: Triggers an interrupt by sending asignal to the CPU via the system bus e.g. I/O event (e.g. printing complete).


- Software: Triggers an interrupt by executing asystem call for some action by the operating system (e.g. what time is it?).

First-level interrupt handler

The function of the FLIH is to:


- Determine the source of the interrupt (prioritise).


- Initiate servicing of the interrupt (selectionof suitable process of the dispatcher).

Privileged instructions

Some instructions must be accessible only to the operating system: privileged instruction set. Privileged instructions include functions such as:


- Managing interrupts (incl. enabling, disabling).


- Performing I/O.


- Halting a process!

Dual mode

Aim: To distinguish between execution of operating system code and user-defined code. We do not let the user execute instructions that could cause harm.Two modes:


- User mode


- Kernel mode (or supervisor, system, privileged mode)


Switching from user mode to kernel mode occurs when:


- a user process calls on the operating system to execute a function needing a privileged instruction (system call).


- an interrupt occurs (hardware).


- an error condition occurs in a user process (software).


- an attempt is made to execute a privileged instruction while in user mode.

The dispatcher

Assigns processing resource for processes! Is later initiated when:


- A current process cannot continue. - The CPU may be better used elsewhere, for instance: o


i. after an interrupt changes a process state. o


ii. after a system call which results in thecurrent process not being able to continue.


iii. after an error which causes a process tosuspend.

Multiprogramming

The advantages of multiprogramming / implementing process co-operation may include:


- Computation speed-up: if there are multiple CPUsand/or cores.


- Convenience: single user wants to run many tasks.


- Information sharing: for example shared files.


- Modularity: programming (e.g. OOP)The aim of multiprogramming is to maximize CPU utilization.With multiprogramming several processes are kept in memory concurrently.


When one process is waiting, the operating system executes (to running)one of the processes sitting in the ready queue.CPU Scheduling is the method used to support multiprogramming in process management

Long-term scheduler (job scheduler)

Selects which processes should be brought into the ready queue.

Medium-term scheduler

Removes processes from active contention for the CPU by swapping processes in and out of the ready queue.

Short-term scheduler (CPU scheduler)

The Short-term scheduler (CPU scheduler) selects the process to be executed next and allocated to the CPU.


This decision is initiated when processes:


1. Switch from running to waiting state, or


2. Switch from running to ready state, or


3. Switch from waiting to ready, or


4.Terminate

Scheduling Criteria

- CPU utilization: keep the CPU busy!


- Throughput: number of processes that completetheir execution within a specific number of time units.


- Turnaround time: amount of time to execute aparticular process.


- Waiting time: total (accumulated) amount of timea process has been waiting in the ready queue.


- Responsetime: amount of time it takes from when a request was first submitted to theready queue until the first response is produced (access to the CPU). NB: Notthe time to complete the process!

Algorithm Evaluation

Deterministic modelling: Take a predetermined workload (case) and analyse the performance of each algorithm.


Simulation: Complex model of the system and the way it is used. Beware of Bonini’s paradox! Post-implementation: Examine the running operating system (a.k.a. ‘live’ testing).

FiFirstome, First Served (FCFS)

- The average waiting time may or may not belengthy!


- A simple algorithm to implement.


- May result in a ‘convoy’ effect, for exampleshort processes waiting on a long process to finish.

RoundRobin (RR)
Each process gets a small unit of CPU time: a time quantum or timeslice usually q = 10 to 100 milliseconds.After this time has elapsed, the process is pre-empted and added to theend of the ready queue.If there are n processes in the ready queue and the timequantum is q, then each process gets 1 n of the CPU time in chunks of at most qtime units at once.
Shortest Remaining Time First (SRTF)
If a new process arrives in the ready queue with a CPU burst lengthless than the remaining time of executing process then pre-empt the runningprocess and run the process with the shorter CPU burst length.It can be viewed as a pre-emptive version ofSJF.
PriorityScheduling
In Priority Scheduling, the algorithm associates a priority number(integer) for each process.The CPU is allocated by the dispatcher to the process with the highestpriority (smallest integer = highest priority).Problem: Starvation (also known as indefinite blocking).The low priorityprocesses may never execute.Solution: Ageing, as time progresses increase the priority of theprocess.Priority based CPU scheduling can be eitherpre-emptive or non pre-emptive.
Round Robin (with priority)
Round Robin (with priority)With priority in Round Robin, we can choose toimplement either a priority queue, or a normal queue where the priority is usedonly when competing for entry onto the queue. We will choose the latter!
Multilevelqueue scheduling
Multilevel queue scheduling is where processes are partitioned intodifferent queues, e.g. foreground v. background processes.The approach is based on the different queues having different(performance) requirements e.g. response time.Problem: provides differentiation but notflexible, since a process remains in the same queue regardless of adaptingrequirements. Feedback queues An adaption of multilevel queue scheduling,however the process may proceed from one queue to the next. This is the mostcommon approach to scheduling but also difficult to implement/
Threads
Athread is a basic unit of CPU utilization.
Benefitsof Using Threads

The benefit of using threading includes the following:


- Responsiveness: the process (therefore program)continues running even if part of it (i.e. a thread) is blocked or performing alengthy operation.


- Resource sharing: threads share the resources oftheir common process.


- Economy: allocating memory and resources forprocess creation is costly.


- Multiprocessor architectures: a single-threadedprocess can only run on one processor.


- Performance: Cooperation of multiple threads insame job delivers higher throughput and improved performance.

MultithreadingModels

- Many-to-One. Many user application threads toone kernel thread. Programmer controls the number of application threads.


- One-to-One One user application thread to onekernel thread.


- Many-to-Many N user application threads to≤Nkernel threads. Defined by the capabilities of the operating system.

VariableClass Threads
When a thread’s time quantum runs out, the thread isinterrupted and dispatched to the ready queue. Priority lowered: never belowbase priority.When a thread is released from a wait operationit is dispatched to the ready queue. Priority is raised: based on what thethread was waiting for.
Logicaladdress
Anaddress created by the CPU.
Physicaladdress
Anaddress on the physical memory
MemoryManagement Unit (MMU)
TheMMU automatically translates virtual addresses to physical addresses.
MemoryHole Allocation

Some strategies to satisfy a request of size n from a listof free holes:


First-fit: Allocate the first hole that is big enough.


Best-fit: Allocate the smallest hole that is big enough.


- Must search entire list, unless ordered by size.


-Produces the smallest leftover hole.


Worst-fit: Allocatethe largest hole.


- Must also search entire list.


- Produces the largest leftover hole.

Externalfragmentation
External fragmentation is memory space that is able satisfya request but it is not contiguous. We can reduce external fragmentation bycompaction:Shuffle memory contents to place all free memoryin one block.
Internalfragmentation
Internalfragmentation is memory that is allocated successfully but may be slightlylarger than the requested amount of memory
Paging
Physical memory is divided into fixed-sized blocks calledframes. Typically, the size is a power of 2, between 512 and 8192 bytes.Logical memory is divided into blocks of the same sizecalled pages.To run a program of size n pages, we need to find n freeframes and load the program. This process is called paging.A page table is set up to help manage themapping of logical to physical addresses, using an address translation.
AddressTranslation Scheme

The logical memory address generated by the CPU is dividedinto:


- Page number (p): used as an index into a pagetable which contains base address of each page in physical memory.


- Pageoffset (d): combined with base address to define the physical memory addressthat is sent to the memory unit.

Protection

Memory protection implemented by associating protection bitwith each frame.A valid-invalid bit is attached to each entry in the pagetable:


- valid indicates that the associated page is inthe process’s logical address space, and is thus a legal page.


- invalidindicates that the page is not in the process’s logical address space.

VirtualMemory

Virtual memory is the capability of the operating systemthat enables programs to address more memory locations than are actuallyprovided in main memory.Virtual memory systems help remove much of the burden ofmemory management from the programmers, freeing them to concentrate onapplication development.


- The logical address space is much larger thanthe physical address space.


- Pages areswapped (paged) in and out main memory.


- The physical address space is shared by severalprocesses.


- Only part of the process needs to be in thephysical address space for execution.


- A freeframe list of the physical address space is maintained by the operating system.

VirtualAddress Space

Each process views the address space as a contiguous blockof memory holding the objects it needs to execute.


- Code (read-only)


- Data (read and write)


- Heap: the address space in memory that is usedto hold data produced during execution of a process. This space will expand andcontract during execution.


- Stack:the address space that is used to hold instructions and data known at the timeof the procedure call. The contents of the stack grow as the process issuesnested procedure calls and shrinks as the called procedures return.

DemandPaging

Bring a page into memory only when it is required by theprocess during its execution


Reduces:


- Number of pages to be transferred


- Number of frames allocated to a process Increases:


- Overall response time for a collection ofprocesses Number of processes executing

AProcess’s Page Table

When reference is made to a page’s address:


- Invalid reference→abort


- Not-in-memory→bring to memory


A valid/invalid bitis associated with each process’s page table entry to indicate if the page isalready in memory:


- 1→in-memory


- 0→not-in-memory


Address translation:


- when valid/invalid bit in page table entry is0→page fault trap


Initially the valid/invalid bit is set to 0 onall entries

HandlingPage Fault Traps

A process requests access to an address within a page:


Check the validity of the process’s access to that address


If an invalid request


- terminate process


- clear allpreviously allocated main memory


- throw error message “invalid memory access”


Else if a valid request


- If page allocated a frame in main memory


o break


- Else if page fault trap


o identifya free frame from the free frame list


o read thepage into the identified frame


o update the process’s page table


- restart the instruction interrupted by the pagefault trap (break)

Performanceof Demand Paging

Page Fault Rate 0≤p ≤1


- if p = 0 no page faults


- if p = 1, every reference is a fault


Effective Access Time = (probability of a page being inmemory X time to access the memory) + (probability of a page not being inmemory X page fault overhead)


Page fault overhead includes:


- Context switching when relinquishing the CPU


- Time waiting in the paging device queue - Time waiting back in the ready queue Context switching when regaining the ready queue

NoFree Frame?

This occurs when all main memory frames available to a process arecurrently allocated.Page replacement. Find some page in memory, but not reallyin use, and swap it out.


- Page replacement algorithms to decide the page.


- We want to minimise the number of page faults.

BasicPage Replacement

- Find the location of the desired page on thedisk.


- Find a free frame:


o If there is a free frame, use it.


o Else if there is no free frame, use apage-replacement algorithm to select a victim page.


o Write the victim page to the disk; update theprocess’s page table and the free frame list accordingly.


- Read the desired page into the (newly) freedframe.


o Update the process’s page table and the freeframe list.


- Restart the user process.

Belady’sAnomaly
Forsome page fault algorithms, the page fault rate may increase as the number ofallocated frames increases.
LRU(LeastRecently Used) - Implementations

Counter implementation :


- Every page entry has a counter.


- Everytime a page is referenced, record the time.


- When a page needs to be changed, look at thecounters to determine which are to change.


Stack implementation:


- Keep a stack of page numbers in a double linkform: P


- Page referenced: move it to the top (requirespointers to be altered).


- No search for the replacement: the tail pointerpoints to the bottom of the stack, which is the LRU page.

LRUApproximation Algorithms

Additional Reference Bit:


- With each page associate a bit (initially = 0).


- When page is referenced bit set to 1.


- Replace one of the pages with the bit set to 0(if one exists).


- Replace arandom page when none of the pages are set to 0.


- Problem: No way of knowing if a page is a leastrecently used page, a most recently used page, or somewhere in between.


Second chance algorithm:


- Implemented as a variation on a FIFO queue: itcan be viewed as a circular queue.


- When a page enters main memory, it joins thetail of the queue and its reference bit is set to 1.


- When the queue is full and a victim page needsselecting: Start at the head of the queue. If the page’s reference bit is setto 0: Select this page as the victim page. Else if the page’s reference bit isset to 1: Set the reference bit to 0 and move on to the next page in the queue.

Allocationof Frames

- Global replacement: Select a replacement framefrom the set of all frames; one process can take a frame from another


- Local replacement: Select from only theprocess’s own set of allocated frames


- Fixed allocation 1) Equal allocation 2) Proportionalallocation: Allocate according to size of process


- Priority allocation: If process Pi generates apage fault, select for replacement a frame from a process with lower prioritynumber

Thrashing
Thrashingoccurs when a process is spending more time paging then doing actual work.
Prepaging
Pageinto memory at one time all the pages that will be needed. Prepaging is a meansto prevent thrashing. However, we must ensure that the cost of prepaging isless than the cost of servicing the page faults.
Page-FaultFrequency Scheme

A Page-Fault Frequency Scheme is an alternative approach to preventthrashing.Scheme: firstly establish an “acceptable” page-fault rate.then:


- If the actual rate is too low, i.e. the process hasgot too many frames ⇒The process loses frames.


- If the actual rate too high, i.e. the processhas not got enough frames ⇒The process gains frames.

Transferrate
Israte at which data flow between drive and computer
Positioningtime (random-access time)
Is timeto move disk arm to desired cylinder (seek time) and time for desired sector torotate under the disk head (rotational latency)
Headcrash
Resultsfrom disk head making contact with the disk surface
AccessLatency
Averageaccess time = average seek time + average latency
Average I/O time
=average access time + (amount to transfer / transfer rate) + controlleroverhead
Solid-StateDisks

- Nonvolatile memory used like a hard drive - Manytechnology variations


- Can be more reliable than HDDs - More expensive per MB


- Maybe have shorter life span


- Less capacity


- But muchfaster


- Busses can be too slow -> connect directly toPCI for example


- No moving parts, so no seek time or rotationallatency

MagneticTape

- Was early secondary-storage medium


- Evolved fromopen spools to cartridges - Relatively permanent and holds largequantities of data


- Access time slow


- Randomaccess ~1000 times slower than disk


- Mainly used for backup, storage ofinfrequently-used data, transfer medium between systems


- Kept in spool and wound or rewound pastread-write head


- Once data under head, transfer rates comparableto disk


- 140MB/sec and greater


- 200GB to 1.5TB typical storage


- Common technologies are LTO-{3,4,5} and T10000

DiskStructure

- Disk drives are addressed as large 1-dimensionalarrays of logical blocks, where the logical block is the smallest unit oftransfer


- Low-levelformatting creates logical blocks on physical media


- The1-dimensional array of logical blocks is mapped into the sectors of the disksequentially


- Sector 0 is the first sector of the first trackon the outermost cylinder


- Mappingproceeds in order through that track, then the rest of the tracks in thatcylinder, and then through the rest of the cylinders from outermost toinnermost


- Logicalto physical address should be easy


- Except for bad sectors


- Non-constant # of sectors per track via constantangular velocity

StorageArray

- Can just attach disks, or arrays of disks


- Storage Array has controller(s), providesfeatures to attached host(s)

ShortestSeek Time First
Selectsthe request with the minimum seek time from the current head position. Maycause starvation of some requests.
SCAN
Thedisk arm starts at one end of the disk, and moves toward the other end,servicing requests until it gets to the other end of the disk, where the headmovement is reversed and servicing continues.
C-SCAN
Providesa more uniform wait time than SCAN Thehead moves from one end of the disk to the other, servicing requests as itgoes. When it reaches the other end, however, it immediately returns to thebeginning of the disk, without servicing any requests on the return trip
C-LOOK
Armonly goes as far as the last request in each direction, then reverses directionimmediately, without first going all the way to the end of the disk
Low-levelformatting
Dividinga disk into sectors that the disk controller can read and write
RAID
Redundantarray of inexpensive disks
Meantime to repair
Exposuretime when another failure could cause data loss
Hotspare disk
Hotspare disk is unused, automatically used by RAID production if a disk fails toreplace the failed disk and rebuild the RAID set if possible
Toimplement stable storage:

- Replicate information on more than onenonvolatile storage media with independent failure modes


- Updateinformation in a controlled manner to ensure that we can recover the stabledata after any failure during data transfer or recovery

Diskwrite has 1 of 3 outcomes
1. Successful completion - The data were written correctlyon disk 2. Partial failure - A failure occurred in the midst oftransfer, so only some of the sectors were written with the new data, and thesector being written during the failure may have been corrupted 3. Total failure - The failure occurred beforethe disk write started, so the previous data values on the disk remain intact
FileAttributes

- Name – only information kept in human-readableform


- Identifier – unique tag (number) identifies filewithin file system


- Type –needed for systems that support different types


- Location– pointer to file location on device


- Size –current file size


- Protection– controls who can do reading, writing, executing


- Time, date, and user identification – data forprotection, security, and usage monitoring


- Information about files are kept in thedirectory structure, which is maintained on the disk


- Many variations,including extended file attributes such as file checksum


- Information kept in the directory structure

OpenFiles

Several pieces of data are needed to manage open files:


- Open-file table: tracks open files


- Filepointer: pointer to last read/writelocation, per process that has the file open


- File-opencount: counter of number of times a file is open – to allow removal of datafrom open-file table when last processes closes it


- Disklocation of the file: cache of data access information


- Accessrights: per-process access mode information

TrojanHorse

· Code segment that misuses its environment


· Exploits mechanisms for allowing programs writtenby users to be executed by other users


· Spyware,pop-up browser windows, covert channels


. Up to 80% of spam delivered by spyware-infectedsystems

Public-keyencryption
Based on each user having two keys: · public key – published key used to encrypt data . private key – key known only to individual userused to decrypt data
Limitationsof the file-based approach

· Separation and isolation of data


o – eachprogram maintains its own set of data


o – users of a program may be unaware of usefuldata held by other programs


· • Duplication of data


o same data may be held by different programs


o – wasted space, different values and/or formatsfor the same item


· • Data dependence


o – thestructure of the file is defined in the code of the program


· • Incompatible file formats


o – programs written in different languages (C,Java, etc.) they can not easily access each other’s files


· • Fixed Queries / many application programs


–programs written to satisfy particular functions any new requirement needs anew program

Twomain reasons for all these limitations

1. definition of the data is embedded in the applicationprograms (instead of being stored separately and independently)


2. no central control over the access /manipulation of data (the only control is imposed by the application programs)

Database(DB)

A collection of logically related data, designed to meet theneeds of an organization


– a single repositoryof data, shared by many departments / users


– alldata are integrated with minimum amount of duplication

DBManagement System (DBMS)
Asoftware system that enables users to define / create / maintain / control theaccess to the DB
DataDefinition Language (DDL)
–allows users to define the database: specify the data types, the structures andthe constraints of the data
DataManipulation Language (DML)

– allows users to insert / update / delete / retrieve datafrom the database


– Query Language: the part of DML that involvesdata retrieval

Databasedesigners

– the logical DB designer identifies: (“what” to store)


• data (entities /attributes),


• relationships among data items


• constraints on thedata


– the physical DB designer: (“how” to store)


• maps the logical DBdesign to specific tables / integrity constraints,


• selects specific storage structures / accessmethods for the data

Advantagesof databases:

– Sharing of data


– Control of redundancy


– Data consistency


– Data security and integrity


– Faster development of new applications


– Better dataaccessibility


– Economy of scale


– Control of concurrency


– Better backup / recovery procedures

Disadvantages(of databases):
– Complexity / Size of the system – Higher cost of implementation of the DBMS – Additional hardware cost – Slower processing of some applications – Higher impact of a failure
Concurrencycontrol protocols
Preventdatabase accesses to interfere with each other
Concurrencycontrol
Theprocess of managing simultaneous operations on the DB, without having theminterfere with each other
Threecharacterizations of data

– Structured data


– Semi-structured data (XML)


– Unstructured data

Structureddata

– data represented in a strict format (i.e. schema) • relational data model (tables, tuples, attributes)– the DBMS checks to ensure that the data follows:


• the structures (table, attributes, domains)


• the integrity & referential constraints that are specified in the schema

Semi-structureddata

– self describing data


– the “schema”information is mixed with the data values


• How do we end up with such data?


– sometimes data iscollected ad-hoc


• i.e. no predefined structure


• for instance: details of all research projects


– not known in advance how it will be stored / managed

Unstructureddata
–very limited indication of the type / structure of data
RelationalData Model

• relations are tables (columns + rows),


• attributes are columns,


• tuples are rows

Entity-Relationship(ER) model

Basic concepts:


– the important data objects (entities)


– the important properties of the entities (attributes)


– the associations between the entities (relationships)

DBschema:
Totaldescription of the DB
DBinstance:
Itsdata at a particular moment
Objectivesof a DB schema:

All users have access at every point:


– to the same DB instance


– with customized views of parts of the data

Threemain phases of Database Design
Conceptual design: – construct a first,high-level model of the data: ER model • identify the appropriate entities, their relationships andtheir constraints – using the users’ requirements specification – independently ofany physical considerations – it serves as the fundamental understanding of thesystem Logical design: – construct therelational data model of the data – using the conceptual design • map entities / relationships to tables – use normalization techniques to eliminate data redundancy/ anomalies Physical design: – describe thedatabase implementation of the logical design – specific storagestructures / access methods / security protection – aim is optimum performance
RelationalModel Terminology

• A relation is a table (with rows and columns)


• An attribute is a named column of a relation


• The domain of an attribute is the set of allowable values


• A tuple is a row of a relation


• A cell of a relation is theintersection of a row and a column


• The degree of a relation isthe number of attributes


• The cardinality of a relation is the number oftuples

NULLvalue

It represents an attribute value that is:


– either currently unknown


– or not applicable

Primarykey
Thecandidate key selected to identify rows uniquely with the table
Foreignkey

An attribute in one table A whose values must:


• either match the primary key of another table B (then A references B)


• or be NULL (e.g. staff has not been yet assigned to a branch)

Derivedattribute
Derivablefrom value of a related attribute e.g. age can be derived from date-of-birth
Step-by-stepconstruction of an ER Model

1. Identify entities from the scenario/real world


2. Findrelationships


3. Draw rough ER diagram (i.e. an ER model)


4. Fill in the relationships


5. Define primary keys and resolve many-to-manyrelationships


6. Identify attributes and map them to the entities


7. Draw full ER diagram showing keys, attributes,relationships


8. Check it! Does it reflect the real-world system?

Constructa Relational model from an ER model:
• Resolve the many-to-many (*:*) relationships • Buildrelational tables from the entities in the ER diagram • Map foreign keys with primary keys
Noredundancy

Every data item is stored only once:


1. Minimize the amount of space required


2. Simplify maintenance of the database

Modificationanomaly
Failureto maintain all existing instances of a specific value
Deletionanomaly
Losingother values as a side effect when you delete data
Insertionanomaly

– when new data items are inserted, we need to add much moreirrelevant data


– adding rows forces us to add information aboutother entities

Fullfunctional dependency

– B is functionally dependent on A


– B is not functionally dependent on any propersubset of A

Partialfunctional dependency

– B is functionally dependent on A


– B remains functionally dependent on at leastone proper subset of A

Transitivefunctional dependency:

– if there exist functional dependencies and


– then the functional dependency also exists andis called transitive

Armstrong’saxioms

1. Reflexivity: if 𝐵 ⊆ 𝐴,then 𝐴→ 𝐵


2. Augmentation: if 𝐴 → 𝐵,then 𝐴,𝐶→ 𝐵,𝐶


3. Transitivity: if 𝐴 → 𝐵and B → 𝐶,then 𝐴 → 𝐶

Furtherrules can be derived from Armstrong’s axioms
4. Decomposition: if 𝐴 → 𝐵,𝐶,then 𝐴→ 𝐵and 𝐴→ 𝐶 5. Union: if 𝐴 → 𝐵 and 𝐴→ 𝐶,then 𝐴→ 𝐵,𝐶 6. Composition: if 𝐴 → 𝐵and C → 𝐷,then 𝐴,𝐶 → 𝐵,𝐷
Un-normalisedform (UNF)
– when it contains one or more repeating groups – this does not conform with the definition of arelation
FirstNormal Form (1NF)

–no repeating groups (every cell has one value)


– no identical rows

SecondNormal Form (2NF)

– it is in 1NF and


– there are no partial functional dependencies

ThirdNormal Form (3NF)
– it is in 2NF and – there are no transitive functional dependencies– all attributes (which are not part off theprimary key) are functionally dependent on the key, the whole key, and nothingbut the key
SELECT-FROM-WHEREstatement

– SELECT: specifies which columns are to appear in theoutput


– FROM: specifies the table or tables to be used


– WHERE:filters the rows subject to some condition


• GROUP BY: formsgroups of rows with the same column value


• HAVING: filters the groups subject to some condition


• ORDER BY: specifies the order of the output

LIKE‘H%’
firstcharacter must be H, but the rest can be anything
LIKE‘H_ _ _’
exactly4 characters, first character must be H
LIKE‘%e’
anysequence of characters, ending at ‘e’
NOTLIKE ‘H%’
thefirst character can not be ‘H’
COUNT
returnsthe total number of values in a given column
COUNT(*)
returnsthe number of rows in a table
Innerjoin
ifone row of a table is unmatched, the row is omitted from the output table
Outerjoin
itretains (some of) the rows that do not satisfy the join conditions
Rightouter join:
retainthe unmatched rows of the right table
Leftouter join
itretains the rows of the left table that are unmatched with rows from the righttable
Fullouter join
retainthe unmatched rows of both tables
Maindomain types in SQL

• CHAR(n): character string of fixed length n


• VARCHAR(n): character string of variable length at most n


• BIT(n): bit string of fixed length n


• INTEGER: large positive / negative integer values


• SMALLINT: small positive / negative integer values (up to32 767)


• NUMERIC(p,d): a (positive / negative) decimal number withat most:


– precision p: totalnumber of all digits (integer + decimal, excluding the decimal point)


– scaled: total number of decimal digits

CASCADE
(when update / delete: update the foreign key / delete the tuple)
SETNULL
(whenupdate / delete: set the foreign key to NULL)
SETDEFAULT
(whenupdate / delete: set the foreign key to the default value)
NOACTION
(whenupdate / delete: do nothing – dangerous !!! )
Aquery language is relationally complete if
itcan be used to produce any relation that can be derived using relationalalgebra/calculus expressions
predicate
truth-valuedfunction (true/false) with arguments
proposition
theexpression obtained when we substitute values to the arguments of a predicate(can be true/false)
{x|P ( x)^Q ( x)}
setof all x such that both P(x) and Q(x) are true
{x|P ( x)VQ ( x)}
setof all x such that both P(x) or Q(x) is true
{ x|~P(x )}
setof all x such that P(x) is not true’
Unaryoperations

– Selection (𝜎)


– Projection (𝜋)


– Rename (𝜚)

Binaryoperations

– Union (⋃)


– Set Difference (−)


– Cartesian product (×)

predicate(R)
–outputs a subset of the relation R that contains only the tuples (rows) thatsatisfy the specified condition (predicate)
col-1,. . . , col-n(R)
–outputs a subset of the relation R that contains only the specified attributes(columns) with names col-1, …, col-n and also eliminates duplicates
R US
–outputs a new relation having all tuples of R, or S, or both R and S, and alsoeliminates duplicate tuples
R –S
–outputs a new relation having all tuples that exist in R but not in S
Union-Compatible

– the schemas of the relations R and S must match, i.e.:


• R and S must have the same number of attributes


• every pair of corresponding attributes musthave the same domain

R nS
–the output relation has all tuples existing in both R and S
R XS

– outputs a new relation that is a concatenation of everytuple from R with every tuple from S


– nofurther “compatibility” assumptions on the relations

R ÷S
–outputs a relation over the attributes C that consists of the tuples from Rthat match every tuple in S
TeleprocessingArchitecture:

– the traditional (and most basic) architecture


– one computer with asingle Central Processing Unit (CPU)


– many (end-user) terminals, all cabled to the centralcomputer


– a terminal sends messages to the central computer (throughthe user’s application program)


– all data processing in the central computer

Downsizing:

–replace expensive mainframe computers with cost-effective networks of personalcomputers


– achieve the same / better results

File-Serverarchitecture

• Processing is distributed around a computer network


– typically through a Local Area Network (LAN) – one centralfile-server (computer containing the database)


– every workstation has its own DBMS and its own userapplication


– workstations request files they need from the file server


– file server acts like a “shared hard disk” (ithas no DBMS ! )

Client-Serverarchitecture

• Client: requires some resource


• Server: provides the resource


• Client / Server are not always in the same machine / place


• Two-tier architecture:


– Tier 1 (client): responsible for the presentation of datato the user


– Tier 2 (server): responsible for supplying data servicesto the user


• Typical procedure:


– user gives a request to the client


– client generatesSQL query and sends it to the server


– server accepts, processes the query and sends the resultto the client


– client formats the result for the user

Advantagesof client-server architecture

– increased performance: many client CPUs work in parallel


– reduced hardware costs: only the server needs increased storageand computational power


– reduced communication costs: less data traffic(not unnecessary data are transmitted)

Three-tierclient-server architecture

– First tier: user-interface layer (on end-user’s computer)


– Second tier: application server (connects to many users)


– Third tier: database server

Distributeddatabase
– alogically interrelated collection of shared data, physically distributed over anetwork
DistributedDBMS (DDBMS):

– the software system that can manage the distributeddatabase


– it makes the distribution transparent (invisible)to users

Ina DDBMS

– a single logical database, which is split into fragments


– each fragment is stored on one (or more) computers, underthe control of a separate DBMS


– all these computers are connected by a communicationsnetwork


– sites have local autonomy: independent processing of localdata (via local applications)


– sites have access to global applications (toprocess data fragments stored on other computers)

Distributedprocessing

– a centralized database that is accessed over a computer network


– for example: the client-server architecture


– not the same as distributed DBMS !

Threecorrectness rules for the partitioned placement:

1. Completeness


– if relation 𝑅is decomposed into fragments 𝑅1,𝑅2,…,𝑅𝑛,each data item that can be found in 𝑅 must appear in at leastone fragment 𝑅𝑖


2. Reconstruction


– it must be possible to define arel. algebra expression that can reconstruct 𝑅 from its fragments


3. Disjointness


– if a data item appears infragment 𝑅𝑖, it should not appear in any otherfragment


– exception for vertical fragmentation: primarykey attributes must be repeated for the reconstruction !

(dis)advantages of a distributed DBMS
• Advantages: – Reflects organizational structure – Improves shareability and local autonomy – Improved availability and reliability – Improved performance – Smaller hardware cost – Scalability • Disadvantages: – Complexity – Higher maintenance cost – Security – Integrity control more difficult – Design more complex – Lack of experience in the industry