46 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)

Resources consumed over the Internet

Infrastructure - computing power, storage


Software - middleware, development


Application


Business process

SABI

1st enabling technology

Virtualisation technology - images are pre-created and allocated

Anything that is not physical

2nd enabling technology

Service-oriented architecture (SOA):


Component


Flexibility


Reusability


Extension

CERF

1st principle of CCOA

Integrated ecosystem management for Cloud:



Supports the management of the cloud


The ecosystem includes all services and solutions, vendors, partners and end users that provide or consume shared resources


Integrated onboarding process and common utilities for the hosting environment


Support the seamless collaboration and message exchanges among Cloud vendors


Natural habitat

2nd principle of CCOA

Virtualisation for cloud infrastructure



Hardware virtualisation - manages hardware equipment in plug-and-play mode



Software virtualisation - software image management or software code virtualisation to enable software sharing

3rd principle of CCOA

Service orientation for common reusable services



Further realises business value from:


Asset reusability


Composite application


Mashup services



Two types



Cloud horizontal - various platform services that hide the complexity of middleware, databases and tools



Cloud vertical - domain-specific or industry-specific utility services

4th principle of CCOA

Extensible provisioning and subscription for Cloud



Handles the service provider's provisioning process


Handles the service consumer's subscription process

5th principle of CCOA

Configurable enablement for Cloud offerings


Cloud offerings address business goals


Cloud offerings include storage cloud and infrastructure cloud



Most cloud offerings are delivered or accessed through Web browsers

6th principle of CCOA

Unified information representation and exchange framework



Covers the messages exchanged among cloud clients, partners and vendors

7th principle of CCOA

Cloud Quality and Governance


Responsible for identifying and defining quality indicators for the Cloud computing environment, and for a set of normative guidance to govern the design, deployment, operation and management of Cloud offerings



Quality of service parameters for cloud entities: reliability, response time, security and integrity

Service-oriented cloud computing architecture

Connection between SOA and cloud issues



Service-oriented cloud computing architecture



Platform as a Service (PaaS)


Software as a Service (SaaS)



SOA is an architectural pattern that guides business solutions to create, organise and reuse their computing components

7 Vs of big data

Volume


Velocity


Variety


Variability - the meaning of the same data changes


Veracity - whether the data is accurate or noisy


Visualisation


Value - whether the data adds value to your company

Data stream

Ordered sequence of instances that can be read only once, or a small number of times, using limited computing and storage capabilities
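The read-once constraint can be illustrated with a running statistic. This is only a minimal sketch, assuming a numeric stream; `running_mean` and the sample readings are hypothetical and not part of the card set. Each instance is consumed exactly once and only two numbers are kept in memory.

```python
from typing import Iterable, Iterator


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of everything seen so far, using constant memory."""
    count = 0
    mean = 0.0
    for value in stream:                  # each instance is read exactly once
        count += 1
        mean += (value - mean) / count    # incremental update, no history kept
        yield mean


# A generator stands in for a stream that cannot be rewound (assumed data).
readings = (x for x in [4.0, 8.0, 6.0])
means = list(running_mean(readings))
```

Because the generator is exhausted after one pass, any algorithm that needed a second look at the data would fail here, which is the point of the definition.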

Data stream model

The input stream s1, s2, ... arrives sequentially, item by item, and describes an underlying signal S, which is a one-dimensional function

Types of data stream model (DSM)

Time series


Cash register


Turnstile

Time series

Each si equals Si (the i-th item is the i-th value of the signal)


Example: traffic on an IP link observed every 5 minutes

Cash register

Each si is a non-negative increment to Sj


Example: monitoring the IP addresses that access a web server

Turnstile

Each si is an update to Sj, and the update may be positive or negative


Example: stock value monitoring
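The three data stream models on the cards above differ only in how an arriving item changes the signal. The following is a rough sketch under that reading; the function names, the IP addresses and the "ACME" ticker are illustrative assumptions.

```python
from collections import defaultdict


def time_series(stream):
    """Time series: the i-th item is the value of the signal at position i."""
    return {i: value for i, value in enumerate(stream)}


def cash_register(stream):
    """Cash register: each item (j, inc) adds a non-negative increment to S[j]."""
    signal = defaultdict(int)
    for j, inc in stream:
        if inc < 0:
            raise ValueError("cash-register updates must be non-negative")
        signal[j] += inc
    return dict(signal)


def turnstile(stream):
    """Turnstile: each item (j, delta) updates S[j]; delta may be negative."""
    signal = defaultdict(int)
    for j, delta in stream:
        signal[j] += delta
    return dict(signal)


# Toy inputs (assumed): link traffic samples, web-server hits, share holdings.
link_traffic = time_series([120, 95, 130])
hits_per_ip = cash_register([("10.0.0.1", 1), ("10.0.0.2", 1), ("10.0.0.1", 1)])
holdings = turnstile([("ACME", 5), ("ACME", -2)])
```

The turnstile model is the most general of the three: a cash-register stream is a turnstile stream whose updates happen to be non-negative.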

Big data clustering

Single machine


Multi machine

Single machine

Sample based


Dimension reduction

Multi machine

Parallel


MapReduce

Sample based

BIRCH


CLARANS


CURE

Dimension reduction

Random projection


Global projection
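Of the two dimension-reduction techniques named above, random projection is simple enough to sketch in a few lines. This example is an assumption-laden illustration, not a reference implementation: `random_projection`, the Gaussian matrix, the fixed seed and the sample points are all hypothetical choices, and global projection is not covered.

```python
import math
import random


def random_projection(points, target_dim, seed=0):
    """Project each point onto `target_dim` random Gaussian directions."""
    rng = random.Random(seed)             # fixed seed keeps the sketch repeatable
    source_dim = len(points[0])
    # Scaling by 1/sqrt(target_dim) preserves squared lengths in expectation.
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(w * x for w, x in zip(row, point)) for row in matrix]
            for point in points]


# Two 6-dimensional points reduced to 2 dimensions (toy data).
points = [[1.0, 0.0, 2.0, 0.0, 1.0, 3.0], [0.0, 1.0, 0.0, 2.0, 1.0, 0.0]]
reduced = random_projection(points, target_dim=2)
```

A clustering algorithm would then run on `reduced` instead of `points`, trading some accuracy for a smaller per-point cost.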

Parallel clustering

DBDC


ParMETIS

MapReduce

MapReduce k-means


MR-DBSCAN


MapReduce based on GPU

HDFS

Block-structured file system


Each file is divided into blocks of a preset size


Each block is stored on one or several machines in the cluster
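The block splitting described above can be sketched as follows. `split_into_blocks` is a hypothetical helper, and the 128 MB block size and 300 MB file are assumed values chosen for illustration; the cards do not state a size.

```python
def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs for the blocks of a file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)   # last block may be shorter
        blocks.append((offset, length))
        offset += length
    return blocks


MB = 1024 * 1024
# Assumed example: a 300 MB file with a 128 MB block size.
blocks = split_into_blocks(file_size=300 * MB, block_size=128 * MB)
```

Here the file yields two full blocks and one 44 MB tail block; each of those blocks would then be placed on one or more machines.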

Hadoop HDFS

Master - NameNode


Slave - DataNode

NameNode

Maintains and manages the blocks present on the DataNodes

Chunk server

File is split into contiguous chunks


Chunk size is typically 16-64 MB


Each chunk is replicated, usually 2x or 3x


Tries to keep replicas in different racks
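The "different racks" rule can be sketched with a toy placement function. This is a simplified assumption, not the policy any real file system uses: `place_replicas`, the rotation scheme and the cluster layout are all hypothetical.

```python
def place_replicas(servers_by_rack, chunk_id, replicas=3):
    """Pick one chunk server per replica, each from a different rack."""
    racks = sorted(servers_by_rack)
    if replicas > len(racks):
        raise ValueError("this sketch needs at least one rack per replica")
    chosen = []
    for step in range(replicas):
        rack = racks[(chunk_id + step) % len(racks)]     # rotate through racks
        servers = servers_by_rack[rack]
        chosen.append((rack, servers[chunk_id % len(servers)]))
    return chosen


# Assumed cluster layout: three racks with one or two chunk servers each.
cluster = {"rack-a": ["a1", "a2"], "rack-b": ["b1", "b2"], "rack-c": ["c1"]}
placement = place_replicas(cluster, chunk_id=7, replicas=3)
```

Spreading replicas across racks means that losing a whole rack, for example through a switch failure, still leaves a copy of the chunk reachable.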

DataNode functions

Slave daemons that run on each slave machine


Store the actual data


Serve low-level read and write requests


Report the health of HDFS to the NameNode

Secondary NameNode - helper


Not a backup. Reads the metadata from the NameNode's RAM and writes it to hard disk


Combines the FsImage with the edit logs
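Combining the FsImage with the edit logs amounts to replaying logged changes on top of a snapshot. The sketch below models that idea with plain dictionaries; the `checkpoint` function, the two operation names and the file paths are assumptions for illustration and do not mirror Hadoop's on-disk formats.

```python
def checkpoint(fsimage, edit_log):
    """Apply logged edits to a copy of the namespace image; return the new image."""
    merged = dict(fsimage)                # leave the old image untouched
    for op, path, blocks in edit_log:
        if op == "create":
            merged[path] = blocks
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


# Assumed snapshot and log: one existing file, then a create and a delete.
fsimage = {"/logs/a.txt": ["blk_1", "blk_2"]}
edit_log = [("create", "/logs/b.txt", ["blk_3"]), ("delete", "/logs/a.txt", None)]
new_image = checkpoint(fsimage, edit_log)
```

After a checkpoint the edit log can start empty again, which keeps it from growing without bound between restarts.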



Block

The smallest contiguous location where data is stored

Commodity cluster

A number of low-cost, low-performance commodity computers working in parallel, instead of fewer high-performance, high-cost computers

Big data problems

Can handle data from different sources


HDFS is distributed: new nodes can be added to the cluster on the fly


Processing speed: the processing logic is sent to the data nodes and the response is sent back to the client

Cluster architecture

A request, or part of a request, is handled and delivered by two or more nodes


Benefits


Load balancing


High availability

Distributed file system

Managing files and folders across multiple computers or servers


Data saved over multiple nodes

Problem with traditional approach

Critical path problem - the slowest machine determines the overall speed


Reliability - what happens if one machine fails


Single split - one split can overload a machine


Aggregation of results - key-wise sorting is not possible

MapReduce

Programming model for processing large volumes of data in parallel by dividing the work into a set of independent tasks

Map

The map stage processes the input data


Input data is in the form of a file or directory stored in HDFS


The input file is passed to the mapper line by line


The mapper processes the data and outputs several small chunks

Reduce

Combination of the shuffle stage and the reduce stage


Processes the data that comes from the mapper


Produces a new set of output
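The map, shuffle and reduce stages on the cards above can be sketched in memory with the usual word-count example. This is a single-process illustration of the programming model only; the function names and input lines are assumptions, and nothing here uses Hadoop itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the grouped values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # The input is passed to the mapper line by line.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in sorted(shuffle(pairs).items()))


counts = word_count(["big data big cluster", "data node"])
```

Each call to `map_phase` and each call to `reduce_phase` is independent of the others, which is what lets a real framework run them in parallel on different machines.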

MapReduce strengths

Great flexibility in data placement


Scheduling and load balancing


Can access large datasets

MapReduce weaknesses

High overhead


Low raw performance

Matrix multiplication

Input records: row, column, value, matrix reference



Phase 1: compute all products a_ik * b_kj



Phase 2: sum the products for each entry (i, j)
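The two phases can be sketched in memory as follows, assuming each input record is a (row, column, value, matrix) tuple with the matrix tagged "A" or "B". The function names and the 2x2 matrices are illustrative. Phase 1 joins entries on the shared index k; phase 2 sums the products for each output cell.

```python
from collections import defaultdict


def phase_one(entries):
    """Group by the shared index k and emit ((i, j), a_ik * b_kj)."""
    a_by_k = defaultdict(list)
    b_by_k = defaultdict(list)
    for row, col, value, matrix in entries:
        if matrix == "A":
            a_by_k[col].append((row, value))     # A[i][k]: k is the column
        else:
            b_by_k[row].append((col, value))     # B[k][j]: k is the row
    products = []
    for k in a_by_k:
        for i, a in a_by_k[k]:
            for j, b in b_by_k.get(k, []):
                products.append(((i, j), a * b))
    return products


def phase_two(products):
    """Sum the partial products that belong to the same output cell (i, j)."""
    result = defaultdict(int)
    for cell, value in products:
        result[cell] += value
    return dict(result)


# A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] in record form (toy data).
entries = [
    (0, 0, 1, "A"), (0, 1, 2, "A"), (1, 0, 3, "A"), (1, 1, 4, "A"),
    (0, 0, 5, "B"), (0, 1, 6, "B"), (1, 0, 7, "B"), (1, 1, 8, "B"),
]
product = phase_two(phase_one(entries))
```

Storing matrices as records rather than dense arrays means zero entries are never emitted, so sparse matrices cost proportionally less work in both phases.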

K-means motivation

Clustering large dataset


Initial seed selection


Outlier rejection

Problem statement

Large


Initial


Rejection


Decrease

Functions of the NameNode

Manages the slave nodes (DataNodes)


Stores the metadata of all files in the cluster


Records each change made to the file system metadata


Keeps a record of all the blocks