82 Cards in this Set
- Front
- Back
Big Data
|
General term used to describe the massive amounts of data available to today's managers. Big data are often unstructured and are too big and costly to easily work through using conventional databases
|
|
business intelligence
|
term combining aspects of reporting, data exploration and ad hoc queries, and sophisticated modeling
|
|
analytics
|
term describing extensive use of data, statistical and quantitative analysis, explanatory predictive models and fact-based management to drive decisions and actions
|
|
data rate
|
amount of data on corporate hard drives doubles every 6 months
|
|
data
|
raw facts and figures
|
|
information
|
data presented in context so that it can answer a question or support decision making
|
|
knowledge
|
insight derived from experience and expertise
|
|
database
|
single table or a collection of related tables
|
|
database management systems (DBMS)
|
software for creating, maintaining and manipulating data
|
|
structured query language (SQL)
|
language used for creating and manipulating databases
|
|
database administrator (DBA)
|
job title focused on directing, performing, or overseeing activities associated with a database. Design, creation, implementation, maintenance, backup and recovery, policy setting and enforcement, and security
|
|
table or file
|
list of data, arranged in columns (fields) and rows (records)
|
|
column or field
|
each category of data contained in a record (e.g. first name, last name, ID)
|
|
row or record
|
represents single instance of whatever the table keeps track of (e.g., student, faculty, course title)
|
|
key
|
code that unlocks encryption; in a database, the field that defines the relationship between tables
|
|
relational databases
|
multiple tables are related based on common keys (most common)
|
|
transaction processing systems (TPS)
|
systems that record a transaction (some form of business-related exchange), such as a cash register sale, ATM withdrawal, or product return
|
|
transaction
|
some kind of business exchange
|
|
loyalty card
|
systems that provide rewards and usage incentives
|
|
Feasibility Analysis
2 different paradigms of valuation |
1) Role and Function
2) Contribution |
|
role & function. definition and 7 characteristics
|
assessment of comprehensiveness and sophistication of solution. What does it do?
1) Simple, repetitive batch application (EDP)
2) real-time transaction processing (TPS)
3) Generation of predefined management reports (MIS)
4) simple statistical analysis
5) modeling - time series, regression
6) consequence analysis
7) Expert system |
|
Contribution
|
end game analysis where IT is the means
impact analysis, bottom line, business impact |
|
Impacts of Innovation
|
Positive/Anticipated - Target
Positive/Latent - Lucky
Negative/Anticipated - Calculated
Negative/Latent - Unlucky |
|
2 types of contributions
|
1) Tangible - quantifiable impact, usually talking about direct $$
2) Less tangible - feel-good impacts (e.g., customer and employee satisfaction) |
|
Observations on tangible impact
|
1) difficult to quantify (e.g., what's the profit margin if we open a dot com?)
2) easier to quantify in retrospect
3) vested in the simplest system, more sophisticated role and function. Simple process, see effects greater. Problem with forecasting system |
|
2 impacts of tangible impact (either or both)
|
gross profit increases
operating costs decrease |
|
Examples of less tangible benefits
|
customer satisfaction (how much does this equal revenue---> ??)
employee satisfaction
better accuracy (bounded rationality theory)
participation in decision making
better access to data - can consider more alternatives
decision making speed |
|
3 Types of Feasibility Analysis
|
1) Technical feasibility tool
2) organizational/operational feasibility
3) cost feasibility |
|
cost vs budget
|
cost means worth every penny
budget means do we have it |
|
developmental cost of building IT platform
|
systems analysis and design
platform HW acquisition
software acquisition
software development
physical conditioning
training
QA/testing
Data conversion/migration
prone to risk |
|
Chart showing costs at time of roll out vs entire life cycle
|
developmental to operational as life cycle increases
benefit to cost as time of roll out increases |
|
Developmental benefit
|
benefit tangible before system rolls out
1) sell off of existing hardware
2) accelerated depreciation - write off investment, tax offset
3) joint venture |
|
Operational Cost
|
maintenance (Hardware - equipment breakdown, replacement; Software - new user requirements), training, upgrades, vendor service agreements, overhead, backups, security, telecom, utilities
|
|
Operational Benefit
|
increased gross profits or reduced operating costs
need to be > OC and also start to offset some developmental costs |
|
Payback Analysis and positives and negatives
|
do I recover my investment
+ conceptually simple
+ easy to calculate
- only considers tangible costs and tangible benefits
- cannot factor in intangible, less tangible
- only as good as forecast; hard to predict software and maintenance costs in years 2, 3 and 4
- total current operating cost is a sunk cost
- how do we react to time for break even? LACKS A BENCHMARK
- system alternative - what happens after
- time value of money - investment rates, NPV analysis |
|
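A minimal sketch of the payback question ("do I recover my investment?"), assuming net benefit per year means forecast operational benefit minus operational cost. The function name and the figures are hypothetical, chosen only for illustration.

```python
def payback_period(initial_cost, net_benefits_per_year):
    """Return the years needed for cumulative net benefit to cover the cost.

    Returns None if the cost is never recovered within the forecast.
    A fractional year assumes the benefit arrives evenly during that year.
    """
    remaining = initial_cost
    for year, benefit in enumerate(net_benefits_per_year, start=1):
        if benefit > 0 and benefit >= remaining:
            return (year - 1) + remaining / benefit
        remaining -= benefit
    return None


# Hypothetical forecast: 100,000 developmental cost, four years of net benefits.
years_to_break_even = payback_period(100_000, [30_000, 40_000, 60_000, 60_000])
print(years_to_break_even)
```

The result says nothing about what happens after break even and ignores the time value of money, which are the weaknesses the card lists.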
Net Present Value Approach
|
Positives
consider entire life of project
trust near term projections more
considering time value of money (opportunity cost)
Considers risk
Negatives
no less tangible benefits/costs being factored in
forecasts are being relied on heavily |
|
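A rough sketch of the net present value calculation, assuming yearly net benefits are received at the end of each year and discounted at one fixed rate. The function name, the rate, and the figures are illustrative assumptions, not values from the notes.

```python
def net_present_value(initial_cost, net_benefits_per_year, discount_rate):
    """Discount each year's net benefit back to today and subtract the cost."""
    npv = -initial_cost
    for year, benefit in enumerate(net_benefits_per_year, start=1):
        # Later years are divided by a larger factor, so near-term
        # projections carry more weight than distant ones.
        npv += benefit / (1 + discount_rate) ** year
    return npv


# Hypothetical forecast: same cash flows, evaluated at a 10% discount rate.
npv = net_present_value(100_000, [30_000, 40_000, 60_000, 60_000], 0.10)
print(round(npv, 2))
```

A positive result means the project beats the assumed opportunity cost of money; a higher discount rate can be used to reflect a riskier project.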
DATABASE Platform - Typical 3GL (COBOL) Arch
|
see notes 4/8
|
|
mf - masterfile
|
static demographic information
|
|
tf - transactional file
|
dynamic transactions information
|
|
how does application program parse the above?
|
DAR - Data Access Routine - most intense part of program
|
|
5 Problems with 3GL/DAR
|
1) Data Redundancy
2) Data Isolation
3) Tight Linkage
4) Decentralization of controls
5) Programming intensive |
|
Data Redundancy
|
many have checking and saving accounts
wasted space
need to change both when updated
leads to integrity problems |
|
Data Isolation
|
difficulty or inability to create applications that span functional areas
ex) mutual fund - periodic mailing, same household - 3 accounts in same mutual fund, how to identify who is in the same household |
|
Tight Linkage
|
structural change to the file would have numerous and unpredictable changes to the source code
Costs - money cost is insignificant
new changes introduce risk and bugs |
|
Decentralization of Controls
|
independent, relying on developer to run checks on subprogramming
ex) putting a (-1) as quantity on online shopping
Control: numeric, positive, integer, set maximum
only as good as weakest link |
|
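The four controls named on this card (numeric, positive, integer, set maximum) can be sketched as a single check. This is an illustration only; the function name and the maximum of 99 are assumptions.

```python
MAX_QUANTITY = 99  # hypothetical upper limit for one order line


def validate_quantity(raw):
    """Return the quantity as an int, or raise ValueError if a control fails."""
    text = str(raw).strip()
    # Numeric and integer: plain digits only, so "-1", "2.5" and "abc" fail.
    if not (text.isascii() and text.isdigit()):
        raise ValueError("quantity must be a whole number")
    quantity = int(text)
    # Positive: zero is rejected as well.
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    # Set maximum.
    if quantity > MAX_QUANTITY:
        raise ValueError("quantity exceeds the maximum")
    return quantity


print(validate_quantity("3"))
```

In a 3GL design every program that accepts a quantity has to repeat a check like this, which is why the controls are only as good as the weakest link.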
Programming Intensive
|
write DARS --> labor intensive (time is money)
Interval costs can run very high
develop safe applications only due to costs
DSS (trying out ideas) - want to be able, but costs make one reluctant |
|
Typical 4GL (Oracle) Arch
|
See 4/15 notes
|
|
DDL
|
data definition language
static - structural
Defines:
fields - phone no. alpha (10)
files - cost, MF
rules - Hours worked <= 60
relationships among files |
|
DML
|
Data Manipulation language
dynamic property
insertion
deletion
modification/update
inquiry |
|
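A small sketch of the DDL/DML split, using Python's built-in sqlite3 module rather than Oracle. The table and column names are made up for illustration; the 10-character phone field and the "hours worked <= 60" rule follow the examples in the DDL card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: static structure (fields, their types, and a rule).
conn.execute("""
    CREATE TABLE timesheet (
        employee_id  INTEGER PRIMARY KEY,
        phone_no     TEXT    CHECK (length(phone_no) = 10),
        hours_worked INTEGER CHECK (hours_worked <= 60)
    )
""")

# DML: insertion, modification/update, inquiry.
conn.execute("INSERT INTO timesheet VALUES (1, '5551234567', 40)")
conn.execute("UPDATE timesheet SET hours_worked = 45 WHERE employee_id = 1")
rows = conn.execute("SELECT employee_id, hours_worked FROM timesheet").fetchall()

# The rule lives in the database, so a 61-hour row is rejected centrally.
try:
    conn.execute("INSERT INTO timesheet VALUES (2, '5559876543', 61)")
    rule_enforced = False
except sqlite3.IntegrityError:
    rule_enforced = True

# DML: deletion.
conn.execute("DELETE FROM timesheet WHERE employee_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM timesheet").fetchone()[0]
print(rows, rule_enforced, remaining)
```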
Hierarchical Database Model
(4GL Model 1) |
Explicit structure
parent can have multiple children, children can only have 1 parent (can't alter sequence, can't clone (redundancy file))
locate records - query executed on basis of tree traversal |
|
Hierarchical Database Model
Good and Bad |
Good
1) Query execution --> very efficient
Bad
1) System overhead of maintaining link lists. Lot of maintenance activity necessary. CPU - run time |
|
Network Model - DMS (see 4/22)
(4GL Model 2) |
explicit structure
allows for multiple parents (draw arrows from 1 to N)
arrow is named, link lists
Developer now has control over link list, can control what order it comes out (chronological, alphabetic, etc.) |
|
Relational Model (dominant market player)
(4GL Model 3) |
no explicit structure
not tell DDL (not asking to maintain link lists)
still need ability to connect with common fields
Catch: how to execute query without link lists
No overhead costs - b/c no link lists
$$ in execution (uses overhead)
need to join thousand records, select function pulls 3 |
|
Relational Algebra (SQL)
|
Join/select commands
how to ramp up execution speeds? create link lists ("secondary key"), use on a limited scale, on frequent queries |
|
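A minimal sketch of a join followed by a select, again using Python's sqlite3 module. The customer and account tables and their contents are invented for illustration, and the index stands in for the "secondary key" idea on this card, under the assumption that the two play a similar role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account  (account_id  INTEGER PRIMARY KEY,
                           customer_id INTEGER, kind TEXT, balance REAL);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO account  VALUES (10, 1, 'checking', 500.0),
                                (11, 1, 'savings', 1500.0),
                                (12, 2, 'checking', 75.0);
    -- Index on the common field, created only because this join is frequent.
    CREATE INDEX idx_account_customer ON account (customer_id);
""")

# Join the tables on the common field, then select the rows of interest.
rows = conn.execute("""
    SELECT c.name, a.kind, a.balance
    FROM customer AS c
    JOIN account  AS a ON a.customer_id = c.customer_id
    WHERE a.balance > 100
    ORDER BY a.account_id
""").fetchall()
print(rows)
```

The index adds maintenance overhead on every insert and update, so it is kept to the queries that run often.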
Costs (inhibitors) of adoption of an original DBMS
|
HW - requires more powerful CPU to get reasonable speed
SW - subscription/licensing for platform and software migration
Data - need to establish data standards, all required fields for next 5 years, populate the database
Procedures - centralization - volatility (what if data disrupts or needs to be taken down), vulnerability (corruption of data)
People - retraining, hiring/firing, database admin, point man |
|
Headquarters (HQ) Dominant vs Fully Distributed
|
HQ dominant - NY office holds all cards (centralized)
Fully - hold locally |
|
HQ Dominant Advantages
|
establishes global SOPs
maintains global data and software standards
can settle priorities at global level
economies of scale on IT staff and hardware
simplify global data storage, relatively secure
matches centralized DM model (match culture) |
|
HQ Dominant Disadvantages
|
fails to recognize cultural differences
doesn't recognize local priorities
high overhead (update entire system at once)
volatility (global system is down, entire company suffers)
vulnerability - break into 1 system for all data
foreign to a decentralized culture |
|
Factors in the Adoption of WAN Arch
|
1) Development cost - initial, incremental
2) Relative volatility - node failure - how does it affect the rest of the network, frequency of disruption, duration (minutes, hours, days), severity
3) Relative efficiency (worst case, best case)
4) Relative complexity (complexity of network management), prevent data collisions |
|
4 Models
|
Fully Connected Model
Star Model
Bidirectional Ring
Hierarchical star |
|
Fully Connected Model
|
- not realistic, runaway costs, benchmark
DC - number of links = 1 + 2 + ... + (n-1) = n(n-1)/2, very high
RV - affects nothing beyond the 1 node being down, self-contained error
E - point to point: 1, 1, 1
C - negotiation/handshake |
|
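The link-count formula on this card can be checked with a short sketch: each new node must be wired to every node already present, so the total is 1 + 2 + ... + (n-1) = n(n-1)/2. The function name is illustrative.

```python
def fully_connected_links(n):
    """Number of point-to-point links needed to connect n nodes pairwise."""
    return n * (n - 1) // 2


# The closed form matches the running sum 0 + 1 + ... + (n-1).
for n in (2, 5, 10, 50):
    assert fully_connected_links(n) == sum(range(n))
    print(n, "nodes ->", fully_connected_links(n), "links")
```

The count grows roughly with the square of the number of nodes, which is the "runaway costs" problem that makes this model a benchmark and not a realistic design.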
Star Model
|
central switch acts as router
DC = n, much cheaper
RV - router failure is really bad; external node failure is not a big deal
E - best case 2, 2, 2
C - use authorization method - central switchboard has to be constant |
|
Bidirectional Ring
|
DC - N, incremental cost =2 (basically same as star)
RV - if a node breaks can always go the other way. Self-contained failure, not all or nothing, more robust
E - best case = 1, worst case = n-2, best worst case = n/2
C - 2 methods to prevent collisions
1) locking - have to grab electronic token, only 1 node can release info at a time (costly, slowing down)
2) Rollback/Restart - do nothing and monitor for collision. Random number, restart |
|
Hierarchical star
|
DC - n, incremental costs - 1
RV - gradations of volatility. Mid point down - bad
E - best case = 1, worst case - longest path
C - authorization, ranking system |
|
Impact of WAN on IT Infrastructure
|
Operating System - translate to network language, deal with locking, package messages efficiently
deals w/ network CCP (communications control portal)
Hardware - CCP is background program, need more powerful processor for reasonable performance
Software - programs have to be written independently of physical storage of data
Data - local primary user vs global; 3 alternatives
1) centralized global data - put all in one place
2) replication/duplication - clone database
3) partitioning - take % on each of 5 servers but only 1 true form of data (lowers efficiency) |
|
Internet Service Provider (ISP)
|
organization or firm that provides access to the internet
|
|
hypertext transfer protocol (http)
|
application transfer protocol that allows Web browsers and web servers to communicate with each other
|
|
file transfer protocol (FTP)
|
application transfer protocol that is used to copy files from one computer to another
|
|
cybersquatting
|
acquiring a domain name that refers to a firm, individual, product, or trademark with the goal of exploiting it for financial gain. illegal
|
|
IP address
|
Internet protocol address - value used to identify a device that is connected to internet. Usually 4 numbers from 0 to 255 separated by periods
|
|
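The "4 numbers from 0 to 255 separated by periods" form can be checked with Python's standard ipaddress module. The helper name is an assumption made for this sketch, and it covers only this dotted IPv4 form.

```python
import ipaddress


def is_ipv4(text):
    """Return True if text is a valid dotted IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        # Raised for the wrong number of parts or a part outside 0-255.
        return False


print(is_ipv4("192.168.0.1"))
print(is_ipv4("256.1.1.1"))
```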
domain name service
|
DNS - internet directory service that allows devices and services to be named and discoverable. The DNS, for example, helps your browser locate the appropriate computers when entering an address like finance.google.com
|
|
cache
|
temporary storage space used to speed computing tasks
|
|
bandwidth
|
network transmission speeds, typically expressed in some form of bits per second (bps)
|
|
last mile internet types
|
cable broadband - coaxial cable
DSL - use existing copper lines that phone company already ran
Fiber optic - FTTH (fastest)
wireless |
|
phishing
|
con executed using technology, typically targeted at acquiring sensitive information or tricking someone into installing malicious software
|
|
encryption
|
scrambling data using a code or formula, known as a cipher, such that it is hidden from those who do not have the unlocking key
|
|
staying power
|
long term viability of a product/service
|
|
total cost of ownership (TCO)
|
economic measure of the full cost of owning a product (typically computing hardware and/or software). TCO includes direct costs such as purchase price, plus indirect costs such as training, support and maintenance
|
|
blue ocean strategy
|
an approach where firms seek to create and compete in uncontested "blue ocean" market spaces, rather than competing in spaces and ways that have attracted many similar rivals
|
|
online analytical processing (OLAP)
|
method of querying and reporting that takes data from standard relational databases, calculates and summarizes the data and then stores the data in a special database called a data cube
|
|
data mining
|
process of using computers to identify hidden patterns in, and to build models from, large data sets
|
|
conditions for data mining to work:
|
org must have clean consistent data
events in that data should reflect current conditions and future trends |