Decision problems
At their simplest, we can consider the job of a computer to be:
given a particular input, produce a specific output
for example, given 2*3, produce 6
Let’s consider a simple version of this:
f(N) = {1, 0}
for any given input N (which is a natural number), this function outputs either a one or zero
odd(12) = 0
prime(13) = 1
This is known as a “decision problem”
--Function that can spit out a 1/0 based on a decision
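A minimal sketch of the two decision functions named above, written in Python. The trial-division test inside prime is an illustrative choice, not something the notes specify.

```python
def odd(n: int) -> int:
    """Decide whether n is odd: 1 for yes, 0 for no."""
    return n % 2


def prime(n: int) -> int:
    """Decide whether n is prime by trial division: 1 for yes, 0 for no."""
    if n < 2:
        return 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            return 0
        d += 1
    return 1


print(odd(12), prime(13))
```

Each function maps a natural number to exactly one of the two answers, which is all a decision problem asks for.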
The Turing machine
Turing proposed a simple theoretical computer with 4 components
Machine table
rules of the form “if state == y and input == j, do z”
this is the “program” of the computer
Machine state
A number that represents the states y
Tape
Of infinite length, divided into discrete squares
On each square, the machine can write a 1 or 0
Represents the inputs, outputs, and parts of program
Read/write head
for reading/writing 1/0 to the tape
Turing machine: example
A Turing machine that adds 1 to a natural number
Represent numbers as a sequence of 1’s
1 -> 1 ; 5 -> 11111

Internal State=0
-Read=0; new internal state=0; write=0; Action=R
-Read=1; new internal state=1; write=1; Action=R

Internal State=1
-Read=0; new internal state=0; write=1; Action=STOP
-Read=1; new internal state=1; write=1; Action=R

So 1111 would become 11111
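The machine table above can be run directly. This is a rough simulation sketch: the dictionary layout, the function names, and the step limit are assumptions made for illustration, and the input is assumed to contain at least one 1.

```python
# Machine table: (state, symbol read) -> (new state, symbol to write, action)
TABLE = {
    (0, 0): (0, 0, "R"),
    (0, 1): (1, 1, "R"),
    (1, 0): (0, 1, "STOP"),
    (1, 1): (1, 1, "R"),
}


def run(tape, state=0, head=0, max_steps=1000):
    """Run the table on a list of 0/1 squares and return the final tape."""
    tape = list(tape)
    for _ in range(max_steps):
        if head == len(tape):
            tape.append(0)  # the tape is conceptually infinite, so grow it with blanks
        new_state, write, action = TABLE[(state, tape[head])]
        tape[head] = write
        state = new_state
        if action == "STOP":
            return tape
        head += 1  # the only other action in this table is "R" (move right)
    raise RuntimeError("step limit reached without halting")


def add_one(n):
    """Encode n >= 1 in unary, run the machine, and decode the result."""
    return sum(run([0] + [1] * n))


print(run([1, 1, 1, 1]))
```

The step limit is only a safeguard for the simulation; a real Turing machine has no such bound.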
The Universal Turing Machine
Imagine a Turing machine whose input tape specifies the machine table of another Turing machine
This Turing machine would be programmable
It could imitate any other Turing machine
It could compute any computable function
Most of us have a UTM on our desk (and perhaps even in our pocket or on our wrist)
The Turing-Church hypothesis
If a particular decision problem can be computed by a Turing Machine, then it can be computed by any reasonable computer
Conversely, if it cannot be computed by a Turing Machine, then it cannot be computed by any reasonable computer (no stopping point)
All known models of computing are equivalent!
What does it mean for a decision problem to be computable?
Computability
A problem is computable if:
The algorithm contains a finite number of steps
The algorithm makes use of a finite set of basic operations
The algorithm comes to a halt for any valid input data
Problems for which no algorithm exists are called “undecidable” or “uncomputable”
example: what is the value of Pi?
Is the following sentence true or false:
“This sentence is false”
Just because it is solvable doesn’t mean that it’s solvable in human time
This is the domain of computational complexity theory
-Computational steps
Is the human mind a Turing machine?
There are many problems that are computable but that the human mind can’t solve
e.g., problem-space search in chess
We don’t have unlimited memory like a Turing Machine
We prefer to use heuristics rather than algorithmic computations
Humans and uncomputable functions
There are many functions that humans perform that are in theory uncomputable
inverse optics
estimating transcendental numbers (e.g., Pi)
induction
We solve these unsolvable problems by using heuristics
What is intelligence (average "human" level of thinking from a computer)?
How do we decide if a machine is intelligent?
The Turing Test (1950)
Put a judge in one room connected by teletype to a computer in a second room and a human in a third room
If the judge can’t tell which one is the computer and which is the human, then the computer is intelligent

--Rarefied form of interaction
The Turing Test
The Turing Test provides a principled definition of intelligence that only refers to behavior
It is a behaviorist approach to intelligence
It doesn’t say anything about what is going on in the person or computer’s mind
An early example of AI: ELIZA
A simulation of a psychiatrist written by Weizenbaum in the mid-1960s
Uses a set of very simple strategies

Eliza worked by simple parsing and substitution of key words into canned phrases. Depending upon the initial entries by the user the illusion of a human writer could be instantly dispelled, or could continue through several interchanges. It was sometimes so convincing that there are many anecdotes about people becoming very emotionally caught up in dealing with ELIZA for several minutes until the machine's true lack of understanding became apparent. All this was due to people's tendency to attach to words meanings which the computer never put there.
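The "parsing and substitution of key words into canned phrases" idea can be sketched in a few lines of Python. The patterns and templates below are hypothetical examples, not Weizenbaum's actual script.

```python
import re

# Hypothetical keyword patterns paired with canned response templates.
RULES = [
    (re.compile(r"\bi am (.*)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.*)", re.I), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.I), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."


def respond(sentence):
    """Return the first canned phrase whose keyword pattern matches."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Substitute the user's own words into the canned phrase.
            return template.format(*match.groups())
    return DEFAULT


print(respond("I am sad."))
```

Nothing in the program represents what "sad" means; it only copies the matched words into a template, which is why the illusion breaks down quickly.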
Problems with the Turing Test
Who gets to be the judge?
A naïve judge might think that anything was intelligent
In a public test, 5/10 people thought that a version of ELIZA was intelligent
A system with canned answers to every possible question could pass the test without any intelligence (reasoning, awareness, etc.) at all
Strong vs. Weak AI
Different AI researchers have different goals (Searle, 1980)
Weak AI
The idea that computers can be useful to simulate mental processes
Strong AI
The idea that an appropriately programmed computer really is a mind, in the sense that it understands the world
The mind is just another kind of program
Intentionality
Computers are “syntactic engines”
automatic formal systems
They manipulate symbols according to a specific set of rules
These rules apply on the basis of form, not meaning
e.g., “Boston” versus Boston
-Experienced, exists in the world and has memories/knowledge attached to it

Human minds are “semantic engines”
Our thoughts have meaning, in that they refer to things in the outside world (e.g., Boston the city)
The symbols in the human mind have intentionality (about-ness)
The language of thought
Fodor (1975) proposed that the symbols in the human mind are in a “language of thought”
It is this language on which the computations of the mind are performed
This language is like a natural language
but not necessarily the same
The LOT hypothesis explains the productivity of thought
Just as the syntax of language explains its productivity
The Chinese Room
Searle (1980) argued that a computer program can’t understand anything (not learned but stored)
It doesn’t have intentionality
He did this using a thought experiment fashioned after the Turing Test
Locked in the Chinese Room
Jack (an English speaker who knows nothing about Chinese writing) is locked in a room and is given a batch of Chinese writing
He is then given a second batch of Chinese writing along with a set of rules (in English) for correlating the first batch with the second batch and producing an appropriate Chinese character
The Chinese Room problem
Unbeknownst to Jack, he is part of a Chinese Turing Test
The first batch of Chinese writing was his “knowledge”
The second batch was “questions”
His response was “answers to the questions”
To someone observing from outside the Chinese room, it appears that the person in the room “understands” Chinese
But: Does anyone really believe that Jack understands Chinese?
-Doesn't know content of symbols
-Doesn't know relationship between symbols and outside world
Can computers understand?
If we take the program to be analogous to the rules provided to Jack, and Jack as the computer, the Chinese Room suggests that computers cannot understand
They simply manipulate symbols on the basis of formal rules (syntax)
These symbols do not relate to anything in the world (they don’t have intentionality)
New approaches to AI
The GOFAI (“good old-fashioned artificial intelligence”) approach has yielded great success in many domains, but has failed to approach the flexibility of human intelligence
Recent work in AI has tried to build intelligence from the ground up rather than from the top down
The evolution of intelligence
Timeline of biological evolution

Event                    Million years ago (mya)
Single-cell organisms    3,500
Photosynthetic plants    2,500
Fish and vertebrates     550
Insects                  450
Reptiles                 370
Dinosaurs                330
Mammals                  250
Primates                 120
Humans                   2.5
The hard part of intelligence
The things that we commonly think of as “intelligent” are very new
Other things are much harder
Mobility in a dynamic environment
Sensing surroundings well enough to maintain life and reproduce
Workers in AI have begun to focus on making machines that exhibit these kinds of simple intelligence
How smart are insects?
Beetle Brain
just ganglions rather than brain, but a dung beetle can execute its egg-laying technique and solve simple problems caused by obstacles
The lowly roach
Escape skills of the roach (Clark, 1997)
It senses the wind disturbance caused by the motion of an attacking predator
It distinguishes winds caused by predators from normal winds and air currents
It does not avoid contact with other roaches
When it does initiate an escape motion, it does not run at random. Instead, it takes into account its own initial orientation, the presence of obstacles (such as walls), the degree of illumination, and the direction of the wind
The roach-brained car
Imagine there was a car with a computer as intelligent as a roach
It would:
sense approaching vehicles, but only those moving in abnormal ways
in the case of an impending collision, initiate a turn taking into account its own current state (speed and position), road conditions, and the presence of other dangers
Doesn’t sound so bad, does it?
Brooks’ approach
Rodney Brooks of MIT has argued against the classic AI approach
Intelligence can be produced without the explicit representations and reasoning processes of GOFAI
Rather, intelligence is an emergent process of certain complex systems
Intelligence grows out of basic processes
Mobility in a changing environment
Survival-related tasks (feeding, reproduction)
Acute sensation of the environment

Brooks and his colleagues have focused on building robots that exhibit these basic behaviors
-only programmed to get around the world generally
Can intelligence really emerge?
The example of cellular slime mold
As long as there is sufficient food locally, the mold remains in a vegetative state
individual mold cells grow and divide like amoeba
When food runs out, the cells cluster together into a tissue-like mass
acts like a miniature slug
is attracted to light, and follows temperature and humidity gradients
Once new food is found, the cluster forms into a stalk-like creature and releases spores, which start a new
Physical symbol systems
Newell (1980) argued that both brains and computers can be treated as “physical symbol systems”
These are necessary and sufficient for general intelligent action
Any intelligent system will be a symbol system
A symbol system has several properties:
Memory - to hold information about the world
Symbols - to represent the world or internal goals
Operations - to manipulate symbols
Two essential properties of symbols
They can designate things in the world
They can represent things or operations
Cognition as search
In a symbolic system, the problem of cognition boils down to searching for the appropriate operator in any particular state
Heuristic search
Blindly searching the problem space will fail for complex problems
combinatorial explosion!
Solution: use rules (heuristics) to guide search
Newell & Simon’s GPS
Problem solving has both task-dependent and task-independent aspects
Task-independent: “If you can’t solve all of the problem, try to solve part of it”
Task-dependent: “If the Windows Registry gets corrupted, the machine may hang”
The “General Problem Solver” was designed to be able to solve any problem
Explicitly separates the task-independent and task-dependent aspects

The user defined objects and operations that could be done on the objects and GPS generated heuristics by Means-ends analysis in order to solve problems. It focused on the available operations, finding what inputs were acceptable and what outputs were generated. It then created subgoals to get closer and closer to the goal.
Representation in GPS
Problem solving in GPS (general problem solver) occurs by heuristic search through a state space
Each state is a snapshot of what the system knows at a particular point in time
composed of one or more objects, which contain symbolic structures and/or operators (programs)
Operators in GPS
Operators are like programs that take symbols as input and output
GPS applies operators to the current state in order to generate new states that are closer to the goal state
Operators can do various things:
compare symbol structures
create new symbol structures
read input and write output
store symbol structures to memory
Means-ends analysis
Problem: How to decide which operator to apply to any given state?
Option 1: Apply them all (exhaustive search)
This is not possible for complex problems
Option 2: Apply the best one
How do we know which that is?
Means-ends analysis says that one should apply the operator that eliminates the most differences between the current state and the goal state
Each operator is classified in terms of the differences that it eliminates (GPS uses this information to choose)
GPS: defining operators
(defparameter *school-ops*
  (list
    (make-op :action 'drive-son-to-school
             :preconds '(son-at-home car-works)
             :add-list '(son-at-school)
             :del-list '(son-at-home))
    (make-op :action 'shop-installs-battery
             :preconds '(car-needs-battery shop-knows-problem shop-has-money)
             :add-list '(car-works))
    (make-op :action 'tell-shop-problem
             :preconds '(in-communication-with-shop)
             :add-list '(shop-knows-problem))
    (make-op :action 'telephone-shop
             :preconds '(know-phone-number)
             :add-list '(in-communication-with-shop))
    (make-op :action 'look-up-number
             :preconds '(have-phone-book)
             :add-list '(know-phone-number))
    (make-op :action 'give-shop-money
             :preconds '(have-money)
A GPS trace
> (gps '(son-at-home car-works)
'(son-at-school)
*school-ops*)

Achieve SON-AT-SCHOOL ...
Trying DRIVE-SON-TO-SCHOOL ...
Achieve SON-AT-HOME ...
Achieve CAR-WORKS ...
Executing DRIVE-SON-TO-SCHOOL
State = (SON-AT-SCHOOL CAR-WORKS)
SOLVED
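For readers who do not use Lisp, here is a rough Python sketch of the same idea. It uses only the five operators whose fields are fully listed above. The Op tuple and the gps function are illustrative names, and this simplified solver tries the first operator whose add-list contains the goal instead of ranking operators by how many differences they remove.

```python
from typing import NamedTuple


class Op(NamedTuple):
    action: str
    preconds: tuple
    add_list: tuple
    del_list: tuple = ()


SCHOOL_OPS = [
    Op("drive-son-to-school", ("son-at-home", "car-works"),
       ("son-at-school",), ("son-at-home",)),
    Op("shop-installs-battery",
       ("car-needs-battery", "shop-knows-problem", "shop-has-money"),
       ("car-works",)),
    Op("tell-shop-problem", ("in-communication-with-shop",), ("shop-knows-problem",)),
    Op("telephone-shop", ("know-phone-number",), ("in-communication-with-shop",)),
    Op("look-up-number", ("have-phone-book",), ("know-phone-number",)),
]


def gps(state, goals, ops):
    """Return a list of actions that achieves the goals, or None on failure."""
    state = set(state)
    plan = []

    def achieve(goal):
        # A goal holds already, or some operator that adds it can be applied.
        if goal in state:
            return True
        return any(apply_op(op) for op in ops if goal in op.add_list)

    def apply_op(op):
        # Preconditions become subgoals; only then is the operator executed.
        if all(achieve(p) for p in op.preconds):
            plan.append(op.action)
            state.difference_update(op.del_list)
            state.update(op.add_list)
            return True
        return False

    return plan if all(achieve(g) for g in goals) else None


print(gps({"son-at-home", "car-works"}, ["son-at-school"], SCHOOL_OPS))
```

Starting from a state in which the car needs a battery, the same call generates subgoals for the shop operators before driving, which is the subgoal behavior described for GPS. The sketch does not guard against operator sets that form cycles.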
Testing GPS
GPS was meant as a theory of human problem solving
It was tested by having humans solve the same problems while they thought out loud
This is called verbal protocol
These protocols were then analyzed to classify what kind of behavior the subject was exhibiting
In many cases, humans behaved very similarly to GPS
Newell & Simon (1972): 84% of utterances showed patterns that were also exhibited by GPS
only 10% of utterances showed patterns that were not exhibited by GPS
Importance of GPS
One of the first attempts to provide a general theory of problem solving behavior
“weak methods” AI: search heuristics that don’t rely upon knowledge about the specific domain
Introduced means-ends analysis into cognitive science/AI
Production system models
Production systems are currently the most common symbolic approach within cognitive science
A production system models cognition using condition-action sets known as productions
If a certain condition is met, then perform a certain action
IF person 1 is the father of person 2
and person 2 is the father of person 3
THEN person 1 is the grandfather of person 3

[Simulating understanding (of relationships)]
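The grandfather production can be sketched as a condition-action function that is applied repeatedly to a working memory of facts until nothing new is produced. The tuple encoding of facts and the names below are assumptions made for illustration.

```python
def grandfather_rule(facts):
    """IF (father, a, b) and (father, b, c) THEN (grandfather, a, c)."""
    new = set()
    for rel1, a, b in facts:
        for rel2, b2, c in facts:
            if rel1 == "father" and rel2 == "father" and b == b2:
                new.add(("grandfather", a, c))
    return new


def run_productions(facts, productions):
    """Fire every production until no new facts appear."""
    facts = set(facts)
    while True:
        added = set()
        for production in productions:
            added |= production(facts) - facts
        if not added:
            return facts
        facts |= added


memory = {("father", "Al", "Bob"), ("father", "Bob", "Carl")}
print(run_productions(memory, [grandfather_rule]))
```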
Unified theories of cognition
Two well-known production-system frameworks aim to provide full accounts of cognition
They argue that all of cognition can be modeled using a single architecture
ACT-R (John Anderson)
More heavily tested against psychological data
SOAR (Newell, Laird, & Rosenbloom)
More comprehensive implementation
Expert Systems
One of the most successful areas in AI has been the development of expert systems
Two components of an expert system:
knowledge base
derived from extensive interviews with human experts
inference engine
works with the knowledge base to reason about the domain
MYCIN
MYCIN was an expert system developed to assist physicians in the treatment of certain bacterial infections
It asks specific questions about symptoms, test results, suspected organisms, etc
It then provides recommendations for treatment
Also provides an explanation for its reasoning, and provides a measure of uncertainty for its recommendations
MYCIN rules
MYCIN uses production (if-then) rules, such as:

if the stain of the organism is gram-positive
and the morphology of the organism is coccus
and the conformation of the organism is clumps
then (0.7) the identity of the organism is staphylococcus
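A rough sketch of how such a rule might be represented and applied. The dictionary layout and function name are hypothetical, and scaling the rule's 0.7 by the weakest premise certainty is a simplified stand-in for MYCIN's certainty-factor arithmetic.

```python
RULE = {
    "if": {"stain": "gram-positive", "morphology": "coccus", "conformation": "clumps"},
    "then": ("identity", "staphylococcus"),
    "cf": 0.7,
}


def apply_rule(rule, findings):
    """findings maps attribute -> (value, certainty between 0 and 1)."""
    certainties = []
    for attribute, required in rule["if"].items():
        value, certainty = findings.get(attribute, (None, 0.0))
        if value != required:
            return None  # a premise fails, so the rule does not fire
        certainties.append(certainty)
    # The conclusion is only as certain as the weakest premise, scaled by the rule.
    return rule["then"], rule["cf"] * min(certainties)


findings = {
    "stain": ("gram-positive", 1.0),
    "morphology": ("coccus", 1.0),
    "conformation": ("clumps", 0.5),
}
print(apply_rule(RULE, findings))
```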
Evaluating Expert Systems
MYCIN was compared to the performance of actual physicians on its recommendation
Expert raters didn’t know that MYCIN was included in the comparison
MYCIN outperformed all of the physicians
including Medical School faculty!
Particularly good at prescribing exact dosages and dealing with drug interactions, which human doctors aren’t so good at
Problems with expert systems
Limited to verbalizable knowledge
If human experts rely upon intuition or other forms of implicit knowledge, then expert system will fail
Expert systems don’t have common sense knowledge
Learning
Most expert systems do not learn from their experience
e.g., in MYCIN, new diseases have to be coded in by hand
CYC: Formalizing Common Sense
The CYC (“Encyclopedia”) project began in 1984
Goal: Build a “universal” expert system that can understand natural language and detect violations of common sense as well as humans can
More than 3 million hand-coded rules
Trees usually grow outside
When people die, they stop buying things
Glasses of liquid should be carried right-side up
The problem of context (CYC)
The CYC project ran into huge problems because rules almost always depend upon their context
Vampires are not real, but in fictional settings they may be treated as real
This has increased the size of the knowledge base by a factor of 10
The reality of CYC has not lived up to the hype
> $50 million has been spent so far
Expertise and problem solving
What makes a person an expert in a particular domain (e.g., chess)?
Experts are not just smarter
You wouldn’t want to visit Einstein if you needed brain surgery
They have extensive knowledge in a particular domain
They know how to solve many problems in that domain
They can diagnose novel problems
Chess expertise and memory
Chess experts are better able to remember real chess positions
They are actually worse at remembering random positions
They chunk the entire position as one item
Experts see the world differently
They process the world in terms of their knowledge

Chase & Simon (1973)
Chase & Simon (1973)
Chess positions to analyze expertise in chess

Beginner: worst at meaningful
Class A: middle-ish
Master: best at meaningful but worst at random chess positions
Experts categorize differently than novices do (Chi et al., 1983)
Physics problem as example: block on incline

Novices categorize in terms of surface features of the problem
e.g., "block on an inclined plane," "...coefficient of friction"

Experts categorize in terms of underlying physical laws.

e.g., "Conservation of energy," "work-energy theorem, these are straightforward problems,"
Deep Blue
Deep Blue is an expert system for chess
Beat Garry Kasparov in 1997
It contains extensive knowledge of the game
from human chess experts
It can search ~200 million positions/second
Garry Kasparov can consciously evaluate ~3 positions/second
Problems with symbolic AI
Brittleness
The systems fail completely whenever the problem extends beyond their built-in knowledge
Example:
The robot ant won’t be able to climb a 3-inch telephone book if its program only includes instructions for climbing a 2-inch book

The frame problem
The frame problem in AI (Robot R1)
Imagine a robot R1 (Dennett, 1984)
its only job is to fend for itself
One day its designers arrange for it to learn that its precious spare battery is locked in a room with a time bomb set to go off soon.
R1 located the room, found a key to open the door, saw that the battery was on the wagon, and executed the operator PULLOUT(wagon, room)
Unfortunately the bomb was also on the wagon, and R1 didn’t realize that an implication of its action would be to move the bomb with the wagon
-context
the robot-deducer R1D1
The designers create a new robot that deduces not just the implication of its acts, but also of their side-effects
Put in the same situation, R1D1 also chose the PULLOUT(wagon,room) operator, and began to deduce its implications and side effects
It had just finished deducing that pulling the wagon out of the room would not change the color of the walls, and was embarking on a proof that pulling the wagon from the room would cause its wheels to turn more revolutions than there are wheels on the wagon - when the bomb exploded!
the robot-relevant-deducer R2D1
The designers set out to build a robot that knows the difference between relevant and irrelevant implications, and ignores irrelevant ones
When put to the same test, the robot simply sat outside the room
“Do something!” the designers yelled
“I am!” said the robot. “I’m busily ignoring the thousands of implications I’ve deemed irrelevant. Every time I discover a new irrelevant implication, I put it on the list to ignore”. Then the bomb exploded.
The frame problem
The frame problem refers to the fact that humans seem to just know which facts are relevant in any particular situation
Computers, on the other hand, must be programmed to know which of the infinite number of possible implications of any action are relevant
This has proven to be the most difficult problem for strong symbolic AI
Neural computation
How does a computer compute?
Complex operations built-in
Very fast and error-free
Fixed architecture
How does a neural network compute?
Lots of simple computational units
Each unit is slow and noisy
Highly connected units
Plastic connections
What is the simplest thing we could try to compute?
Simple Boolean logic

A  B  AND  OR  XOR
0  0  0    0   0
0  1  0    1   1
1  0  0    1   1
1  1  1    1   0
The perceptron
A simple neural network that can compute some Boolean logic functions

Input units:
A and B (with modifiers that can adjust strength of inputs separately)
Output Unit:
O
Decision:
1/0

OR and AND gates work (AND with each input weighted by 0.5), but XOR causes problems
Linearly separable OPERATIONS
AND and OR are linearly separable problems

AND (Y = output 1, N = output 0)
       A=0  A=1
B=1     N    Y
B=0     N    N

OR
       A=0  A=1
B=1     Y    Y
B=0     N    Y
XOR gate
XOR is not linearly separable

       A=0  A=1
B=1     Y    N
B=0     N    Y

So, it can’t be solved by a simple perceptron
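A small Python sketch of a two-input threshold unit makes the point concrete. The specific weights, the 0.75 threshold, and the search grid are assumptions chosen for illustration. The grid search is a demonstration, not a proof: it only shows that no setting on the grid computes XOR.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def perceptron(w_a, w_b, threshold):
    """Output 1 when the weighted sum of the inputs exceeds the threshold."""
    return lambda a, b: int(w_a * a + w_b * b > threshold)


def computes(unit, table):
    """Check a unit against every row of a truth table."""
    return all(unit(a, b) == table[(a, b)] for a, b in INPUTS)


or_unit = perceptron(1.0, 1.0, 0.75)
and_unit = perceptron(0.5, 0.5, 0.75)  # same unit with the inputs scaled by 0.5

# Try every weight/threshold combination from -2.0 to 2.0 in steps of 0.25.
grid = [x / 4 for x in range(-8, 9)]
xor_solutions = [p for p in product(grid, repeat=3) if computes(perceptron(*p), XOR)]

print(computes(or_unit, OR), computes(and_unit, AND), len(xor_solutions))
```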
How can we solve more complex problems? (XOR and above)
Minsky & Papert’s 1969 book “Perceptrons” highlighted the limitations of the perceptron
Had a chilling effect on neural network research until the 1980’s
In 1986, a set of books was published (“the PDP books”) that brought this approach back to life
Outlined an approach that solved many of the limitations of perceptrons
Components of a PDP model
AKA neural net, connectionist model
Set of processing units, each having:
a state of activation
an output function
Pattern of connectivity between units
and rule for propagation of activity
Activation rule
for combining the inputs to a unit to produce a new activation level
Learning rule
for changing the patterns of connectivity based on experience
Representation of the environment
coding of features of the environment
Processing units of a PDP model
Each unit (Input, Hidden, and Output) has a level of activation a_i

The output of the unit is determined by the output function:

o_i = f(a_i)

Usually identity function, threshold function, or stochastic function
Connectivity of a PDP model
Changing the patterns of connectivity based on experience (and the activation function).

Modifying operations of input passed along connectivity lines
Activation function
In PDP models...
Determines the level of activation in a unit based on the net input (which is determined by the input unit activity * weights)
The type of problem that can be learned depends upon the activation function
Needs to be nonlinear to learn interesting problems
threshold
sigmoid
Learning rules
Rules for altering the connectivity of the network depending upon experience
Hebb rule
increase connectivity when two units fire at the same time
Delta rule
Change weights in proportion to the amount of error between desired and actual output
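Both rules are short enough to write out. This is a sketch for a single linear unit; the learning rate and function names are assumptions chosen for illustration.

```python
def hebb_update(weight, pre, post, rate=0.1):
    """Hebb rule: strengthen the connection when both units are active together."""
    return weight + rate * pre * post


def delta_update(weights, inputs, target, rate=0.1):
    """Delta rule: change each weight in proportion to the output error."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    return [w + rate * error * x for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]
for _ in range(100):
    weights = delta_update(weights, [1, 1], target=1.0)
print(weights)
```

The Hebb rule needs no teacher, only co-activity, while the delta rule needs a desired output to compute the error.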
Beyond Perceptrons (PDP benefits)
The main aspect of PDP models that makes them more powerful than Perceptrons is that they have multiple layers
Allows solution of problems that are not linearly separable
The main advance of Rumelhart, McClelland, & colleagues was to (re)discover a method to train a multi-layer network
Known as “backpropagation of errors”
Solving XOR using a multi-layer network
[Image of proper network]

Backpropagation can look at the output, see if it matches what was desired, and then change the weights accordingly
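To show why a hidden layer is enough for XOR, here is a tiny two-layer network of threshold units. The weights are set by hand for illustration, not learned by backpropagation, and the unit names are hypothetical.

```python
def step(x, threshold):
    """Threshold unit: fire (1) when the input exceeds the threshold."""
    return int(x > threshold)


def xor_net(a, b):
    h_or = step(a + b, 0.5)    # hidden unit 1 fires if either input is on
    h_and = step(a + b, 1.5)   # hidden unit 2 fires only if both inputs are on
    # Output: excited by the OR unit, inhibited by the AND unit.
    return step(h_or - h_and, 0.5)


print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

Each hidden unit solves a linearly separable problem, and the output unit combines them, so no single unit has to separate XOR on its own.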
Forms of machine (statistical) learning
There are several different ways in which a neural network can learn
Unsupervised learning
Reinforcement learning
Supervised learning

The intersection of machine learning and computational neuroscience is one of the hottest areas in cognitive science right now
Unsupervised learning
Learning about the environment without being told what is going on
e.g., an infant learning to see
How can a system do this?
By being sensitive to the statistics of the input, such as correlations between different features
Hebbian learning
When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased. Donald Hebb (1949)
Hebbian learning
An example: Ocular dominance
Cells in the visual cortex respond preferentially to one eye
This develops with experience
Deprivation of input to one eye reduces its representation in the cortex

Plasticity of ocular dominance:
shows that deprivation can affect growth of brain regions typically dedicated to the deprived eye
A model of ocular dominance
Miller et al (1989)
Showed that Hebbian learning can result in ocular dominance
Relies on fact that nearby inputs from same eye are more correlated than inputs from different eyes

-Cells sensitive to how correlated their firing is
A neural mechanism for Hebbian learning
Long-term potentiation (LTP)
A persistent change in the firing of neurons due to previous activity
potentiation = increase in firing strength
The mechanisms by which LTP is induced and maintained are still a matter of controversy
But the basics of the mechanism are fairly well understood
Parallels between LTP and memory
LTP is prominent in hippocampus, which is important for memory
LTP develops rapidly (within 1 minute)
LTP is long-lasting (up to several weeks)
LTP is specific to active synapses
LTP is associative
detects simultaneous activity across neurons
The Morris water maze
The rodent is placed in a tub of cloudy water
The tub has a small platform that the rodent can stand on
Rodents don’t like being in the water!
LTP and the water maze
Blocking LTP (by blocking NMDA receptors) does not impair the ability to swim or find the platform
However, it does impair the rodent’s long-term memory for the location of the platform
Reinforcement learning
Learning how to act without being told what the right action is, but receiving reinforcement when the right action is taken
A problem: credit assignment
How do you know which of the 20 things you did in the last minute are responsible for the reinforcement?

Prediction of reward affects repetition of action
-reward higher than predicted=more responses

The basic paradigm of reinforcement learning is as follows: The learning agent observes an input state or input pattern, it produces an output signal (most commonly thought of as an "action" or "control signal"), and then it receives a scalar "reward" or "reinforcement" feedback signal from the environment indicating how good or bad its output was. The goal of learning is to generate the optimal actions leading to maximal reward. In many cases the reward is also delayed (i.e., is given at the end of a long sequence of inputs and outputs). In this case the learner has to solve what is known as the "temporal credit assignment" problem (i.e., it must figure out how to apportion credit and blame to each of the various inputs and outputs leading to the ultimate final reward signal).
The actor/critic model
Two components
Actor
Implements a policy for which actions should be chosen depending upon their potential value (‘advantage’)
Critic
Provides an error signal comparing the predicted outcome to the actual outcome and allowing the potential value of actions to be revised
Reinforcement diagram
(I)Evaluate actions (the ACTOR)
-Assess reward, delay, risk
--Striatum; Frontal and Parietal Cortex
---place expected value on available actions

(II)Choose an Action
-Biased toward richest options
--Same areas as (I) or downstream

(III)Learn from Experience
-Compare predicted and actual reward
--Dopaminergic error signal (the CRITIC--leads to plasticity)
Dopamine neurons signal reward prediction error
Schultz, 1988

[graphs that show effect in relation to prediction errors: no prediction and reward (Dopa at reward), prediction and reward (Dopa at CS), and prediction and no reward (DIP in Dopa at approximate time of reward)]
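The error signal at the time of reward can be sketched with a simple prediction-update loop. This is a simplified illustration with an assumed learning rate and hypothetical function names. It models only the response at reward time, not the shift of the dopamine response to the CS.

```python
def prediction_error(predicted, actual):
    """Positive when the outcome is better than predicted, negative when worse."""
    return actual - predicted


def learn(value, reward, rate=0.2, trials=50):
    """Nudge the predicted value toward the reward on every trial."""
    for _ in range(trials):
        value += rate * prediction_error(value, reward)
    return value


unpredicted = prediction_error(0.0, 1.0)     # no prediction, reward arrives
learned_value = learn(0.0, 1.0)              # prediction after repeated pairing
predicted = prediction_error(learned_value, 1.0)   # predicted reward arrives
omitted = prediction_error(learned_value, 0.0)     # predicted reward withheld
print(unpredicted, predicted, omitted)
```

The three values correspond to the three cases in the graphs: a burst for an unpredicted reward, little or no response to a fully predicted reward, and a dip when a predicted reward is omitted.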
An example: TD-Gammon
A neural network model that learned to play backgammon using a reinforcement learning algorithm (the temporal difference, or TD, algorithm)
Was able to learn to play world-class backgammon, and actually changed the way that top players played the game

Points lost per game shrank with each revision as the number of training games increased from 300,000 to 800,000 to 1,500,000

TD-Gammon is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.
Supervised learning
Learning by being taught what to do
Requires a “teacher”
Supervised learning
An example: NetTalk
Learned how to read
-backpropagation correction
-told right answers

TRAINING TEXT (from Carterette & Jones, first-grade conversation):
You mean uh um like England or something.
When we walk home from school I walk home with two friends and
Because um one girl where every time she wants to runs she gets the
And then she cant breathe very well and she gets sick.
Thats why we cant run.
I like to go to my grandmothers house.
Well because she gives us candy.
Well um we eat there sometimes.
Sometimes we sleep over night there.
Sometime when I go to go to my cousins I get to play soft ball or
Thing I hate to play is doctor.
Oh.
I hate to play doctor or house or that.
Dont like it or stuff.
Weve been learning a lot of Spanish words.
Our teacher speaks Spanish sometimes.
So does my father.
Well my father doesnt know very much Spanish but he doesnt know what
In Spanish.
The “rules” controversy
The classical view in cognitive science is that knowledge is stored in an explicit symbol system
Connectionist models do not explicitly store symbols, but instead store knowledge in connection weights
Radical connectionism
The classical symbolic view of cognition is wrong
It can’t explain:
Graceful degradation with damage
Spontaneous generalization of knowledge
Context-sensitivity of knowledge
Others take a weaker view
Neural nets may implement a symbolic system

Final word
The neural computing approach has been much more productive than the symbolic approach over the last 20 years
There are many models that show how symbolic processing can arise from neural networks
Consciousness
Consciousness is probably the most baffling problem in cognitive science
Consciousness can mean many things
many of these meanings are things that we have already discussed
conscious sensation
attention (direction of)
the ability to describe mental states using language
The “hard problem” of consciousness relates to subjective experience
How do things seem to you?
-subjective experience
This is also known as the problem of qualia
-qualities of conscious perceptions
Scientific approaches to consciousness
Scientific interest has mostly centered on asking what kinds of cognitive processes require conscious awareness
These studies avoid the hard problem
Conscious awareness is always defined in terms of specific tests for awareness
The implicit/explicit distinction
Most research regarding the role of consciousness has centered around the implicit/explicit distinction
Explicit phenomena are those that involve conscious awareness and memory of the past (declarative memory)
Your perception of my voice
Your memory for breakfast
Implicit phenomena do not involve conscious awareness or memory
Your memory for how to ride a bicycle
Implicit memory
Think about your ability to drive a car
When you are driving, is it necessary to think back to all of the times when you were learning to drive?
Implicit memory is the memory that underlies skills and other effects of experience that don’t require conscious memory for the past
Multiple memory systems
Research with amnesic patients has shown that they can learn some kinds of information normally
Even though they can’t consciously remember the past!
This has led to the suggestion that there are multiple memory systems in the brain
Separate, independent structures that support separate kinds of memory
Damaging one system does not necessarily impact the other system
Memory systems
The most common distinction is between declarative and nondeclarative memory systems
Declarative memory:
supports conscious memory for facts and events
“knowing that”
relies upon the hippocampus
Nondeclarative memory
supports effects of experience without conscious memory
skill learning, repetition priming
“knowing how”
relies upon a number of brain structures
Skill learning
Patients with amnesia can exhibit normal learning of many different types of skills
Even though they don’t remember having learned them!
Mirror-reading
Read the following words aloud as quickly as possible from right to left:

ambitious bedraggle plaintiff

The amnesics were able to learn the mirror-reading skill just as well as the normal controls.
However, when later tested on their memory for the words, the amnesics were much worse.
Repetition priming
Experience with a stimulus leads to enhanced processing of that stimulus later
A commonly studied form of implicit memory
Priming in amnesia
Amnesic patients show normal repetition priming effects
At the same time, they are impaired on tests of recognition memory
Graf et al. (1984)
Amnesic patients and controls studied lists of words
MOTEL, WINDOW
After each list, they were given word stems (MOT__, HOU__) and asked to do one of two tasks:
Cued recall:
Complete the stem with a word from the study list
This is a “direct” test of memory
Word stem completion:
Complete the stem with the first word that comes to mind
This is an “indirect” test of memory
Measuring priming (Graf et al., 1984)
Repetition priming is measured by:
The increase in the likelihood of completing the word stem with a studied word, compared to the case when that word was not studied
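The measure above can be sketched as a difference between two completion rates. The function name and the counts are hypothetical and are not data from Graf et al. (1984).

```python
def priming_score(studied_hits, studied_total, baseline_hits, baseline_total):
    """Completion rate for studied words minus the rate for unstudied (baseline) words."""
    studied_rate = studied_hits / studied_total
    baseline_rate = baseline_hits / baseline_total
    return studied_rate - baseline_rate


# Made-up counts: 10 of 20 stems completed with studied words,
# versus 5 of 20 completed with the same words when they were not studied.
score = priming_score(10, 20, 5, 20)
```

A score near zero would mean that studying the word did not change how often the stem was completed with it.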
Priming in amnesia (Graf et al., 1984)
Amnesics were badly impaired on free recall and cued recall
They were actually somewhat better on stem completion
Double dissociation of memory systems
Do recognition memory and repetition priming rely upon separate brain systems?
Gabrieli and colleagues demonstrated a double dissociation
Patients with amnesia demonstrate impaired recognition memory but normal repetition priming
Patients with lesions to the occipital lobe show impaired repetition priming but normal recognition memory
Subliminal perception?
Implicit memory shows that people can be affected by stimuli that they no longer consciously remember
Can people be affected by stimuli that they aren’t even aware of when they occur?
This is known as subliminal perception
Source of great controversy over the last 30 years
Subliminal priming
Semantic priming:
The subject performs a lexical decision (word/nonword) task
On some trials, a related stimulus (“prime”) precedes the stimulus in the task
Example: bread -> butter
The subject is faster to say that butter is a word than if they had seen an unrelated word or a letter string “XXXXX”
In the 1980s, Tony Marcel claimed that semantic priming occurred even if the subject wasn’t aware of the prime
The prime is presented for a very short period (less than 50 milliseconds) and is followed by a masking stimulus
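One way to quantify the semantic priming described above is the difference in mean lexical decision time between unrelated-prime and related-prime trials. This is a sketch under that assumption; the function name and the reaction times are invented for illustration.

```python
from statistics import mean


def semantic_priming_effect(related_rts_ms, unrelated_rts_ms):
    """Mean RT after unrelated primes minus mean RT after related primes (ms)."""
    return mean(unrelated_rts_ms) - mean(related_rts_ms)


# Hypothetical reaction times in milliseconds for "butter is a word" decisions.
related = [520, 540, 500, 560]      # preceded by "bread"
unrelated = [580, 600, 570, 610]    # preceded by an unrelated word or "XXXXX"
effect = semantic_priming_effect(related, unrelated)
```

A positive value means the subject responded faster after a related prime. The subliminal question is whether this difference survives when the prime is masked.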
Does subliminal priming occur?
There is substantial evidence that priming can occur even when subjects cannot discriminate the identity of the prime
The subject can still detect whether the prime was present
There is little evidence for priming when the subject cannot detect whether the prime was present or not
Can subliminal ads sell products?
James Vicary claimed in 1957 that the messages “Eat Popcorn” and “Drink Coca-Cola” were shown subliminally during the movie “Picnic”
He claimed that these messages led to a 57.7% increase in popcorn sales and an 18.1% increase in Coke sales
Vicary later admitted that these results had been fabricated
There is no empirical evidence that subliminal messages can influence behavior in this way
Subliminal messages in music
In the 1980s and 1990s, there was great controversy over whether rock music contained backwards messages
Judas Priest was sued (unsuccessfully) by the parents of two teenagers in Reno who attempted suicide
The parents claimed that backwards messages in the music had driven the teenagers to it
This debate confounded two issues:
Are the messages put there intentionally?
Do backwards messages have any effect?
Effects of backwards speech (Vokey & Read, 1985)
Vokey & Read (1985) examined whether people could perceive the contents of backwards speech
Subjects were able to determine:
gender (98.9%)
same versus different individuals (78.5%)
language (English/French/German) (46.7% vs. 33% chance)
Vokey & Read (1985)
(What couldn't be done)
Subjects were unable to accurately judge:
how many words were spoken
whether it was a question or a statement
whether it was a sensible sentence or not
The backwards speech was equally ineffective at causing implicit memory effects
Unconscious perception
The upshot from research on unconscious perception in normal subjects is:
Stimuli that are detectable but not discriminable can cause effects
These effects are small and limited in time
They are probably not sufficient to cause people to buy particular products or commit suicide
Stimuli that are not detectable do not cause effects
Unconscious perception in prosopagnosia
Prosopagnosia results in the inability to recognize faces
Patients cannot consciously distinguish between familiar and unfamiliar faces
However, prosopagnosics show differential galvanic skin response (GSR) to familiar and unfamiliar faces
This shows that their autonomic nervous system registers familiarity even when conscious awareness does not
Capgras syndrome
Patients with Capgras syndrome think that significant others have been replaced by imposters, aliens, or robots
These patients do not show differential GSR between familiar and unfamiliar faces
The flip side of prosopagnosia
-Conscious recognition is intact, but the "feeling" of familiarity (the implicit recognition) is missing
Blindsight
People with lesions to the visual cortex are unable to consciously see part of their visual field
Weiskrantz et al. (1974) showed that these people can exhibit some aspects of vision
They can accurately point at the location of a flash of light
Even though they claim not to have seen the light!
Blindsight does not occur following lesions earlier in the visual system (e.g., optic nerve, retina)
Probably relies upon visual pathways other than the visual cortical pathway, such as the superior colliculus