31 Cards in this Set

  • Front
  • Back

Stimulus-stimulus learning vs procedural learning

Stimulus-stimulus:

- learning associations between different things
- very quick
- pretty much all explicit memory works this way
- not how the brain works for movement; if it were, we could watch someone do something and immediately know how to do it ourselves
- our brain doesn't know exactly which neural activity generated a given movement

Procedural:

- the brain works more by trial and error
- the brain envisions the movement it wants, then reinforces the random neural activity that got it closer to that goal
- patterns of activity that generated successful movements become more likely to be activated in the future


In basic terms, how does a reinforcement learning signal increase the likelihood that the brain will regenerate the same neural activity that created successful actions?

Using a prediction error signal, which comes from fluctuations in the reinforcement (dopamine) signal:

- the reinforcement signal must alert the brain when the current reward, or the expectation of reward, is better or worse than anticipated.

Prediction error signal = actual value − expected value of the current situation
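The formula above can be sketched as a simple update loop (my own illustration, not from the notes; the learning rate and reward values are assumptions):

```python
# Prediction error drives learning: delta = actual - expected, and the
# error nudges the expectation toward the actual outcome.

def update_value(expected, actual, learning_rate=0.2):
    """Return (prediction_error, new_expected_value)."""
    delta = actual - expected                    # prediction error
    return delta, expected + learning_rate * delta

value = 0.0
for trial in range(50):                          # reward of 1.0 on every trial
    delta, value = update_value(value, 1.0)

# As expectations grow, the error (a stand-in for the phasic dopamine
# response) shrinks toward zero.
print(round(value, 3), round(delta, 3))          # → 1.0 0.0
```

This is the same logic the later cards describe: once the expectation matches the reward, the error (and the dopamine burst) disappears.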

How does the prediction error signal work, and what does it do?

- broadcasts throughout the brain
- assigns credit/blame to neural patterns
- this feedback is used to adjust your cost-benefit analysis / your probability of repeating that behaviour in the future

(we don't know exactly how the brain does this yet)


Where in the brain do actions get reinforced?

dopamine signals --> reinforce glutamate signals --> info encoded in glutamatergic inputs from cortex --> all motor commands sent to the striatum (the input of the basal ganglia, which also contains the NAc)

The midbrain projects widely throughout the brain with few dopamine cells.

Describe the dopamine neuron system (4)

- there are few dopamine cells, but each sends out tons of projections
- they are generally homogeneous and fire at the same time
- unmyelinated, not fast (0–4 Hz), not a temporally precise signal
- dopamine is cleared from extracellular space 100 times slower than glutamate

Describe how dopamine may work as the prediction error signal (3)

- dopamine release occurs every time your estimation of the current moment is better than you anticipated it to be
- dopamine release is withheld every time your estimation of the current moment is worse than you anticipated it to be
- as expectations grow, the dopamine system becomes more and more selective about when it fires in response to the movements you generate (there should always be a brief increase in dopamine release when you perceive your actions to be more effective than you had anticipated)

Explain the Three-factor Rule

Abrupt changes in dopamine levels can strengthen or weaken glutamate synapses in the striatum.

The strength of a glutamate input to the striatum can change based on:

1) presynaptic activity
2) postsynaptic activity (striatal cells fire and the resulting commands are successful)
3) abrupt changes in local dopamine receptor activity

So, if you fire together (pre- and postsynaptic activity), you become eligible for the 3rd factor: a change in dopamine receptor activity.
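The three factors can be sketched as a toy weight update (my own illustration; the function name, learning rate, and binary activity values are assumptions, not from the notes):

```python
# Three-factor rule sketch: a cortico-striatal weight becomes "eligible"
# only when pre- and postsynaptic activity coincide (factors 1 & 2), and
# only then does an abrupt dopamine change actually modify it (factor 3).

def three_factor_update(w, pre, post, dopamine_change, lr=0.1):
    eligibility = pre * post                 # coincident firing -> eligible
    return w + lr * eligibility * dopamine_change

w = 0.5
w = three_factor_update(w, pre=1, post=1, dopamine_change=+1.0)  # strengthened
w = three_factor_update(w, pre=1, post=0, dopamine_change=+1.0)  # not eligible, no change
print(round(w, 2))                           # → 0.6
```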

Raster plot

each row shows one 2-sec sweep of time along the x-axis, and each dot is one AP of a neuron

Describe dopamine firing for the reward and cue when: a) reward is completely unexpected b) 25%, 50, 75% chance of it c) guaranteed reward d) expected reward is not given

Unexpected: large burst upon reward

25% expected: smaller burst upon reward, but still big

50%: less; 75%: even less, with a little burst during the cue

Guaranteed: no burst during the reward, but a burst at the cue preceding it

Expected but none given: a dip in firing at the time the reward should have arrived
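The pattern on this card falls out of the prediction-error formula. A minimal sketch (my own illustration, assuming reward size 1 and expected value = reward probability p):

```python
# At the cue, value jumps from 0 to p (burst grows with expectation).
# At reward time: delivered reward gives 1 - p (burst shrinks as the
# reward becomes expected); omitted reward gives 0 - p (a dip).

def prediction_errors(p):
    cue = p - 0.0
    rewarded = 1.0 - p
    omitted = 0.0 - p
    return cue, rewarded, omitted

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, prediction_errors(p))
```

At p = 1 (guaranteed), the whole burst has moved to the cue, matching the "Guaranteed" case above.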

Describe Blocking

When a reward-predictive stimulus is paired in conjunction with another stimulus that has already been learned to predict reward, learning about the second stimulus is blocked. This is because reward is already expected with the bell, so there is no dopamine firing to the light + bell combo.

Example: present bell with food. Then present light and bell together with food. Test the light alone: it has not been learned to predict reward.
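Blocking falls out of the Rescorla-Wagner model, where all cues present on a trial share one prediction error. A sketch (my own illustration; trial counts and learning rate are assumptions):

```python
# After the bell alone fully predicts food, the bell+light compound
# produces ~zero prediction error, so the light learns ~nothing.

def rw_trial(weights, cues, reward, lr=0.3):
    prediction = sum(weights[c] for c in cues)
    delta = reward - prediction              # shared prediction error
    for c in cues:
        weights[c] += lr * delta
    return weights

w = {"bell": 0.0, "light": 0.0}
for _ in range(40):                          # phase 1: bell -> food
    rw_trial(w, ["bell"], 1.0)
for _ in range(40):                          # phase 2: bell + light -> food
    rw_trial(w, ["bell", "light"], 1.0)

print(round(w["bell"], 2), round(w["light"], 2))   # → 1.0 0.0
```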

Explain this

Learning of the first compound stimulus is blocked, but the other is not, even though both are paired with reward the same number of times. For the first one, dopamine release just after the stimulus carries info about the cues that preceded it, and since one cue was already expected to predict reward, the cue it was paired with was blocked.

For the second one, the non-novel stimulus was trained to not be rewarding, so there is no firing at the cue but rather at the unexpected reward, resulting in learning of the compound.

Describe second-order conditioning

- after a well-learned task, dopamine neurons fire to the cue and not the reward. Second-order conditioning is when a new cue precedes cue one.

- cue one has been learned to predict reward, so it triggers dopamine release. This dopamine response reinforces cue two, which preceded cue one: your behavioural expectations move back in time.
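This "moving back in time" is exactly what temporal-difference (TD) learning does. A sketch (my own illustration; state names, episode count, and learning rate are assumptions):

```python
# TD prediction error: delta_t = r_t + V(next state) - V(current state).
# Value (and the dopamine response) propagates backward from the reward
# to cue one, then from cue one to the cue that precedes it.

def td_episode(V, states, reward, lr=0.3):
    """One run through a sequence of states ending in the rewarded state."""
    for t, s in enumerate(states):
        r = reward if t == len(states) - 1 else 0.0
        v_next = V[states[t + 1]] if t + 1 < len(states) else 0.0
        V[s] += lr * (r + v_next - V[s])     # TD prediction error update
    return V

V = {"cue2": 0.0, "cue1": 0.0}
for _ in range(100):                         # cue2 -> cue1 -> reward
    td_episode(V, ["cue2", "cue1"], 1.0)

print(round(V["cue1"], 2), round(V["cue2"], 2))   # both approach 1
```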

Define Pleasure; What are two main functions dopamine has? is it multiplex signalling?

pleasure = what you feel when you consume rewards; situation-dependent

Dopamine in the striatum regulates motivational state via tonic activity, and reinforces our actions and decisions via a feedback signal encoded in phasic activity.

Multiplex means that one signal can carry two types of info, and the reader of the signal needs to be able to tell the difference. Dopamine kind of multiplexes reinforcement and motivation, but not exactly, since motivation and reinforcement interact with each other.


What is tonic activity and how do we know about it?

- because dopamine is cleared from extracellular space slowly, there is always some chillin' in there
- this amount can rise and fall
- changes in the speed of this activity, the number of dopamine neurons firing, and the amount of phasic activity can cause large changes in the resting tonic amount of dopamine

Why do we care about tonic dopamine signalling?

- tonic concentrations are thought to influence "gain" settings (how excitable the motor neurons are)
- high tonic levels in the striatum lead to a high gain setting, and low levels to a low one, making it easier or harder for messages to get through
- this gain setting corresponds to motivational state and willingness to exert effort
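The gain idea can be sketched as a multiplicative scaling of input drive (my own illustration; the function, numbers, and threshold are assumptions, not from the notes):

```python
# Tonic dopamine as a gain setting: the same cortical message either does
# or doesn't get through, depending on the tonic level scaling it.

def message_gets_through(cortical_input, tonic_dopamine, threshold=0.5):
    drive = tonic_dopamine * cortical_input   # gain scales the input
    return drive > threshold

print(message_gets_through(0.4, tonic_dopamine=2.0))   # high gain → True
print(message_gets_through(0.4, tonic_dopamine=1.0))   # low gain → False
```

Same input, different outcome: that is why high tonic levels look like motivation and low ones look like apathy.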

what happens if you increase/ decrease tonic dopamine signalling?

increase:

- move more, hyperactive
- engaged in the environment
- motivated
- take risks and do hard things

decrease:

- lazy
- not interested in rewards
- engage less
- ignore effortful routes to reward, but may still press an easy lever
- loss of all dopamine receptor activity = Parkinson's


Both instrumental and pavlovian learning probably depend on...

Both types of learning are dependent on the dopamine reinforcement signal.

Where you choose to attend is probably also sensitive to reinforcement.

Goal-tracking vs sign-tracking; dopamine receptor blocking in the ventral striatum (NAc) abolishes...

there is often variability in CR behaviour

goal-tracking = fixating on the goal, i.e. running over to the food bowl upon seeing the cue

sign-tracking = fixating on reward-predictive cues, i.e. running over to the lever upon seeing the cue

* both are conditioned/learned approach responses

Blocking dopamine receptors in the NAc abolishes sign-tracking but not goal-tracking.

describe the dopamine deficient mouse (DNA, behaviour, l-dopa)

- a DNA stop signal, flanked by loxP sites, was added to the gene that encodes tyrosine hydroxylase (an enzyme needed to make dopamine)
- these mice thus have no dopamine at all
- DD mice are slow and have no motivation
- they eventually starve to death beside food
- but they can swim!! aimlessly, not to an exit, but still
- you can keep them alive by injecting L-dopa, after which they eat a day's worth of food

L-dopa, tyrosine hydroxylase, dopamine, relationship

Tyrosine hydroxylase makes L-dopa from tyrosine, and L-dopa is then turned into dopamine by another enzyme.

Describe the Dopamine KO mouse

- poorly named; they actually retain 5% of the dopamine in the striatum
- apparently the dopamine transporter protein (DaT) is not found in 5% of dopamine neurons
- so they made a cre-dependent tyrosine hydroxylase knockout mouse strain, with cre expressed only in neurons that make the transporter
- so now only the neurons with cre (those that make DaT) stop producing dopamine, leaving the ~5% without DaT still making it

Explain the KO dopamine mice and Pavlovian Learning

- mice were trained that lever presentation means food is coming in the hole, and they have to stick their head in
- 1) KO mice do not learn to associate the cue with the food, although they may go over to the bowl: no sign-tracking
- 2) is it because of a lack of motivation? giving L-dopa after training: they still can't do the association
- 3) if you give L-dopa before training they will do it, but once you stop, the behaviour extinguishes (maybe the low dopamine levels even act as negative reinforcement compared to before)

5% dopamine is enough...

for them to perform actions, but not to learn associations and maintain them

How did they restore dopamine synthesis to select populations of neurons in DD mice? What did they find?

- added cre to two types of dopamine neurons, those that project to either the dorsal or the ventral striatum, to restore dopamine synthesis there
- this only restores dopamine to about 40% in the intended area
- it also raises the opposing side of the striatum above 5%, because a portion of dorsal-projecting neurons also project to ventral and vice versa (although we know this isn't enough for learning)
- found: restoring dopamine synthesis to ventral, but NOT dorsal, striatum-projecting neurons restores learning and performance of cue associations




What is needed for reinforcement learning? (3)

Contiguity: close temporal proximity of stimuli (how soon depends on the part of the brain)

Contingency: the probability of one event/stimulus has to depend on the presence of another, i.e. correlation

Novel predictiveness: the stimulus association has to add new info that can lead to improved predictions about the world
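Contingency, in particular, can be made concrete as a difference of conditional probabilities (my own illustration; the function name and example numbers are assumptions):

```python
# A cue is only informative if reward is more likely with it than without
# it. Mere pairing (same probability either way) gives zero contingency.

def contingency(p_reward_given_cue, p_reward_given_no_cue):
    return p_reward_given_cue - p_reward_given_no_cue

print(round(contingency(0.9, 0.1), 2))   # strong positive contingency
print(round(contingency(0.5, 0.5), 2))   # zero: pairing without predictiveness
```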

The authors wanted to see if temporally-specific activation of dopamine neurons (as opposed to drugs) act as a reward prediction error and promote learning. Describe the experiment set-up and result.

- used dopamine-cre rats with a cre-dependent ChR2 virus
- 30-sec tone --> stick head into food port --> food --> 5-sec delay, then more food if they re-enter or stay in
- then add a light: a compound stimulus with the tone
- as they put their head in, one group has dopamine neurons optogenetically activated, making the juice seem better than expected
- for this group, the stimulation prevented blocking of the light, and they learned the association

describe the dopamine-cre rats with cre-dependent ChR2 virus

- added cre DNA to the gene that produces dopamine, so that neurons that make dopamine will also make cre
- then added cre-dependent viral DNA encoding ChR2 behind a stop signal flanked by loxP sites, so that in dopamine neurons cre removes the stop signal and those neurons make ChR2
- the authors targeted the midbrain

Does dopamine neuron stimulation prevent downshifts in behavioural responding? Describe experiment set up and result.

- used dopamine-cre rats with a cre-dependent ChR2 virus
- train the animal that a tone predicts juice
- after it is well learned, the tone now predicts water = a downshift
- one group gets no stimulation: their behaviour extinguishes quickly
- the other group gets dopamine stimulation at the bowl: their behaviour goes down in terms of how long they leave their head in, but they still run over to the port, since the dopamine stimulation = reward

Dopamine neuron intracranial self-stimulation and extinction of behaviour... what might we conclude from this?

- used dopamine-cre rats with a cre-dependent ChR2 virus
- if you train them to seek dopamine stimulation at a food port when a tone sounds, and then take away the stimulation, the behaviour extinguishes much faster than for, say, food-reward seeking... why?
- maybe there is no underlying drive, as with hunger (but animals seek the stimulation as avidly as food or heroin)
- maybe dopamine self-stimulation isn't associated with pleasure, so there is no memory of a pleasurable feeling to motivate behaviour: they can't recall why they liked to press the lever

= maybe reinforcement and pleasure are separate in the brain!

Can dopamine neuron stimulation cause you to prefer one reward over another in a different situation? describe study and result.

- dopamine-cre rats with a cre-dependent ChR2 virus, plus normal controls
- both groups given strawberry juice one day and grape juice another, in the lab
- the cre group is stimulated every time they get strawberry juice, but not grape
- then take them out of that context, into the home cage, and give them both juices
- they do not prefer one over the other!
- dopamine reinforcement of active neural circuits promotes that activity in a similar situation/setting in the future
- we have to teach the mice they can generalize it by doing it in a bunch of different settings

Skinner's radical behaviourism theory and its problem

- behaviour is the consequence of environmental histories and reinforcement
- but what about thinking and introspection?
- if you give a rat juice and saline he'll choose the juice and avoid the saline, but if you deprive him of salt he'll run over to the saline even though it was never reinforced