157 Cards in this Set

Distinguish between nominal, ordinal, interval, and ratio scales
Nominal - numbers are assigned to events as labels (ex. coding gender as a 'dummy variable' in stat analysis by assigning 0 to males and 1 to females).
Ordinal - numbers are assigned to recognize rank order of products with regard to some sensory property, attitude, or opinion (ex. increasing numbers represent increasing amounts/intensities of sensory experience).
Interval - zero level is arbitrary; subjective spacing of responses is equal, so assigned numbers can represent actual degrees of difference, allowing for comparisons (ex. the Celsius and Fahrenheit temperature scales).
Ratio - zero level is not arbitrary; numbers reflect relative proportions (ex. mass, length, and temperature on the Kelvin scale).
Describe an appropriate method of analyzing data from a nominal scale.
Make frequency counts and report the mode; the mode is used as the summary statistic for nominal data. Different frequencies of response for different products or circumstances can be compared by chi-square analysis or other nonparametric statistical methods.
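The chi-square comparison of frequency counts can be sketched by hand with the standard library. The counts below are invented for illustration: suppose 60 of 100 consumers chose product A and 40 chose product B, and we test against an even split.

```python
from statistics import NormalDist

observed = [60, 40]   # hypothetical frequency counts per category
expected = [50, 50]   # counts expected under "no difference"

# Chi-square statistic: sum of (O - E)^2 / E over categories
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With one degree of freedom, chi-square is the square of a standard
# normal deviate, so a two-sided p-value follows from the normal CDF.
p_value = 2 * (1 - NormalDist().cdf(chi_sq ** 0.5))

print(round(chi_sq, 2))   # 4.0
print(round(p_value, 4))  # about 0.0455, significant at the 0.05 level
```

In practice a statistics package would be used, but the arithmetic is exactly this: large departures of observed from expected counts inflate the statistic.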
What is the reasoning behind not using parametric statistical methods (e.g., means, standard deviations) for ordinal data?
Ordinal data is categorical data with a logical ordering to the categories; for example, rating items on a scale from strongly agree = 1 to strongly disagree = 5 yields ordinal data. The numbers tell nothing about the relative differences among the products, so mathematical operations that involve addition are not appropriate. A more reasonable analysis is simply to count the number of respondents in each category and to compare frequencies.
In what situations can ordinal data be usefully analyzed by parametric statistical tests?
In situations where the subjective spacing of responses is equal, so the numbers assigned can represent actual degrees of difference, allowing for comparison between the degrees of difference. Examples include scales of temperature, the 9-point hedonic scale, or in the case of a race, knowing not only the order that competitors finished in, but how many lengths separated each competitor.
Compare and contrast line scales and category scales:
Line scales are a class of scaling techniques in which a person responds by choosing a position on a line representing different perceived intensities of a specified attribute or different degrees of liking for a product.
Category scales are any of a class of scaling procedures in which the judge is presented with discrete and limited alternatives as choices for rating the perceived intensity of a specified attribute or degree of liking of a product.
-line scaling: the panelist can mark a line anywhere on the scale; category scaling: the panelist can only make a mark within a fixed category.
-line scaling: in most cases only the end points are labeled and anchors may be used to help avoid end effects; category scaling: anchors are not used, ends are labeled
-line scaling and category scaling: both have variations in which central reference point representing the value of a standard or baseline is used and test products are scaled relative to that point.
-category scaling: too many points can lead to random error variation, while too few points are often a detriment to highly skilled panelists capable of distinguishing many levels of stimuli; neither is a problem with line scaling
-line marking and category scales are approximately equivalent in sensitivity to product differences.
What are the advantages to having greater numbers of categories or points on a category scale?
This is advantageous for highly trained panelists capable of distinguishing many levels of stimuli. Also, rather than just using endpoints, including intermediate points in the middle range of the scale where many products are found is helpful in allowing panelists to more accurately categorize sensory perceptions.
What are the advantages and disadvantages of assigning physical examples or standards to specific points on a category scale?
Advantage:
-allows for achievement of a certain level of calibration, a desirable feature for trained descriptive panelists.
Disadvantage:
-restriction of the subject's use of the scale; what appears to be equal spacing to the experimenter may not appear so to the participant.
What is a modulus in magnitude estimation?
Magnitude estimation is a scaling procedure in which numbers are generated freely by a panelist to represent the ratios of perceived intensities of products or stimuli. A modulus is a stimulus given to the subject as a kind of reference or anchor assigned a fixed value for the numerical response; all subsequent stimuli are rated relative to this standard.
If magnitude estimation is done without a standard and/or modulus the data from each judge must be rescaled. Why?
If no standard/modulus is given, the participant is free to choose any number he or she wishes for the first sample and then all subsequent samples are rated relative to this first intensity. The numbers must be rescaled so that all participants are working within the same range before statistical analysis can be performed; this prevents subjects who choose very large numbers from having undue influence on measures of central tendency in statistical tests ('normalizing').
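The rescaling ('normalizing') step described above can be sketched as follows. The ratings are invented, and dividing each judge's ratings by that judge's geometric mean is one common convention (not the only one):

```python
import math

# Two hypothetical judges rating the same four samples: judge A uses
# big numbers, judge B uses small ones, but their ratios agree.
judge_a = [10, 20, 40, 80]
judge_b = [1, 2, 4, 8]

def rescale(ratings, target=10.0):
    """Divide by the judge's geometric mean, then multiply by a common
    target value so all judges end up working in the same range."""
    geo_mean = math.exp(sum(math.log(r) for r in ratings) / len(ratings))
    return [r / geo_mean * target for r in ratings]

print([round(r, 2) for r in rescale(judge_a)])
print([round(r, 2) for r in rescale(judge_b)])
# After rescaling, the two judges produce identical values, so neither
# one's choice of raw numbers dominates measures of central tendency.
```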
Why is it often 'better' to work in logs when analyzing magnitude estimation data?
The data are converted to logs because magnitude estimation data tend to be log-normally distributed, or at least positively skewed; ie. there are a few large outlying values while the bulk of the ratings lie in the lower range of numbers. Taking logs makes the distribution more symmetric and better suited to parametric analysis.
What are some solutions to the 'problem' of zeros in magnitude estimation data?
Assign a small positive value to any zeros in the data, perhaps one-half of the smallest rating given by that subject, OR use the arithmetic mean or median of the judgments in constructing the normalization factor.
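The first zero-handling fix can be sketched in a couple of lines; the ratings are invented. The point of the replacement is that a geometric-mean normalization cannot take the log of 0:

```python
# One judge's hypothetical magnitude estimates, including a zero.
ratings = [0, 2, 4, 8, 16]

# Replace zeros with half of the smallest nonzero rating this judge gave.
smallest_nonzero = min(r for r in ratings if r > 0)
cleaned = [r if r > 0 else smallest_nonzero / 2 for r in ratings]

print(cleaned)  # the zero becomes 1.0 here, since the smallest rating is 2
```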
How does the category-ratio scale compare to Magnitude Estimation and to category or line scaling?
The category-ratio scale is a hybrid method; the response is a vertical line-marking task but verbal anchors are spaced according to calibration using ratio-scaling instructions. Data from this scale resemble those from magnitude estimation.
Describe the basis for a Thurstonian scale as contrasted to a category scale.
A Thurstonian scale is a variability-based scale built from comparative judgments and their extension to determining distances between category boundaries; the proportion correct in a choice experiment (ex. triangle test) is converted into Z-scores.
A category scale is one in which a judge is presented with limited and discrete alternatives as choices for rating perceived intensity, a specified attribute, or liking of a product.
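The proportion-to-Z-score conversion can be sketched for the simplest case, a paired comparison under Thurstone's Case V assumptions (the triangle-test conversion requires special tables, so it is not shown). The 75% figure is invented:

```python
from statistics import NormalDist

# Hypothetical result: product A is chosen over product B by 75% of judges.
p_choose_a = 0.75

# Under Case V, the Z-score of the choice proportion is taken as the
# scale distance between the two products, in standard-deviation units.
scale_distance = NormalDist().inv_cdf(p_choose_a)

print(round(scale_distance, 3))  # about 0.674 SD units apart
```

Note how the method only discriminates when products are somewhat confusable: proportions near 1.0 push the Z-score toward infinity, which is one of the practical problems listed below.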
What are the problems with using 'indirect scaling' to determine Thurstonian scale values for the sensory evaluation of foods?
-Many subjects must be tested to get a good estimate of the proportion of choices.
-The method only works when there are small differences between the products in terms of the attribute in question, and some confusability of the items.
What makes a 'good' scale?
-operates on the basis of a linear response output function
-must effectively match responses to sensations
-should be useful at identifying differences between products
-should show low interindividual variability for increased sensitivity, more significant differences, and lower risk of Type II error
-user friendly and easy to understand
-applicable to a wide range of products and questions.
Why might people avoid using the ends of the scale?
-In order to reserve the extreme responses just in case a more extreme sample appears later in the evaluation session.
-A tendency to gravitate toward the middle of the scale so as not to appear 'wrong' in one's evaluation
People's responses are influenced by other samples presented in a test session. What 3 strategies do the authors suggest for managing these problems of context?
1. Stabilization through calibration by training with physical reference standards.
2. Understanding the contextual effects that may be occurring so as to factor them into the interpretations of the data; be aware of contextual effects and how they might have affected the experimental results, and provide appropriate interpretations and warnings about the values in the data and how they might have been affected by other samples present in the session.
3. Providing a constant frame of reference - this may be through providing a 'weak' end anchor to remind panelists about an attribute or to ask panelists to scale items as more or less intense than a standard reference.
Explain each of the 'guiding principles' the authors list for choosing scale labels.
Singularity - term should refer to only one sensory experience
Lack of hedonic/sensation confusion - terms should not be vague
Orthogonality to other terms - terms should be separate dimensions of the sensory experience from a product, or they should be uncorrelated.
Lack of quality-shiftings as ratings increase
Avoiding vagueness
Potential for finding physical reference examples to align panelists' concepts
What three reasons do the authors give for avoiding tacking a preference test onto a triangle test?
1. Participants in the two tests are not chosen on the same basis. In preference tests, participants are users of the product; in discrimination tests, panelists are screened for acuity, oriented to test procedures, and may undergo training.
2. Participants are in an analytic frame of mind for discrimination, while they are in an integrative frame of mind and are reacting hedonically in preference tests.
3. There is no good solution to the problem of what to do with preference judgments from correct vs. incorrect panelists in the discrimination test. Even if data are used only from those who got the test correct, some of them are likely guessing.
What are the pros and cons for including a 'no preference' option in a paired preference test?
PROS:
-could provide valuable insight into the consumer's state of mind
-there are consumers who honestly do not have a preference between the two products; not giving them this option could be insulting.
-there is an intuitive difference in interpretation to the sensory specialist between 100 subjects with 5 giving no-preference data and 100 subjects with 85 giving no-preference data.
CONS:
-the no-preference option complicates the data analysis since the usual methods assume that the test had a forced choice
-power of the test is decreased and it is possible that a real difference in preference may be missed
-it could give some consumers an 'easy out'; they don't have to make a choice and therefore do not.
In the postscript they note that subjects did not perceive a significant difference in firmness between the samples when only the first samples tasted were compared. When the second samples tasted were compared, they found a significant difference. Why?
For the first test, the subjects had no frame of reference for firmness, so the initial scores were not discriminating the samples. The second sample was evaluated from the point of view of the first, so the mean scores were significantly different.
What type of sensory test would you recommend for measuring small children's (less than 3 years old) liking for foods? What techniques would you recommend for measuring 4-5 year olds' preferences?
3 year olds: 3 point facial hedonic scale with P&K verbal descriptors.
4-5 year olds: 5 point facial hedonic scale with P&K verbal descriptors
Explain what the authors mean when they say 'in the combination of hedonic and intensity information, the actual intensity scores are obscured'.
The authors mean that although two groups might both mark 'just right' on the just right scale, one group might think the product is very strong and the other might think the product is fairly mild, although in each case they are both at the levels that the respondents prefer. The results might lead the product developers into thinking there is a homogeneous population, while there are actually two consumer segments. Because of this, intensity scales should be used in addition to just right scales to aid in interpretation.
What are the advantages to having subjects score both the absolute intensity and intensity of an ideal sample?
The absolute intensity information is obtained as well as where the ideal product lies for that person on the scale. The just right directional information can be obtained, and an 'ideal product profile' can be constructed if the data from the panelists are reasonably homogeneous and if mean deviations of each of the test products are measured from this ideal profile.
What is the bliss point?
The optimum point of a sensory continuum; it appears as a peak in a nonmonotonic function.
What measurement of a concentration v. liking or a concentration v. JAR response may serve as an indicator of a judge's tolerance for deviations from an ideal level of an ingredient?
The slope of the line, relative to error, in a plot of just-right scores against sensory intensity or ingredient concentration (usually on a log scale).
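That slope can be computed with ordinary least squares; a minimal sketch with invented data follows. The concentrations and JAR scores are hypothetical, and the interpretation (shallow slope = tolerant judge, steep slope = sensitive to deviations from ideal) is the usual reading, not a value from the text:

```python
import math

concentrations = [1, 2, 4, 8, 16]   # hypothetical ingredient levels
jar_scores = [-2, -1, 0, 1, 2]      # negative = too weak, 0 = just right

# Regress JAR score on log10 concentration.
x = [math.log10(c) for c in concentrations]
y = jar_scores

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope: covariance of x and y over variance of x.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))

print(round(slope, 3))
```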
What is the food action rating scale? How does it compare with the 9-point hedonic scale?
A scale based on attitudes and actions, combining statements about frequency of consumption with motivationally related statements. The FACT and 9-point hedonic scales are not necessarily interchangeable; however, a study on food likes showed that measures on the two scales were highly correlated. The FACT scale gave lower mean values but less skew than the 9-point hedonic scale.
What is the 'acceptor set size'?
The proportion of consumers who find a product acceptable.
Compare and contrast the use of a 9-point hedonic scale vs. the use of an acceptor set size measurement
Comparisons:
-both measure liking/acceptance
-preference does not always match up with acceptance (ie. there may be a group that finds a product acceptable alongside a group that dislikes it strongly because it is not in their favorite style); this is a problem with both measures.
Contrasts:
-acceptor set size measurement lacks the graded richness of information available on the 9-point hedonic scale
-9-point scale uses a series of descriptors to indicate degree of liking/disliking; acceptor set uses 'yes' and 'no' answers to questions about product acceptability.
What are the most important qualifications for participants in consumer tests?
-panel should be representative of target population - consumers who actually purchase and use the product
-should be likers of the general product category
-should use the product with desired frequency as determined by the client or user of the data
-can include frequency of purchase, preparation, consumption, and degree of liking for increasingly stringent requirements.
What are some potential problems with using employee panelists for liking tests?
-employees may be more familiar with or critical of the products tested than the eventual consumers of the product
-may have unusual patterns of product use because they get products for free, at the company store, or due to company loyalty
-subject to overtesting
-can communicate among themselves so that judgments are biased and no longer independent.
How can you determine whether an employee panel is responding in the same manner as an outside consumer panel?
-check the results derived from the employee or standing consumer panels with those derived from consumers recruited outside the company on a regular basis
-use a split-plot ANOVA for a one-within, one-between design to determine whether results are similar between panels (the panel effect and the panel-by-product interaction must both be nonsignificant sources of variation in order to conclude the panels are comparable)
-increase the size of alpha to allow the p-value of significance to be tested at a higher level to ensure a difference is not missed.
What are some uses of descriptive analysis?
-a class of sensory tests in which a trained panel rates specified attributes of a product on scales of perceived intensity
Used:
-when a detailed specification of the sensory attributes of a single product or a comparison among several products is desired
-to determine how a competitor's product is different
-in shelf-life testing
-in product development to measure how close a new introduction is to the target
-to assess suitability of prototype products
-to define a problem in quality assurance
-in troubleshooting major consumer complaints
-to define sensory-instrumental relationships
Contrast the way we learn about colors with the way we learn about flavors.
Color is a well-structured concept for most individuals and possesses a widely understood scientific language for its description; for example, we are taught as children to associate certain labels with certain stimuli-- green for grass, blue for sky. When we do not use the right labels, we are corrected. Flavor, on the other hand, is rarely described with precise terms; for example, 'the bread smells good'. For taste, odor, and texture there are no standards that can be used across all sensory tests.
List desirable features of sensory attributes used in descriptive analysis and describe what each means.
Discriminate - selected descriptors should discriminate between the samples; in other words, they should indicate perceived differences.
Nonredundant - chosen term should have little or no overlap with other terms used, and it should be orthogonal.
Relate to consumer acceptance/rejection - used to interpret consumer hedonic responses to the same samples; terms chosen should be helpful in relating consumer acceptance/rejection to the product (ie. no scientific terms - baby vomit instead of butyric acid)
Relate to instrumental or physical measurements - ideal descriptors can be related to underlying natural structure of the product; for example butyric acid to describe an odor in aged cheese is tied to the compound likely responsible for the odor
Singular - terms should not be combinations of several terms; for example: creamy, soft, clean, fresh-- all can be confusing to panelists
Precise and reliable
Consensus on meaning - panelists should fairly easily agree on the meaning of a specified term
Unambiguous - clearly defined
Reference easy to obtain - makes life easier for panel leader but should not prevent panelists from being able to use a specific term
Communicate - chosen descriptors should have communication value and not be jargon
Relate to reality - helpful if the term has been used traditionally with the product or if it can be related to the existing literature.
Describe a strategy for teaching the concept of a sensory attribute such as mushiness
-panelists are trained to precisely define mushiness over a 2-3 week program
-samples are tasted and all perceived textures are recorded
-the panel is exposed to a wide range of products with the same attribute
-after this exposure, the panelists review and refine the descriptors used
-reference standards and definitions for each descriptor are also created during the training phase
-at the end of the training phase, the panelists have defined a frame of reference for expressing intensities of the descriptors used.
What is the meaning of the term amplitude in the flavor profile technique?
The degree of balance and blending of flavor; in other words, the overall impression of balance and blendedness of the product. It is not to be understood, just experienced (ex. heavy cream, whipped cream with sugar, and whipped cream with sugar and vanilla have low to high amplitude, respectively).
List some desirable characteristics of panelists for a descriptive analysis panel.
-should be available for years, if possible
-keen interest in the product category
-some background knowledge on the product type
-normal odor and taste perceptions
-articulate and sincere with an appropriate personality (not timid or overly aggressive).
What are the functions of a descriptive analysis panel leader?
-participant in language development and evaluation phases of the study
-moderates interactions between panelists, leading the entire group toward some unanimity of opinion
-coordinates sample production
-directs panel evaluations
-verbalizes the consensus conclusions of the entire panel
-will often resubmit samples until reproducible results are obtained
-responsible for communication with the panel
How did QDA differ from the flavor profile analysis?
-data are not generated through consensus discussions
-panel leaders are not active participants
-unstructured line scales are used to describe the intensity of rated attributes
Briefly describe what happens during a series of QDA training sessions.
10-12 judges are exposed to many possible variations of a product to facilitate accurate concept formation, and panelists generate a set of terms that describe differences among the products.
Through consensus, panelists develop a standardized vocabulary to describe the sensory differences among the samples, decide on reference standards and/or verbal definitions that should be used to anchor the descriptive terms, and determine the sequence for evaluating each attribute.
Late in the training sequence a series of trial evaluations are performed, allowing the panel leader to evaluate individual judges based on statistical analysis of their performance relative to that of the entire panel
What is the dumping effect? How can it affect the results of descriptive analysis?
The inflation of a rating on one attribute due to the absence, on the ballot or questionnaire, of an appropriate attribute that would allow a participant to respond to a salient sensation. The response is 'dumped' onto an inappropriate question or scale. This can affect the results of descriptive analysis because panelists will, probably subconsciously, express their frustration by modulating the scores on some of the scales used in the study, so the results will not be accurate.
Give an example of what the authors mean when they say the QDA data is relative as opposed to absolute.
QDA data being relative refers to panelists using different parts of the scale to rate products for attributes. For example, Judge A scores the crispness of potato chip sample 1 as an 8, but Judge B scores the same sample as a 5. This does not mean that the two judges are not measuring the same attribute in the same way; it may just mean that they are using different parts of the scale. The relative responses of these two judges on a second, different sample would indicate whether the two judges are calibrated with respect to the relative differences between the samples. This can be corrected with statistical analysis.
What is the texture profile definition of hardness?
The force required to bite completely through a sample placed between the molars.
What is the advantage of having the different foods in the hardness scale break down differently when bitten?
All foods used as references in the hardness scale vary in hardness, but they also illustrate to panelists that foods do not necessarily react in the same way to an applied compressive force; this helps in evaluating other products against the references.
Contrast the Sensory Spectrum method with QDA for: relative vs. absolute scales, panel training strategies, the ability to compare samples evaluated at different times or by different panels and the effort required to train the panel.
Relative vs. absolute scales:
Absolute
Panel Training Strategies:
-panelists trained to use the scales identically so data is absolute
Ability to compare samples evaluated at different times or by different panels:
-allows for comparison of samples evaluated at different times or by different panels because panelists are NOT allowed to create their own descriptors.
Effort required to train panel:
-much more extensive than QDA training and panel leader has a more directive role
-judges exposed to wide variety of the products in a specific product category
-panel leader provides extensive info on the product ingredients
-underlying chemical, rheological, and visual principles are explored by the panelists and relationships b/w these principles and sensory perceptions of the products are considered
-panelists are provided with word lists that may be used to describe perceived sensations associated with the samples
-panelists use numerical intensity scales that are supposedly created to have equi-intensity across scales (ie. same for salty, sweet, etc)
-difficulties: panelists must understand the vocabulary chosen to describe a product, grasp the underlying technical details of the product and the basic physiology and psychology of sensory perception, and finally must be extensively 'tuned' to one another to ensure that all panelists are using the scales the same way.
What are the functions of the reference points for descriptive analysis scales?
They give panelists something to refer to in rating products; if panelists rate products relative to the reference points, hopefully all panelists can come to a consensus based on how similar or dissimilar a product is for a specific attribute relative to the reference.
The authors give instructions for doing descriptive analysis in 3 easy steps. What are they?
1. Training the judges
-either consensus training (panelists generate descriptors and reference standards needed to describe differences among products) or 'ballot training' (panelists are provided with a wide range of products within the category as well as a word list of possible descriptors and references that could be used to describe the products)
2. Determining judge reproducibility
-the first 2-3 sessions are used to determine judge consistency, but panelists are told evaluation of products is beginning
3. Evaluating samples
-judges evaluate samples in duplicate, preferably in triplicate
What is free choice profiling? How does it differ from generic descriptive analysis?
Free choice profiling is a descriptive method in which untrained or minimally trained panelists evaluate products using their own individual set of descriptors.
This differs from generic DA because vocab to describe flavor notes is generated individually by panelists and only needs to be understood by each individual panelist. It also differs in terms of statistical analysis. Generic DA is usually analyzed with ANOVA, free choice profiling with Procrustes.
Describe the repertory grid method.
A way of eliciting multiple descriptive terms from panelists through a series of comparisons among groups of objects. In this method, panelists are presented with objects arranged in groups of three. The arrangement is such that each object appears in at least one triad and that an object from each triad is carried over to the next triad. Two objects in each triad are arbitrarily associated with each other, and the panelist is asked to describe how these two objects are similar to each other and, in the same way, how they differ from the third. Once all similarities and differences have been exhausted, the researcher presents the remaining two combinations within the triad to the panelist, who repeats the effort to describe similarities and differences. This is repeated for all triads. The descriptors are placed on scales, and the panelists then use their own sets of scales to describe the objects in the study. Data are analyzed using Procrustes.
How can one develop a frame of reference for subjects doing a category scaling task?
Preliminary practice so that the subject can learn the general range of stimuli and correlate it with the given response scale, and use of stimulus end anchors – these are additional stimuli more extreme than the experimental stimuli to be studied that help define the frame of reference.
What are the two processes contributing to the ratings assigned to a stimulus by a person?
1. A psychophysical process by which stimulus energy is translated into physiological events that have as their result some subjective experience of sensory intensity.
2. The function by which the subjective experience is translated into the observed response (ie. how the percept is calibrated against the rating scale).
What are the two processes contributing to the ratings assigned to a stimulus by a person? Which one of these would be influenced by adaptation? anosmia? mixture suppression? Standards used to illustrate the intensities of the scale end points? Judges ability to use numbers for a mag estimation task?
1. A psychophysical process by which stimulus energy is translated into physiological events that have as their result some subjective experience of sensory intensity.
2. The function by which the subjective experience is translated into the observed response (ie. how the percept is calibrated against the rating scale).

Influenced by adaptation?
the psychophysical process itself is altered
Anosmia?
the subjective experience is altered
Mixture Suppression?
The psychophysical process is altered
Standards used to illustrate the intensities of the scale end points?
Standards will influence how the percept is calibrated against the rating scale
A judges ability to use numbers for a magnitude estimation task?
Judge's ability will influence how the experience is translated into the observed response
Give a definition of response bias
A process that causes a shift or change in response to a constant sensation
What is a contrast effect?
Any stimulus will be judged more intense in the presence of a weaker stimulus and less intense in the presence of a stronger stimulus, all other conditions being equal.
What is an assimilation or convergence effect?
The opposite of a contrast effect; using acceptability of foods as an example, a poor food would seem even worse when preceded by a good sample, and a good sample would be rated lower when following a poor sample (this second convergence was never observed).
Describe Helson's Adaptation Level Theory
The theory that we take as a frame of reference the average level of stimulation that has preceded the item to be evaluated; in other words, we refer to our most recent experience in evaluating the sensory properties of an item.
What is the reversed-pair technique?
A technique in which the context-inducing stimulus comes after the target stimulus so there can be no effects of sensory adaptation, carry-over, or simple physiological alteration of the actual sensory experience
Explain how using the reversed-pair technique could show that a contrast effect was due to a change in the frame of reference and NOT to adaptation or carryover?
Because in reversed-pair technique the context-inducing stimulus comes after the target stimulus there can be no effects of sensory adaptation, carry-over, or simple physiological alteration of the actual sensory experience. Rather, it is the frame of reference and nature of the response mapping strategy that have been altered by the item that comes later in experience.
Give an example of a shift in sensory intensity rating due to a shift in the range of stimulus intensities.
What constitutes a small horse depends on whether the frame of reference includes Clydesdales, Shetland ponies, or prehistoric equine species.
Give an example of a shift in sensory quality ratings due to a shift in quality context.
Dihydromyrcenol, when presented among a set of woody or pine-like reference materials is perceived as too citruslike. When presented among a set of citrus reference materials, however, it is perceived as too woody and pine-like. Among the woody samples, the compound is given a higher citrus rating than when among the citrus samples and vice versa.
Give an example of a shift in liking ratings due to a shift in stimulus context.
When embedded in a series containing a lot of unpleasantly bitter stimuli, ratings of pleasant stimuli are higher than when embedded in a series with many sucrose solutions.
How does the Range Frequency Theory differ from the idea that the frame of reference is based on the average of the sensory experiences?
According to this theory, the entire distribution of items in a psychophysical experiment influences the judgments of a particular stimulus, and that behavior in a rating task is a compromise between two principles. The range principle: subjects use the categories to subdivide the available scale range, and will tend to divide the scale into equal perceptual segments. The frequency principle: Over many judgments, subjects tend to use the categories on the scale an equal number of times. In other words, how stimuli are grouped or spaced along a continuum determine how a response scale is used.
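Parducci's compromise can also be expressed numerically: the predicted judgment of a stimulus is a weighted average J = w·R + (1−w)·F of its range value R (position within the stimulus range) and its frequency value F (its relative rank among the presented stimuli). A minimal sketch, assuming a single weight w, distinct stimulus values, and an illustrative 9-point category scale (the function name and defaults are not from the text):

```python
def range_frequency_rating(stimuli, w=0.5, n_categories=9):
    """Predicted category ratings under Parducci's range-frequency compromise."""
    s_min, s_max = min(stimuli), max(stimuli)
    rank = {s: i for i, s in enumerate(sorted(stimuli))}
    ratings = []
    for s in stimuli:
        R = (s - s_min) / (s_max - s_min)   # range value: position in the range
        F = rank[s] / (len(stimuli) - 1)    # frequency value: relative rank
        J = w * R + (1 - w) * F             # compromise between the principles
        ratings.append(1 + J * (n_categories - 1))
    return ratings
```

For evenly spaced stimuli, R and F coincide and the predicted use of the scale is linear; for a set bunched at the low end, the frequency term pulls the middle stimuli into higher categories, reproducing the spreading described by the frequency principle.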
What is the frequency principle?
Over many judgments, subjects tend to use the categories on the scale an equal number of times.
How does increasing the frequency of presentation of a sample affect the slope of a line relating stimulus concentration to sensory intensity?
The slope of the line is steeper because samples that are bunched at one end tend to be spread into neighboring categories.
Give an example of a “real life” situation in which you would expect to see a frequency effect.
In rating pleasant flavors or perfumes, the high end of the distribution is overrepresented so the tendency for panelists is to drop ratings into lower categories to make use of all the categories on the rating scale.
What is the range principle?
Subjects use the categories to subdivide the available scale range, and will tend to divide the scale into equal perceptual segments.
How does increasing the range of stimulus intensities affect the slope of a line relating stimulus concentration to sensory intensity?
The slope gets less steep because subjects divide the scale into equal perceptual segments across whatever range is presented—with a wider range, each segment covers a larger span of stimulus intensity, so a given concentration difference produces a smaller change in rating.
Give an example of a “real life” situation in which you would expect to see a range effect.
When asked to rate the thickness of silicone samples drawn from either a wide or a narrow stimulus range, panelists distributed their ratings across the available scale points in both cases; the slope was therefore steeper for the narrow range, because the same span of scale covered a smaller range of stimuli.
Contrast the effect of using an anchor stimulus that is about the same intensity as the strongest stimulus a participant would experience during a study with the effect of using an anchor stimulus that is much more extreme than anything encountered during the study.
Anchor stimulus about the same intensity as the strongest stimulus – subjects will distribute their responses across the range in a nicely graded fashion, giving the appearance of a reasonably linear use of the scale.
Anchor stimulus much more extreme than the strongest stimulus – the endpoint influence tends to diminish, as if the anchors were irrelevant to the judgmental frame of reference.
What is the contraction or assimilation bias?
Subjects may rate a stimulus relative to a reference or mean value that they hold in memory for similar types of sensory events. They tend to judge new items as being too close to this reference value, causing underestimation of high values and overestimation of low values. There also may be overestimation of an item when it follows a stronger standard stimulus, or underestimation when it follows a weaker standard stimulus
Where has assimilation been observed?
In experiments on consumer expectation. In this case, assimilation is not toward other actual stimuli, but toward expected levels.
What is the logarithmic response bias?
Observed with open-ended response scales that use numbers, such as magnitude estimation. For example, subjects rating a series using numbers like 2, 4, 6, 8 will continue to rate by larger steps for values greater than 10, using 20, 40, 60, 80 etc. In addition, subjects perceive the magnitude of higher numbers in smaller steps as numbers get larger. For example, the difference between 1 and 2 seems much larger than the difference between 91 and 92.
What is the transfer bias?
The general tendency to use previous experimental situations and remembered judgments to calibrate oneself for later tasks.
What are some physiological differences among people that could produce some idiosyncratic scale use? Describe what difference in scale use you would expect for each physiological difference.
Idiosyncratic scale use – people have preferred ranges or numbers on the response scale that they feel comfortable using.
Possible physiological differences that could produce idiosyncratic scale use – specific anosmia, PTC bitter blindness, or desensitization to hot-pepper compounds. In each case, these people would probably use lower ends of the scale due to an inability to detect specific compounds (anosmia, PTC) or desensitization to specific compounds (hot peppers)
What is a halo effect?
Positive correlation of unrelated attributes (ie. a carryover from one positively rated attribute to others).
In figure 9.10, how do you explain the higher ratings for sweetness, creaminess and thickness produced by the added vanilla flavor?
This was due to the halo effect; adding just perceivable vanilla flavor (a positive attribute) resulted in a positive correlation with unrelated attributes (sweetness, creaminess, and thickness). The effect may also have been due to misuse of the response scale: because there was no scale for vanilla flavor, subjects used the provided scale to report their experience which may have contributed to the apparent enhancement effects.
What is “dumping”? Give an example. Explain how the dumping in your example could have been minimized.
The inflation of a rating on an attribute due to the absence of an appropriate attribute on a ballot or questionnaire to allow a participant to respond to a salient sensation. The response is “dumped” onto an inappropriate question or scale.
Example: The enhancement of sweet ratings in the presence of a fruity odor was stronger when ratings were restricted to sweetness only. This was minimized by allowing sweetness and fruitiness ratings—in this case, no enhancement of sweetness was observed.
What is the opposite of dumping? How would you minimize it?
Overpartitioning of sensory experience when too many scales are given. This could be minimized with careful pretesting and discussion phases during descriptive analysis training.
Describe these psychological errors:
error of anticipation
error of habituation
stimulus error
time order error
Error of anticipation
Occurs when the subject shifts responses in the sequence before the sensory information would indicate that it is appropriate to do so.
Error of habituation
Occurs when the subject perseverates or stays put too long with the previous response, when the sensory information would indicate that a change is overdue (may have to do with lack of motivation or attention).
Stimulus error
Occurs when the observer knows or presumes to know the identity of the stimulus, and thus draws some inference about what it should taste, smell, or look like. The judgment is biased due to expectations about stimulus identity.
Time order error (Positional bias)
A general term applied to sequential effects in which one order of evaluating two or more products produces different judgments than another order.
What are the disadvantages of presenting only 1 sample at a time in a test?
-highly inefficient financially and statistically
-does not take advantage of inherent comparative abilities of human observers
-transfer bias; people will evaluate products based on their memory of similar items that they recently experienced.
What are the disadvantages to comparing ratings from separate sessions or separate experiments?
Unless the context and frame of reference were the same in both sessions, it is not possible to say whether differences between the products arose from true sensation differences or from differences in ratings due to contextual effects on the responses. A difference in the data set might occur merely because the two items were viewed among higher or lower valued items in their test sessions, or two items might appear similar in ratings across sessions because of range or centering effects.
Describe some strategies for stabilizing ratings across experimental sessions.
1.Randomization
2.Stabilization - attempt to keep context the same across all evaluation sessions so that the frame of reference is constant for all observers.
3.Calibration - the training of a descriptive panel so that their frame of reference for the scale is internalized through training with reference standards.
4.Interpretation – careful consideration of whether ratings in a given setting may have been influenced by experimental context (eg. The specific items presented in that session).
How does one calibrate judges?
With descriptive analysis training.
If you add an ingredient 'to taste' at the bench top, what biases might influence your ingredient level?
Contraction or assimilation bias
What do the authors mean by visual texture?
The texture of an object perceived by sight; for example, skin of an orange has a visual roughness absent from the skin of an apple. Visual texture is used as an indicator of freshness and creates expectations as to the mouthfeel characteristics.
Give some examples of foods where texture is perhaps the most important sensory attribute?
-Soggy (not crisp) potato chips
-Tough (not tender) steak
-Wilted (not crunchy) celery sticks
What are some of the most frequently used texture terms? How was this determined?
Some of the most frequently used texture terms are crisp, soft, crunchy, juicy, and creamy; these words describe hardness, crispness/crunchiness, and moisture content. This was determined with word association tests: people from different countries were asked to describe the textural properties of foods. Results were compared across countries, and words used most frequently were determined.
Describe how sounds are produced from foods during eating.
Sounds are produced by mechanical disturbances that generate sound waves that are propagated through air or other media. In terms of food, sound generation differs between wet and dry foods. Wet crisp foods are made up of turgid cells (if enough water is available) exerting outward pressure against other turgid cells; when chewing destroys this structure, the cells pop and a noise is produced. Dry crisp foods have air cells or cavities surrounded by brittle cell or cavity walls. When chewing breaks these walls, any remaining walls and fragments snap back to their original shape. This causes vibrations that generate sound waves.
Sounds produced by a food when eaten appear louder and lower in pitch to the person eating the food than to a person standing nearby. Why?
Lower-pitch sounds one hears while eating are conducted through the bones of the skull and jaw to the ear. The jawbone and skull resonate at about 160 Hz, and the bones amplify sounds in this frequency range; the eater's own crunch sounds are therefore perceived as lower and louder than they are by anyone standing nearby.
Give some examples of visual texture.
Lumps in tapioca pudding, surface roughness of a cookie, or viscosity of a fluid.
How large do particles need to be for a person to perceive them as gritty?
Soft, rounded, or relatively hard flat particles are not perceptually gritty up to about 80 μm. Hard angular particles, however, contribute to grittiness perception above a size range of 11–22 μm.
Is texture perception independent of the sample size eaten? What is the implication of this for sensory testing?
No, texture perception is not independent of sample size. Results from a study indicated that both hardness and chewiness increased as a function of sample size; as such, researchers must be careful to ensure that textural measurements are accurate.
Distinguish between mouthfeel characteristics and oral tactile characteristics
Mouthfeel characteristics are tactile but often tend to change less dynamically than most other oral tactile texture characteristics. Examples of mouthfeel are astringency, puckering, tingling, tickling, hot, stinging, burning, cooling, numbing, and mouth coating. Oral-tactile sensations refer to the force of breakdown or rheological properties of the food.
Describe the relationship between the amount and type of fat in a food and perceptions of phase change or dynamic contrast in the mouth.
Fat is primarily responsible for the highly palatable dynamic contrast of products such as ice cream and chocolate; as fat is decreased, so too is dynamic contrast.
Compare the primary and secondary texture terms from the original texture profile to some consumer terms
T: Adhesiveness
C: sticky, tacky, gooey

T: Chewiness
C: short, mealy, pasty, gummy

T: Viscosity
C: thin, thick
What are some reasons why sensory/instrumental correlations may misrepresent the true relationship between an instrumental measurement and a sensory perception?
-panelists’ chewing efficiencies affect the breakdown rates of foods, which affects texture discrimination; instruments cannot account for individual panelist variation
-instrumental analysis is empirical; instrumental measurements are not well defined and do not correlate highly with sensory texture measurements.
-there has been little improvement in instrumental analysis in the last 20 years
-instruments cannot measure everything (ex. sweetness of lemonade vs. sweetness of water with the same concentration of sugar).
Distinguish between the texture profile analysis (TPA) and the sensory texture profile method.
TPA is the measurement technique based on the texturometer, an instrument designed to evaluate texture of foods through penetrating a sample in two cycles. The penetration force is recorded, and attributes of the instrumental texture profile are selected to correlate well with the sensory texture parameters rated by trained texture profile panelists.
The sensory texture profile method is the descriptive counterpart: trained panelists rate a food's mechanical, geometrical, and moisture/fat-related texture attributes in the order in which they appear, from first bite through complete mastication, using standardized scales anchored with reference products.
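On the instrumental side, the two-cycle penetration curve is reduced to numbers: for instance, hardness is taken as the peak force of the first compression, and cohesiveness as the ratio of the work (area) of the second cycle to the first. A minimal sketch, assuming force readings sampled at equal time steps (the function and sample data are illustrative, not from the text):

```python
def tpa_parameters(cycle1_forces, cycle2_forces):
    """Two classic TPA parameters from a two-cycle compression test."""
    hardness = max(cycle1_forces)    # peak force during the first compression
    area1 = sum(cycle1_forces)       # work of first cycle (unit time step)
    area2 = sum(cycle2_forces)       # work of second cycle
    cohesiveness = area2 / area1     # how well the sample withstands a second bite
    return hardness, cohesiveness
```

A sample that recovers poorly gives a small second-cycle area and hence a low cohesiveness ratio, which is the sort of value correlated against trained-panel texture ratings.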
Contrast the set-up and learnings from a consumer sensory test with the set-up and learnings from a marketing product-concept test.
Sensory Consumer Test
Set-up:
Conceptual information is kept to a minimum; only enough information to ensure proper use of the product is given.
Learnings:
Sensory characteristics are measured in isolation from advertising claims and not contaminated by the influences of any concepts about the product that are not obvious to the senses.

Product-Concept Test
Set-up:
Participants are shown product concept, and are questioned about their response and expectations of the product. Those that respond positively are asked to take the product home and use it, and to later evaluate its sensory properties, appeal, and performance relative to expectations.
Learnings:
Sensory properties, appeal, and performance are measured relative to expectations.
Contrast the learnings from central location tests vs. home use tests.
Central Location:
-Compliance with instructions, manner of examining samples, and ways of responding are easier to monitor and control.
-Easier to minimize outside influence
-Introduces a realistic element in the testers’ frame of reference (products aimed at children could be brought to school locations, etc).

Home Use:
-Consumer uses product over a period of time and can examine its performance on several occasions before forming an overall opinion
-Provides opportunity to test product in a variety of settings
-Opportunity to test product and packaging interactions
-Facilitates a more critical assessment of the product relative to the consumer’s expectations.
Give an example of a home use test situation where a monadic design would be better than a monadic sequential design. Why is it better?
Monadic design – one product is placed with an individual
Monadic sequential design – products are placed one at a time in sequence
An example of a home use test situation in which monadic design would be better than monadic sequential design is a hair conditioner. In this case, the product may create such changes in the substrate that it is not practical to get a clear picture of product performance for the second product in a sequence. (A washout or recovery period could be used to evaluate a second product, but this is not typically realistic in a market test).
Give an example of a home use test situation where a monadic sequential design would be better than a monadic design. Why is it better?
Any situation in which the products are used together in multiple steps (ex. a cake mix followed by a frosting), or in which a change in the substrate is required to judge the performance of the second product.
What are the disadvantages of a one-product monadic test?
This test puts too much faith in the raw value of the scores received; humans are prone to context effects so the absolute value of the scores is nearly meaningless. Rather, a baseline product for comparison should be included.
How can you validate the responses on questionnaires from a home use test?
Through telephone callbacks to ~10-15% of respondents to verify that their opinions were correctly recorded and the interview was not faked
How can you be sure that the people coming to a CLT or participating in an HUT meet your screening criteria?
Validate qualifications through close supervision of interviewers by the field supervisor
On page 501 the authors suggest an order or flow of questions for a consumer test. Give reasons for the order of each adjacent pair of topics.
Questions move from general to specific; sensitive info should be left until the end when a person is most comfortable with the interviewer and reassured that the info is confidential
1.screening questions to qualify the respondent - ensure the respondent qualifies for the questionnaire
2.general acceptability - give an overall idea of the person’s opinion of the product
3.open-ended reasons for liking or disliking - this provides an opportunity to get at some reasons for likes and dislikes before other issues are brought to mind
4.specific attribute questions - discussed after overall acceptability because these items may not have been on the consumer’s mind and may take on false importance if asked before the overall acceptance question (also, respondents may try to please the interviewer or use words that they think are appropriate)
5.claims, opinions and issues
6.preference if a multisample test and/or rechecking acceptance via satisfaction or other scale
7.sensitive personal demographics
Know the 10 guidelines for questionnaire construction and some examples of each. (#1-5)
1.Be brief
-Good visual layout with lots of ‘white space’
-Rule of thumb: consumer surveys should be no more than 15-20 minutes.
2.Use plain language
-Spell out technical acronyms if they must be used at all
-Use qualitative probing and pretesting to indicate whether consumers understand the technical issues and terminology.
3.Don’t ask what they don’t know
-Only ask for info that a reasonable person has available and/or accessible in memory
-Don’t put unreasonable demands on the consumer’s memory; for example, ask how many times the floor was waxed per month rather than per year so the information can be recalled more easily.
4.Be specific
-Be very specific when referring to products or people
-Don’t assume that the consumer understands what is being discussed
-Provide a checklist of alternatives
5. Multiple choice questions should be mutually exclusive and exhaustive
-This must be done to ensure that no one is left out (ex. What is your marital status: single or married; this doesn’t cover all categories)
-Allow for a ‘don’t know’ or ‘no answer’ category
Know the 10 guidelines for questionnaire construction and some examples of each (#6-10)
6.Do not lead the respondent
-Do not suggest an answer that the interviewer is looking for (ex. “given the ease of preparation, what is your overall opinion of the product?”)
-Questions should be read by someone without a vested interest in the product to ensure this is avoided
7.Avoid ambiguity
-Must remember that certain words in the English language have multiple meanings and can be confusing.
-Avoid double-barreled questions.
8.Beware the effects of wording
-Wording a question with only positive or negative terminology can influence respondents
-Example: “do you favor or oppose” is better than “do you favor” for a question.
9.Beware of halos and horns
-Asking questions only about good attributes can bias the overall rating in a positive direction; conversely, asking questions about only defects can bias opinions in a negative way.
-Get overall opinion ratings first to avoid bias of more specific issues
10.Pretesting questionnaires is necessary
-The draft should be reviewed for potential problems in interpretation
-Provides the opportunity to see whether items and issues are applicable to all potential respondents
What is a double-barreled question?
A double-barreled question is one in which a consumer is asked about two items using the word ‘and’ (ex. “do you think frozen yogurt and ice cream are nutritious?”). Because of ‘and’, logic dictates that both parts be positive for there to be an overall positive response; however, respondents are not always logical and may respond positively to this question even though they think only one is nutritious.
How would you analyze the data from 100 responses to the following 5-point scale?
Would you
Definitely buy this product
Probably buy this product
Don’t know
Probably not buy this product
Definitely not buy this product.
Nonparametric frequency analysis
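As an illustration, the frequency analysis amounts to tabulating counts per category, summarizing them (e.g., the common “top-two-box” percentage), and testing the observed distribution with chi-square. A minimal sketch with made-up counts for the 100 responses (the numbers and the equal-use null hypothesis are illustrative assumptions):

```python
from collections import Counter

# Hypothetical responses from 100 consumers on the 5-point purchase-intent scale
responses = (["definitely buy"] * 22 + ["probably buy"] * 38 + ["don't know"] * 15
             + ["probably not buy"] * 17 + ["definitely not buy"] * 8)

counts = Counter(responses)
n = len(responses)

# Common summary: "top-two-box" percentage (definitely + probably buy)
top_two = counts["definitely buy"] + counts["probably buy"]
print(f"top-two-box: {top_two}/{n} = {top_two / n:.0%}")

# Chi-square goodness-of-fit against equal use of the five categories
expected = n / 5
chi_sq = sum((c - expected) ** 2 / expected for c in counts.values())
print(f"chi-square = {chi_sq:.2f}  (critical value 9.49 at df = 4, alpha = .05)")
```

Note that no means or standard deviations are computed: the categories are treated as frequencies, consistent with the ordinal nature of the scale.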
What is a Likert scale?
A psychometric scale used to measure a respondent's agreement or disagreement with a statement or series of statements. Response categories typically run: agree strongly, agree, neither agree nor disagree, disagree, and disagree strongly.
Discuss the advantages of open-ended questions
-Easy to write
-Unbiased in the sense that they gather opinions and reasons for judgement in the respondents’ own words
-well-suited to areas where the respondent has the info readily in mind but the interviewer is unable to anticipate all the possible answers
-allow for issues that may have been omitted from the structured rating scales and fixed-answer questions
-good for soliciting suggestions about, for example, opportunities for product improvement, added features, or variations on the theme of the product
-they are amenable to follow-up and further probing, both during the interview and in follow-up discussion groups that may be arranged after the data analysis of the formal questionnaire
-they confirm the info gathered in the structured attribute questions.
Discuss the disadvantages of open-ended questions
-hard to code and tabulate
-ambiguity can arise among specific sensory characteristics like the taste descriptors sour, acid, and tart
-answers can be difficult to aggregate and summarize
-if self-administered, sometimes difficult to read the handwriting of respondents
-responses may be ambiguous or misleading
-respondents may omit the obvious
-more readily answered by the more outspoken or better-educated respondents, those w/ better verbal skills
-higher non-response rate than to fixed-alternative questions
-statistical analysis of the frequencies of responses is not straightforward.
What information can you get from intensity time scaling?
This method asks panelists to scale their perceived sensations over time; it provides the sensory specialist with temporal information about perceived sensations, and allows them to quantify the continuous perceptual changes that occur in the specified attribute over time.
The following information for each sample for each panelist can be obtained: maximum intensity perceived, time to maximum intensity, rate and shape of the increase in intensity to the maximum point, the rate and shape of the decrease in intensity to half maximal intensity and to the extinction point, and the total duration of the sensation.
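These parameters can be extracted directly from a recorded curve. A minimal sketch, assuming intensity readings at known time points for one panelist and one sample (the helper name and the trapezoidal area calculation are illustrative — the text does not prescribe a particular computation):

```python
def ti_parameters(times, intensities):
    """Extract common time-intensity parameters from one panelist's curve."""
    i_max = max(intensities)
    t_max = times[intensities.index(i_max)]             # time to maximum intensity
    above = [t for t, i in zip(times, intensities) if i > 0]
    duration = above[-1] - above[0] if above else 0.0   # onset to extinction
    auc = sum((t2 - t1) * (i1 + i2) / 2                 # area under the curve
              for (t1, i1), (t2, i2) in zip(zip(times, intensities),
                                            zip(times[1:], intensities[1:])))
    return {"Imax": i_max, "Tmax": t_max, "duration": duration, "AUC": auc}
```

Rise and decay rates can be derived similarly from the segments before and after Tmax; parameters of this kind are what get compared across panelists and products.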
Name some products for which time intensity measurements are especially useful.
Chewing gum, wine, sweeteners, hand lotions.
What happens to the intensity of astringency when wine is repeatedly sipped?
Astringency becomes more intense (a build-up effect)
Why has time-intensity methodology been so often applied to sweeteners?
Two reasons:
1.Many artificial sweeteners have a different time course than sucrose and other ‘natural’ or carbohydrate sweeteners. This characteristic detracts from the initial consumer acceptability of foods containing these sweeteners.
2.The sweetener industry is huge.
Sketch two T-I curves – one for the sweetness of sucrose and one for the sweetness of a high potency sweetener that lingers in the mouth.
T-I curves for sucrose and a lingering high-potency sweetener
What information can you get from intensity time scaling?
This method asks panelists to scale their perceived sensations over time; it provides the sensory specialist with temporal information about perceived sensations, and allows them to quantify the continuous perceptual changes that occur in the specified attribute over time.
The following information for each sample for each panelist can be obtained: maximum intensity perceived, time to maximum intensity, rate and shape of the increase in intensity to the maximum point, the rate and shape of the decrease in intensity to half maximal intensity and to the extinction point, and the total duration of the sensation.
Name some products for which time intensity measurements are especially useful.
Chewing gum, wine, sweeteners, hand lotions.
What happens to the intensity of astringency when wine is repeatedly sipped?
Astringency becomes more intense (a build-up effect)
Why has time-intensity methodology been so often applied to sweeteners?
Two reasons:
1. Many artificial sweeteners have a different time course than sucrose and other ‘natural’ or carbohydrate sweeteners. This characteristic detracts from the initial consumer acceptability of foods containing these sweeteners.
2. The sweetener industry is huge.
Sketch two T-I curves – one for the sweetness of sucrose and one for the sweetness of a high potency sweetener that lingers in the mouth.
[Image: T-I curves for sucrose and a lingering high-potency sweetener]
Sketch a T-I curve for the melting of ice cream in the mouth. Label the axes.
[Image: T-I curve for the melting of ice cream in the mouth]
How does the process of eating a food affect the volatilization of aroma compounds?
Volatile flavors are released from food when it is eaten. In addition, flavor volatilization changes as a function of mixing with saliva, pH change, enzymatic processes such as starch breakdown by salivary amylase, warming, mechanical destruction of the food matrix, and changes in ionic strength.
Why might retronasally perceived odors differ from the same odor perceived orthonasally?
Sniffing – orthonasal
Sipping – retronasal
The flavor balance, interactions with other tastes, and time properties of release are different depending on how the food is perceived, resulting in differences between ortho and retronasal perception.
What is meant by an individual person’s T-I signature?
Unique and reproducible differences in T-I curves among individual judges; the causes are unknown, but may be due to differences in anatomy, oral manipulation, and scaling behavior.
Which T-I parameters are almost always highly correlated?
Curve size parameters are highly correlated and usually loaded on the first principal component
What is a focus group?
A qualitative research method typically involving about 10 consumers sitting around a table and discussing a product or idea with the seemingly loose direction of a professional moderator. The interview is focused in the sense that certain issues are on the agenda for discussion. The flow is not entirely unstructured, but is instead centered on a product, ad, concept or promotional materials.
Contrast Qualitative and Quantitative consumer research.
Qualitative:
-small numbers of respondents (N<12 per group)
-interactions among group members
-flexible interview flow, modifiable content
-well suited to generate ideas and probe issues
-poorly suited to numerical analysis, difficult to assess reliability
-analysis is necessarily subjective, nonstatistical

Quantitative:
-large projectable samples (N>100)
-independent judgments
-fixed and consistent questions
-poorly suited to generate ideas, probe issues
-well suited to numerical analysis, easy to assess reliability
-statistical analysis is appropriate.
Compare and contrast the use of individual interviews and focus group interviews.
Focus groups (relative to individual interviews):
-beliefs are voiced that would not easily be offered by consumers in a more structured and directed questionnaire study
-issues that were not expected beforehand can be followed up on the spot by the moderator
-participants interact; one person’s remark may bring an issue to mind in another person who may not have thought of it
-consumer contact with 12 people can be directly observed within the space of an hour, with participants collected by appointment (i.e., faster than one-on-one interviews)
Both methods:
-take a lot of time in terms of moderator travel, recruiting and screening participants, and data analysis
-can be expensive
How does one determine the reliability and the validity of focus group or individual interview results?
-use multiple moderators and more than one person’s input on the analysis
-use additional groups to see if they yield similar information to the first group
for individual interviews:
1.peer-debriefing where emerging concepts are questioned and discussed by co-investigators
2.prolonged engagement with return interviews that can assess consistency of emerging themes
3.participant checks in which conclusions and key findings can be checked with participants
What is the rationale for conducting 3 focus groups (as opposed to just 2 or more than 3)?
Three groups are used so that, if two groups conflict, one can get a sense of which group’s opinions may be more unusual. Beyond three there is diminishing marginal utility in adding groups, as repeated themes will emerge.
Three issues (nondirection, full participation and coverage of issues) are keys to good moderating. Describe them.
Nondirection – guiding the discussion without suggesting answers or directing discussion towards a specific conclusion. The moderator extracts ideas, perceptions, opinions, attitudes and beliefs from the group without imparting his or her own opinion.

Full Participation – Encouraging inclusion of all participants to ensure that all sides of an issue are raised and aired. This includes probing the lack of consensus, encouraging new opinions by asking if anyone disagrees with what’s been said, and dealing appropriately with overly talkative and too quiet participants, to ensure that all have contributed equally.

Coverage of Issues – Ensure that all issues are covered. This involves good time management, flexibility if some issues arise naturally and out of order, and understanding what the client wishes to learn from the consumers and doing all things possible to ensure these issues are included barring any major time constraints.
What are the advantages and disadvantages of having a user of the focus group data present to view the focus group discussion? Describe some strategies for overcoming the disadvantages.
Advantages:
-hear actual consumer comments in their own words with tone of voice, gestures, and body language.

Disadvantages:
-selective listening; people will often remember the comments that tend to confirm their preconceived notions about the issues
-observers tend to form their opinions long before they see the report and sometimes without the information from subsequent groups, which may be contradictory or complementary.
-possible to skew the reporting of comments by extracting them out of context in order to confirm a client’s favorite hypothesis

Strategies for overcoming disadvantages:
-the biggest proponent of a project, concept or prototype should be given the job of writing down every negative comment or consumer concern.
-give the job of writing down positive information to the biggest skeptic.
-include observers in a debriefing session held just after the group is concluded, stressing important issues and promoting a balanced view of the proceedings.
In a typical food production facility a very small proportion of products are defective. Explain how this can produce a high rate of false alarms from sensory testing.
Example: suppose 100 of 1000 products sent for testing are truly defective, and the alpha- and beta-risks of the testing program are both set at 10%. Then 10% of the time a defective product will go undetected by the evaluation (10 misses), and 10% of the time a product with no defects will be flagged due to random error. This yields 810 correct ‘pass’ decisions and 90 correct detections of sensory problems. Due to the high incidence of good products being tested, however, the 10% false-alarm rate also flags 90 products with no true sensory problem, so half of all flagged products are actually fine; the smaller the true defect rate, the worse this proportion becomes.
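The arithmetic can be checked directly. The snippet below just reproduces the example's numbers; none of these values are measured data.

```python
# Reproducing the example: 1000 tests, 10% true defect rate,
# alpha (false-alarm) and beta (miss) risks both set at 10%.
n_tests = 1000
defect_rate = 0.10
alpha = 0.10   # chance a good product is flagged
beta = 0.10    # chance a defective product is passed

good = n_tests * (1 - defect_rate)      # 900 good products
bad = n_tests * defect_rate             # 100 defective products

correct_pass = good * (1 - alpha)       # 810 correct 'pass' decisions
false_alarms = good * alpha             # 90 good products flagged anyway
hits = bad * (1 - beta)                 # 90 defects correctly detected
misses = bad * beta                     # 10 defects slip through

# Of everything flagged, the fraction with no true problem:
false_discovery_rate = false_alarms / (false_alarms + hits)  # 0.5
```

Rerunning with a 1% defect rate shows why low defect incidence is the real culprit: false alarms then outnumber true detections roughly ten to one, even though alpha is unchanged.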
In a typical food production facility a very small proportion of products are defective. Explain how you can adjust a sensory QC program to deal with the high rate of false alarms.
Build in some backup system for additional testing to ensure that marginal products are in fact defective before action is taken.
Describe several management issues (cost/time concerns) about sensory QC programs.
-defining standards and cutoffs or specification limits to ensure that consumer needs are met
-cost factors, including the need for technician time to setup and the costs of panel screening
-personnel time away from the person’s main job to come to sensory testing
-level of thoroughness in sampling needed for management comfort versus the cost of overtesting; ideally, sensory testing would occur along all stages of production, in every batch and every shift, but this is not practical.
-reporting relationships; to avoid conflict of interest (ie. QC department reporting directly to manufacturing) a separate reporting structure may be desirable so that higher executives committed to a corporate quality program can insulate the QC department from pressures to pass bad products.
-ensuring continuity in the program; management must recognize that a sensory instrument needs maintenance, calibration and eventual replacement, as well as refreshment or replacement of reference standards, and programs to ensure against downward shift due to relaxation of judgment criteria or falling standards.
How might one determine an allowable size of a specific sensory deviation in a product? For example, how could you decide what range of raspberry flavor is allowable in a raspberry yogurt?
Use ratings of overall degree of difference from a standard or control product, on a scale anchored with ‘extremely different from the standard’ at one end and ‘the same as the standard’ at the other. During training, panelists should be shown samples that represent most or all of the points along the scale, and the scale should be calibrated at an early stage using consumer input. Management then chooses some level of difference as a cutoff for action; this should be the point at which regular users of the product will notice and object to the difference, which becomes the benchmark for action standards.
Why are simple difference tests (e.g. triangle or paired comparison tests) not generally suitable as sensory QC tests?
With QC tests, testers are typically looking for a range of acceptable variation. Simple difference tests are useful for detecting any difference at all, but not for establishing that range (i.e., just because a product is found to be different from the standard does not mean it is unacceptable).
What is the advantage of including a ‘blind control’ in a sample set where several production samples are compared against a control (gold standard) sample?
The blind control is inserted into the test set to be compared against the labeled version of itself. This helps establish the baseline of responding on the scale, since two products are rarely rated as identical. In other words, it provides a false-alarm rate, or an estimate of the placebo effect.
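A minimal sketch of how the blind control sets that baseline, assuming a 0-10 degree-of-difference scale; all ratings below are invented for illustration.

```python
from statistics import mean

# Degree-of-difference ratings against the labeled standard (made-up data):
# 0 = "same as the standard" ... 10 = "extremely different"
blind_control_ratings = [1, 2, 1, 0, 2, 1]   # the standard rated against itself
production_ratings = [4, 5, 3, 4, 6, 4]      # a production sample

# The blind control's mean is the panel's false-alarm level: the nonzero
# difference reported even when the products are physically identical.
baseline = mean(blind_control_ratings)
adjusted_difference = mean(production_ratings) - baseline
```

Only the difference above the blind-control baseline, not the raw production rating, should be compared against management's action cutoff.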
How does descriptive analysis applied to quality control differ from descriptive analysis applied to research and product development?
Descriptive analysis applied to research and product development usually involves rating all sensory characteristics in order to provide a complete specification of sensory properties when comparing products. For QC purposes, attention to a few critical attributes may be appropriate.
What are the strengths of using descriptive analysis for QC?
Strengths:
-the detail and quantitative nature of the descriptive specification lends itself well to correlation with other measures such as instrumental analysis.
-it presents less of a cognitive load on panelists once they have adopted the analytical frame of mind because they are not required to integrate their various sensory experiences into an overall score, but merely to report their intensity perceptions of the key attributes.
-the reasons for defects and corrective actions are easier to infer since specific characteristics are rated, which can be more closely associated with ingredients and process factors than an overall score.
What are the weaknesses of using descriptive analysis for QC?
Weaknesses:
-tends to be more laborious in panel training than some of the other techniques.
-it is better suited for quality evaluation of finished products due to the need for data handling and statistical analysis as well as having a sufficient number of trained judges
-it may be difficult to arrange descriptive evaluation for ongoing production, particularly on later shifts if production runs around the clock
-training regimen is difficult and time consuming to set up since examples must be found for the range of intensities for each sensory attribute in the evaluation
-problems can occur with attributes not included on the scorecard and/or outside of the training set.