
Wednesday, 22 June 2016

Pigeon visual short-term memory directly compared to primates

Volume 123, February 2016, Pages 84–89
Comparative Cognition: In Honor of Ed Wasserman

Abstract

Three pigeons were trained to remember arrays of 2–6 colored squares and detect which of two squares had changed color to test their visual short-term memory. Procedures (e.g., stimuli, displays, viewing times, delays) were similar to those used to test monkeys and humans. Following extensive training, pigeons performed slightly better than similarly trained monkeys, but both animal species were considerably less accurate than humans with the same array sizes (2, 4 and 6 items). Pigeons and monkeys showed calculated memory capacities of one item or less, whereas humans showed a memory capacity of 2.5 items. Despite the differences in calculated memory capacities, the pigeons’ memory results, like those from monkeys and humans, were all well characterized by an inverse power-law function fit to d’ values for the five display sizes. This characterization provides a simple, straightforward summary of the fundamental processing of visual short-term memory (how visual short-term memory declines with memory load) that emphasizes species similarities based upon similar functional relationships. By closely matching pigeon testing parameters to those of monkeys and humans, these similar functional relationships suggest similar underlying processes of visual short-term memory in pigeons, monkeys and humans.

Keywords

  • Change detection;
  • Visual short-term memory;
  • Visual working memory;
  • Pigeons;
  • Monkeys;
  • Humans

1. Introduction

Visual short-term memory (VSTM) refers to the ability to transiently store visual information for brief time intervals of seconds to several minutes. VSTM (also called visual working memory) underlies numerous cognitive and motor functions including: detecting changes in the environment, planning and executing goal directed movements, and combining information across eye movements (e.g., Brouwer and Knill, 2007, Irwin, 1991 and Henderson, 2008). Over the past 27 years, the task of choice for investigating human VSTM has been change detection where subjects are presented with an array of visual stimuli and following a short delay they report which stimulus changed or whether (or not) there was a change (e.g., Alvarez and Cavanagh, 2004, Eng et al., 2005, Luck and Vogel, 1997, Pashler, 1988 and Rensink, 2002). More recently, the emphasis has been on identifying the nature of VSTM limitations in terms of capacity or accuracy (Anderson et al., 2011 and Bays and Husain, 2008; Devkar et al., in press; Donkin et al., 2013, Elmore et al., 2011, Gorgoraptis et al., 2011, Keshvari et al., 2013, Pashler, 1988, Rouder et al., 2008, Sims et al., 2012, Van den Berg et al., 2012, Wilken and Ma, 2004, Zhang and Luck, 2008 and Zhang and Luck, 2011).
Despite considerable effort and research to characterize the nature of human VSTM, similar studies of nonhuman animal VSTM have only recently been conducted (Buschman et al., 2011 and Devkar et al., 2015; Elmore et al., 2011, Elmore et al., 2012, Elmore and Wright, 2015, Gibson et al., 2011, Heyselaar et al., 2011 and Lara and Wallis, 2012; Lazareva and Wasserman, 2016; Wright et al., 2010). One reason for the lag in animal VSTM research is that training nonhuman animals to perform these demanding memory tasks is very time consuming, often requiring a year or more of training to achieve stable, accurate performance with delays and as many as six to-be-remembered items. Nevertheless, it is important to understand how VSTM works in species other than humans for evidence about differences and similarities, including evolutionary continuity of such a fundamental process as VSTM. Indeed, all visual memory (including long-term visual memory) begins with VSTM.
Training difficulties notwithstanding, we (and others) have developed procedures to train monkeys and pigeons to achieve reasonably accurate performance in tasks similar to some of those used to test humans. Although rhesus monkeys are not typically as accurate as humans in these tasks, nevertheless, both species have shown progressive and systematic declines in accuracy as the number of to-be-remembered items is increased (e.g., Elmore et al., 2011, Elmore and Wright, 2015 and Heyselaar et al., 2011). In some of these tasks, the basic change-detection procedure (change vs. no change; 2-stimulus test vs. all array stimuli) for animals has differed compared to that used for humans. In addition to differences in basic change-detection procedures, there are often parameter differences (item types, item number, visual angle, viewing times, delay times, intertrial times, etc.), that complicate direct species comparisons in VWM, particularly across laboratories, but even within the same laboratory. Indeed, in our experiments with monkeys we initially had used longer presentation times and shorter delay times than with humans, to promote accurate monkey performance (Elmore et al., 2011). But later we redid the experiment with those parameters matched to those used with humans, including making the items the same shape (squares) and same colors for more direct species’ comparisons (Elmore and Wright, 2015).
The results from the Elmore and Wright (2015) study showed differences in accuracy and capacity that emphasized species differences, but a continuous-resource account provided a simpler and more straightforward explanation based upon similar functional relationships that emphasize species similarities. Memory sensitivity (d’) declined precisely as an inverse power law function of N (display size) and the functions from both species were well fit by power law functions that accounted for 85% of the variance. By closer matching of monkey testing parameters to those of humans, conclusions based upon the similar functional relationships strengthened the evidence for similar VSTM processing between monkeys and humans.
The purpose of the experiment presented in this article was to test pigeons with colored-square stimuli with the same basic change-detection procedure (2-stimulus test) and the same parameters previously used to directly compare monkeys and humans (Elmore and Wright, 2015).

2. Methods

2.1. Subjects

Three White Carneaux pigeons, 5–9 years old, from the Palmetto Pigeon Plant (Sumter, SC) and Double T Farm (Glenwood, Iowa) participated in the experiment. They had previously been trained and tested in a change-detection task with colored circles (Elmore et al., 2012 and Wright et al., 2010). In the experiment presented here, testing was conducted 5 days per week. Pigeons were maintained at 85% of their free-feeding weights with free access to grit and water in their individual home cages. A 14–10 h light–dark cycle was maintained in the room containing the home cages. All animal procedures conformed to the National Institutes of Health guidelines, and were approved by the Institutional Animal Care and Use Committee at the University of Texas Health Science Center at Houston.

2.2. Apparatus

Pigeons were tested in a custom designed and built wooden testing chamber (35.9-cm wide × 45.7-cm deep × 51.4-cm high). A custom-built wooden grain hopper tray containing mixed grain was centered below a 17-in Eizo T550 color monitor (800 × 600) and was accessed by pigeons through an opening (5.1 × 5.7 cm) centered in the front panel 3.8 cm above the chamber floor. An infrared touch screen (Carroll Touch, Round Rock, TX) detected responses and interfaced with the computer. An exhaust fan was located at the back of the chamber. A houselight (Chicago Miniature #1829, 24 V) located in the center of the ceiling illuminated the pigeon’s portion of the chamber during intertrial intervals (ITI).
Custom software written with Visual Basic 6.0 on a Dell Optiplex GX110 recorded and controlled all events in the operant chamber. A video card (ATI 3D Rage Pro AGP 2X, Ontario Canada) controlled graphics generated by the computer and a computer-controlled relay interface (Model no. PI0-12, Metrabyte, Taunton, MA) operated the grain-hopper, hopper light, and chamber light.

2.3. Stimuli & displays

The stimuli were six approximately 1.4-cm colored squares (RGB 24 bit values: aqua—0, 255, 255, blue—0, 0, 255, green—0, 255, 0, magenta—255, 0, 255, red—255, 0, 0, yellow—255, 255, 0) like those shown in Fig. 1. The stimuli were presented in random locations on an invisible 4 × 4 matrix (9 cm horizontal and 7 cm vertical). (These stimuli and displays were sized to compensate for the pigeons’ closer proximity to the screen than monkeys and humans.)
Fig. 1. Progression of events for two trials with colored squares in the change detection task.

2.4. Testing procedures

Following extensive training to steady-state performance accuracy, the pigeons were tested for 12 (pigeon G345 and P8040) or 13 (pigeon P8893) consecutive sessions with 96 trials per session. Trials began with a 1000 ms presentation of 2, 3, 4, 5, or 6 colored squares in random positions within the 4 by 4 matrix. The number of items in the sample display (display size) was randomized across trials. The sample array disappeared for a delay interval which was 50 ms for half of the trials or 1000 ms for the other half of trials, randomly intermixed. Two delay intervals were used in order to encourage vigilance in the task. For the results presented here, only the 1000-ms delay trials were included. After the delay, two colored squares were presented, one of which had changed in color from the sample display. The pigeons were required to peck the colored square that had changed in order to receive grain reinforcement. Correct responses were followed by access to the grain hopper for 3–5 s depending upon the individual pigeon to control weight. Incorrect responses were followed by a click noise, but no grain reinforcement presented. Following reinforcement or an incorrect response the chamber was illuminated for a 15-s ITI and then darkened at the beginning of the next trial.
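The trial structure described above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the Visual Basic software used in the experiment. The function name `make_trial`, the dictionary layout, the rule that colors do not repeat within a sample display, and the rule for choosing the replacement color are all assumptions; the text does not specify them. The delay is drawn at random per trial, which only approximates the exact half-and-half split described.

```python
import random

COLORS = ["aqua", "blue", "green", "magenta", "red", "yellow"]
GRID_CELLS = 16                 # 4 x 4 matrix, cells numbered 1..16
DISPLAY_SIZES = [2, 3, 4, 5, 6]
DELAYS_MS = [50, 1000]


def make_trial(rng):
    """Build one hypothetical change-detection trial as a plain dict."""
    n_items = rng.choice(DISPLAY_SIZES)
    locations = rng.sample(range(1, GRID_CELLS + 1), n_items)
    # Assumption: colors are drawn without replacement within a display.
    colors = rng.sample(COLORS, n_items)
    sample = dict(zip(locations, colors))

    # Two sample locations are shown again at test; one of them changes color.
    test_locations = rng.sample(locations, 2)
    changed = rng.choice(test_locations)
    # Assumption: the new color is any color other than the item's old color.
    new_color = rng.choice([c for c in COLORS if c != sample[changed]])

    test = {loc: sample[loc] for loc in test_locations}
    test[changed] = new_color
    return {
        "sample": sample,                 # location -> color, shown for 1000 ms
        "delay_ms": rng.choice(DELAYS_MS),
        "test": test,                     # the two-item test display
        "changed_location": changed,      # pecking here would be reinforced
    }


trial = make_trial(random.Random(0))
print(trial)
```

A seeded `random.Random` instance is passed in so that a session of trials can be regenerated exactly, which is a design convenience of the sketch rather than something stated in the Methods.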

3. Results and discussion

Fig. 2, top panel, shows percent-correct performance and its decline as a function of display size. A one-way ANOVA revealed a significant effect of display size for the pigeons (F = 9.31, p < 0.0001, partial η² = 0.18).
Fig. 2. Pigeon change-detection performance and models. Top panel: percent correct in the change detection task with colored squares. Middle panel: capacity estimates calculated based on change detection performance. Lower panel: power law fits for d’ values calculated based on change detection performance. Error bars represent standard error of the mean.

3.1. Testing models of VSTM

3.1.1. Fixed-capacity model.

VSTM capacity was calculated using the following Eq. (1), originally developed by Eng et al. (2005).
$$A = \left[1 - \left(\frac{N-C}{N}\right)^{2}\right] \times 100\% + \left(\frac{N-C}{N}\right)^{2} \times 50\% \tag{1}$$
In this equation, A is the empirical accuracy, N is the display size tested, and C is VSTM capacity. The likelihood that a single test item was not among the C items remembered is (N−C)/N, and the likelihood that both test items were not among the C items remembered is [(N−C)/N]². If at least one of the two test items was among the C items remembered, then accuracy is 100%, whereas if neither was among the C items remembered, then accuracy is 50%, because the participant would be guessing. Thus, the equation above can be used to compute capacity, given that C does not exceed N. For each pigeon and each display size, capacity was computed by solving for C, as shown in Fig. 2, middle panel. Averaged across display sizes and pigeons, the pigeons’ mean capacity for colored squares was 0.77 ± 0.11 (95% CI). Thus, pigeons were on average maintaining somewhat less than one colored-square stimulus in memory during the delay interval, according to the fixed-capacity model.
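Solving the fixed-capacity equation for C has a closed form. The sketch below assumes accuracy is expressed as a proportion (0.5 to 1.0) rather than a percentage, so the relation reads A = 1 − 0.5 × [(N−C)/N]², which rearranges to C = N × (1 − √(2(1 − A))). The function names are hypothetical, and the original analysis may have solved for C numerically instead.

```python
import math


def accuracy_eq1(capacity, n_items):
    """Predicted proportion correct for capacity C and display size N."""
    p_neither = ((n_items - capacity) / n_items) ** 2
    return 1.0 * (1.0 - p_neither) + 0.5 * p_neither


def capacity_eq1(accuracy, n_items):
    """Invert the relation above: C = N * (1 - sqrt(2 * (1 - A)))."""
    p_neither = 2.0 * (1.0 - accuracy)   # equals ((N - C) / N) ** 2
    return n_items * (1.0 - math.sqrt(p_neither))


# Worked example: 77 percent correct with a two-item display.
print(round(capacity_eq1(0.77, 2), 2))
```

For 77 percent correct at display size 2 this gives roughly 0.64 items. Accuracy below 50% would yield a negative estimate, so any real analysis would need a rule for such cases; none is stated in the text.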
However, it is worth noting that Eq. (1)’s capacity formula assumes that if subjects do not remember one of two sample display items being tested, but remember the other one was the same as the sample item, then they can infer that the not-remembered item must be the one that changed. However, it is unknown whether or not pigeons make this inference. If they do not make this inference, then it could be assumed that the correct response is made if the changed item was remembered (probability of C/N), or by a correct guess of 0.5 (we are indebted to Nelson Cowan for this suggestion). In this case, Eq. (1) would become:
$$A = \frac{C}{N} \times 100\% + \left(1 - \frac{C}{N}\right) \times 50\% \tag{2}$$
Eq. (2) actually produces somewhat larger capacity estimates, since correct answers are based on VSTM for changes plus guessing (but not inference) when the VSTM does not detect a change. With this equation, the pigeon’s mean capacity would be 1.14 ± 0.18, thus producing a modest rise in the maximum capacity estimate for pigeons. Nevertheless, if the pigeons’ VSTM capacity limit is one item or less, then survival by avoiding predators, finding and remembering food sources, identifying one’s mate, building one’s nest, and training one’s chicks would be difficult if not impossible. However, pigeons do succeed very well in doing these things. Resolution of this seemingly contradictory capacity is not likely to lie solely with a lack of ecological validity of the colored-square stimuli used in this experiment. Performance of 77 percent correct with two colored squares is very respectable accuracy for pigeons in this difficult task; indeed slightly better than monkeys as discussed below. Perhaps pigeons remembered two items perfectly on some trials, but at other times remembered no items (inattention), resulting in a mean capacity that vacillates around 1 item. Yet another possibility, considered in the next section, is that all memory items are encoded and remembered imperfectly—for example imperfect memory might be distributed across many if not all items to-be-remembered. Furthermore, distribution of a limited memory resource would result in less perfect memory as the number of to-be-remembered items increased.
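To illustrate why the no-inference version yields larger estimates, the following sketch inverts both relations for the same accuracy. It assumes accuracy is a proportion, so the no-inference relation is A = C/N + 0.5 × (1 − C/N), giving C = N × (2A − 1), while the inference version gives C = N × (1 − √(2(1 − A))). The function names and the accuracy values in the loop are hypothetical and are not data from the experiment, apart from the 77 percent figure for two items quoted above.

```python
import math


def capacity_with_inference(accuracy, n_items):
    """Capacity when either remembered test item supports a correct choice."""
    return n_items * (1.0 - math.sqrt(2.0 * (1.0 - accuracy)))


def capacity_without_inference(accuracy, n_items):
    """Capacity when only memory for the changed item (or a guess) helps."""
    return n_items * (2.0 * accuracy - 1.0)


# 77 percent correct with two items, as reported for the pigeons.
print(round(capacity_with_inference(0.77, 2), 2),
      round(capacity_without_inference(0.77, 2), 2))

# Hypothetical accuracies, only to show the ordering of the two estimates.
for acc in (0.55, 0.65, 0.75, 0.85, 0.95):
    for n in (2, 4, 6):
        assert capacity_without_inference(acc, n) >= capacity_with_inference(acc, n)
```

For 77 percent correct with two items, the no-inference estimate is 1.08 items against roughly 0.64 with inference. The ordering holds for every accuracy between 50% and 100%, because √(2(1 − A)) is at least 2(1 − A) on that range.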

3.1.2. Continuous-resource model

The continuous-resource model employs d’ values from signal detection theory as a measure of memory sensitivity (Wilken and Ma, 2004, Green and Swets, 1966, Macmillan and Creelman, 2005 and Elmore et al., 2011). d’ was computed using Eq. (3).
$$d' = \frac{z(\text{hits}) - z(\text{false alarms})}{\sqrt{2}} \tag{3}$$

The difference between the z scores of the hit and false-alarm rates is divided by the square root of 2. The division by the square root of 2 is necessary because the task is a two-alternative forced-choice task (2AFC) and there are two ways to make a correct response: by remembering that one item is the same as an item in the sample display (and choosing the other), or by noticing the item that has changed and choosing it (Macmillan and Creelman, 2005). Hits and false alarms were defined based on stimulus location in the test display. Locations were numbered from 1 to 16, running from left to right and then down into the row below, such that the bottom right corner was location number 16. A hit was defined as a correct response to the lower numbered location in the test display. So, if test stimuli were displayed in locations 2 and 9 and the stimulus in 2 was the changed object, a correct response to location 2 would constitute a hit. A false alarm was defined as a response to the lower numbered location when that location did not contain the changed item. The definitions of a “hit” and a “false alarm” are arbitrary but equivalent to the obverse. The d’ values for each stimulus type and display size are plotted in Fig. 2, lower panel. There was little or no response bias for colored squares, in keeping with findings from humans in two-alternative forced-choice as opposed to yes-no procedures (Green and Swets, 1966, p. 408).
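The hit and false-alarm definitions above translate directly into a short computation. The sketch below is an assumption-laden illustration: the trial encoding (pairs of booleans), the function names, and the example trials are hypothetical, and no correction is applied for rates of exactly 0 or 1, for which the z transform is undefined. The text does not say how such cases were handled.

```python
from math import sqrt
from statistics import NormalDist


def rates_from_trials(trials):
    """trials: list of (changed_at_lower, chose_lower) boolean pairs.

    Hit rate: P(chose lower-numbered location | change was there).
    False-alarm rate: P(chose lower-numbered location | change was elsewhere).
    """
    when_lower = [chose for at_lower, chose in trials if at_lower]
    when_higher = [chose for at_lower, chose in trials if not at_lower]
    return sum(when_lower) / len(when_lower), sum(when_higher) / len(when_higher)


def dprime_2afc(hit_rate, false_alarm_rate):
    """d' for a two-alternative forced-choice task: (z(H) - z(F)) / sqrt(2)."""
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return (z(hit_rate) - z(false_alarm_rate)) / sqrt(2)


# Hypothetical mini-session: 4 trials with the change at the lower location
# (3 correct) and 4 trials with the change at the higher location (3 correct).
trials = [(True, True)] * 3 + [(True, False)] + [(False, True)] + [(False, False)] * 3
hit_rate, fa_rate = rates_from_trials(trials)
print(hit_rate, fa_rate, dprime_2afc(hit_rate, fa_rate))
```

With no response bias the hit rate and one minus the false-alarm rate are equal, so d’ reduces to √2 × z(proportion correct).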
The d’ values were fit by power-law functions. As anticipated, power-law functions were found to be good fits to each individual pigeon’s d’ values as well as to the group mean. Power-law functions should provide good fits according to the continuous-resource model, because memory sensitivity (d’) should be proportional to 1/N, where N is the number of items in the display. The general form of the power law function is shown in Eq. (4), where Y is a constant, N is the number of items in the sample display, and x is the exponent.
$$d' = Y \times N^{-x} \tag{4}$$
Eq. (4) was fitted to the d’ values computed from the results shown in Fig. 2, top panel, resulting in d’ = 3.18 × N^−1.03, which fits well, as shown in Fig. 2, lower panel. The fitted function accounts for 84 percent of the variance (R² = 0.84).
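One common way to fit a power law of the form d’ = Y × N^−x is ordinary least squares on log-transformed values, since ln d’ = ln Y − x ln N is a straight line. The sketch below takes that route; the text does not state which fitting procedure was actually used, so this is an assumption, as is the function name. The input values are synthetic, generated from the reported fit itself purely to check that the procedure recovers the parameters, and they are not the pigeons' data.

```python
import math


def fit_power_law(ns, dprimes):
    """Least-squares fit of d' = Y * N ** (-x) in log-log space.

    Returns (Y, x, r_squared), where r_squared is computed on the
    log-transformed values.
    """
    log_n = [math.log(n) for n in ns]
    log_d = [math.log(d) for d in dprimes]
    mean_n = sum(log_n) / len(log_n)
    mean_d = sum(log_d) / len(log_d)
    sxx = sum((a - mean_n) ** 2 for a in log_n)
    sxy = sum((a - mean_n) * (b - mean_d) for a, b in zip(log_n, log_d))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_n
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(log_n, log_d))
    ss_tot = sum((b - mean_d) ** 2 for b in log_d)
    return math.exp(intercept), -slope, 1.0 - ss_res / ss_tot


display_sizes = [2, 3, 4, 5, 6]
# Synthetic, noise-free d' values generated from the reported function.
synthetic = [3.18 * n ** -1.03 for n in display_sizes]
coefficient, exponent, r_squared = fit_power_law(display_sizes, synthetic)
print(coefficient, exponent, r_squared)
```

An R² computed in log space is not directly comparable to one computed on untransformed d’ values, so the 0.84 reported above should not be expected to match this quantity exactly on real data.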

3.2. Comparisons to monkeys and humans

3.2.1. Percent correct comparisons

The results for monkeys and humans are shown in Fig. 3 and Fig. 4, respectively (Elmore et al., 2011 and Elmore and Wright, 2015). Much like the results for pigeons presented here (Fig. 2, top panel), performance by monkeys and humans (Fig. 3 and Fig. 4, top panels) declined as a function of display size. On average, humans outperformed pigeons and monkeys by 15.4% and 27.3%, respectively. A repeated measures ANOVA of display size (2, 4, and 6 only) × species revealed a main effect of display size [F(2,2) = 20.79, p = 0.0004, partial η² = 0.78] and species [F(2,2) = 15.68, p = 0.0012, partial η² = 0.82]. There was no significant interaction of display size × species [F(2,4) = 1.95, p = 0.18, partial η² = 0.46].
Fig. 3. Monkey change-detection performance and models. Top panel: percent correct in the change detection task with colored squares. Middle panel: capacity estimates calculated based on change detection performance. Lower panel: power law fits for d’ values calculated based on change detection performance. Error bars represent standard error of the mean.
Fig. 4. Human change detection performance and models. Top panel: percent correct in the change detection task with colored squares. Middle panel: capacity estimates calculated based on change detection performance. Lower panel: power law fits for d’ values calculated based on change detection performance. Error bars represent standard error of the mean.

3.2.2. Fixed capacity comparisons

Capacity estimates also differed among species. Not surprisingly, the humans’ capacity estimates were the largest with a mean of 2.5 ± 0.4. Thus, on the average humans could remember 1.5 to 2.0 more colored squares than pigeons or monkeys according to the fixed-capacity model of VSTM.
Although the mean capacity estimate was somewhat lower than typically found for humans, other researchers using similar procedures (two-item test displays with one item changed) showed virtually identical capacities (2.4–2.5) for colored shapes (Eng et al., 2005, Experiment 1A). Somewhat different change-detection procedures (e.g., testing the entire sample display with one object changed) have shown larger capacities of 3.6 for colors (e.g., Alvarez and Cavanagh, 2004), more similar to the claimed 4 ± 1 human capacity limit (Cowan, 2001, 2005). But presenting all (unchanged) sample items in the test display has been shown to provide additional context cues that artificially enhance VWM capacity measures (Chun and Jiang, 1998 and Jiang et al., 2000).
Two-item test displays break up these configural-pattern context cues, thereby providing better comparisons across the range of sample-display sizes tested, and likely level the playing field for nonhuman animals (e.g., pigeons) that may not use enhancing configural-pattern strategies to the degree that humans do. Moreover, 2-item test displays are better suited for testing animals because chance performance is 50% correct, so difficulties (e.g., extinction) encountered with larger test displays are avoided (e.g., with 6-item test displays chance performance would be 16.7%), and they are equally well suited for testing humans for direct species comparisons.
In conclusion, the fixed-capacity model may be a reasonable measure to compare human VSTM, under some conditions, but is not very meaningful when applied to pigeon and monkey VSTM with capacities less than one item.

3.2.3. Continuous resource comparisons

We also characterized monkey and human performances using d’ values according to the continuous-resource model. The d’ values for these two species are, like those for pigeons, well characterized by power law functions as shown in Fig. 3 and Fig. 4, lower panels. Exponents characterize the rapidity of change in d’ over the range of set sizes tested. The exponents were −1.03, −0.94, and −0.86 for pigeons, monkeys, and humans, respectively; a one-way ANOVA showed no significant differences in exponent value across species [F(2,8) = 0.24, p = 0.79, partial η² = 0.06]. Multiplicative coefficients of the power law characterize the ‘level’ of the function, the Y-axis intercept, extrapolated memory for a single item, and overall memory sensitivity (d’). Not surprisingly, humans far surpass the animals in this measure. A one-way ANOVA showed a significant difference in coefficient across species [F(2,8) = 4.47, p = 0.049, partial η² = 0.65]; interestingly, pigeons showed a marginally superior coefficient compared to monkeys (coefficients of 3.18, 2.27, and 6.34 for pigeons, monkeys, and humans, respectively). However, there was no significant difference between the pigeon and monkey coefficients (t-test, t = 1.05, p = 0.19). The coefficient for humans was significantly greater than those for monkeys (t-test, t = 2.1, p = 0.04) and pigeons (t-test, t = 2.05, p = 0.04).

4. Conclusions

In this article, we have shown that pigeons can be trained and tested in change-detection tasks with parameters closely matching those used to test rhesus monkeys and humans, and thereby provide more direct VSTM comparisons. Humans have an advantage over nonhuman animals in that they come to the task with a lifetime of game-playing and test-taking experience, requiring little or no training, except instructions to reach asymptotic accuracy. Pigeons and monkeys, on the other hand, learn the ‘rules of the task’ through the contingencies of reinforcement. With continued training, pigeons in this study maintained accurate performance similar to that shown in a previous study (Wright et al., 2010), despite progressive changes that made the task more difficult (shorter 1-s viewing times, shorter 1-s delay times, and larger stimulus sets of 6 colored squares) eventually matching those used to test humans (Elmore et al., 2011) and monkeys (Elmore and Wright, 2015). We observed no major differences in the pigeon’s performance for most of these changes, including the two delays (the exception of course being memory set size which is the major independent variable). Early change-detection research using very short delays (less than a few hundred milliseconds) raised concerns of attentional capture and artifact-mediated performance (e.g., Pashler, 1988). This is why we chose to use the 1000-ms delay; to rule out issues of attentional capture as opposed to (proper) memory based on VSTM.
Findings and conclusions from the better matched experiment of this article support and strengthen many conclusions from our previous experiments (Elmore et al., 2011, Elmore and Wright, 2015 and Wright et al., 2010). Among these conclusions is that all of these species, pigeons, monkeys and humans, have very limited VSTM. They are only able to accurately remember relatively small amounts of visual information over the course of a brief delay, and as such, VSTM accuracy declines with display size. Said otherwise, the more items that the participants must remember, the lower will be their accuracy. The research presented in this article also highlights and supports species’ VSTM differences shown in these experiments. The pigeons and monkeys were less accurate than humans at similar set-sizes (2, 4, & 6 items) and showed calculated capacities of one item or less with colored squares, similar to what monkeys had also shown for colored circles (Elmore et al., 2011). Implications of such findings supported a continuously distributed resource account of VSTM as opposed to all-or-nothing fixed-capacity account. Recently, other theories and models have been proposed that blend these accounts, plus corrections for some other factors (e.g., attention lapse rates) that have been added—all of which make model distinctions more difficult, particularly with the procedures we used in this experiment (e.g., Donkin et al., 2013, Gorgoraptis et al., 2011, Keshvari et al., 2013, Sims et al., 2012, Van den Berg et al., 2012, Van den Berg et al., 2014 and Zhang and Luck, 2011). An example of such an account for pigeon fixed capacity involved multiple-item changes in a change/no-change detection procedure, resulting in a mean capacity of 1.9 items for pigeons when corrected for a mean lapse rate (random guessing) of 18 percent (Gibson et al., 2011).
Despite differences in accuracy and capacity that emphasize species differences, the continuous-resource account provides a simple and straightforward explanation based upon similar functional relationships which emphasize species similarities (cf., Wright, 2013; also see Young, 2016, for similar arguments about the utility of functional relationships). Memory sensitivity (d’) is specified to decline precisely as an inverse power law function of N (display size). The d’ values from pigeons as well as monkeys and humans were well fit by these power-law functions. The additional constraint of five display sizes for pigeons and monkeys (similar to what was used with humans) increased the power in the continuous-resource model tests, which nevertheless strongly supported this model and accounted for roughly 85% of the variance.
By closely matching pigeon and monkey testing parameters to those of humans, conclusions based upon the qualitatively similar functional relationships shown here strengthen the evidence for similar VSTM processing across these diverse species, suggesting strong evolutionary convergence of memory processing and the relevance of future research involving invasive manipulations and recordings to mechanisms of visual short-term memory generally.

Acknowledgements

Support for this research was provided by NIMH Grants R01MH-072616 and R01MH091038 (A.A. Wright). We thank Jeffrey S. Katz, Wei Ji Ma, and John Magnotti for their help and support on early stages of this research. This research was conducted following the relevant ethics guidelines for research with animals and was approved by UTHSC’s institutional IACUC.

References

    • Devkar et al., 2015: Devkar, D.T., Wright, A.A., Ma, W.J. (in press). Monkeys show the same underlying visual short-term memory processes as humans. Journal of Vision.
    • Green and Swets, 1966: Green, D.M., Swets, J.A. (1966). Signal Detection Theory and Psychophysics. Wiley, New York.
    • Henderson, 2008: Henderson, J.M. (2008). Eye movements and visual memory. Visual Memory, Oxford University Press, Oxford, pp. 87–121.