Geographic variation in bird vocalizations is common and has been associated with genetic differences and speciation, as well as with short-term changes in response to anthropogenic noise. Because vocalizations are used for individual recognition in many species, geographic variation in these traits may affect mate choice, pair bonding, and territory defense. Anecdotal evidence suggests the existence of geographic variation in vocalizations between isolated populations of Gentoo Penguins (Pygoscelis papua), but there have been no comprehensive studies of Gentoo Penguin vocalizations across a broad geographic range. We used acoustic recordings of ambient colony sound at 22 breeding colonies in the Antarctic Peninsula and South Shetland Islands, South Georgia, the Falkland Islands, and Argentina to address 2 main questions regarding Gentoo Penguin vocalizations: (1) How do ecstatic display calls vary both within and between individuals, colonies, and regions? (2) Can ecstatic display calls be used to distinguish subspecies? We found high levels of variation between individuals and between colonies, but little additional variation between regions or subspecies. We found no trends to suggest a latitudinal gradient in vocal characteristics, although we did find that some measures varied with relative distance between colonies. Although we found significant differences at the colony level, unknown calls could not easily be categorized to colony or region by machine learning. We conclude that the vocal soundscape of each colony is driven by variation between individuals within a colony and, developing independently from neighboring colonies, becomes differentiated from other colonies through a process of drift. Although individual calls could, in most cases, be identified to subspecies by machine learning, our analysis suggests that subspecies differences may be driven by variation among colonies and that subspecies identification may be unreliable using acoustics alone.
Vocal communication has been widely studied in birds and is known to be important for mate choice, pair bonding, and territorial defense. In a noisy environment, vocal distinctiveness allows mates to recognize each other within a breeding season as well as across seasons (e.g., Aubin and Jouventin 1998, Leader et al. 2002, Tibbetts and Dale 2007) and allows individuals to communicate information such as fitness (de Kort et al. 2009) and relatedness (McDonald and Wright 2011). Variation in vocal traits may be associated with geographic isolation (e.g., Wright 1996, Dalisio et al. 2015, Shizuka et al. 2016), speciation (Mulard et al. 2009, Pieplow and Francis 2011, Greig and Webster 2013), and range shifting (Xing et al. 2013), and can even occur over short time scales in response to anthropogenic noise (Rheindt 2003, Villain et al. 2016).
Vocal characteristics in penguins are relatively understudied compared to other bird taxa, but studies have shown gradual interspecies differentiation over time (Thumser et al. 1996, Favaro et al. 2016). Penguins that build nests or burrows to incubate eggs and chicks can use geographic cues to guide them to their nest, and thus their calls may be less complex than those of King Penguins (Aptenodytes patagonicus) or Emperor Penguins (A. forsteri), which use vocalizations to identify a mate or chick within massive, noisy colonies (Searby and Jouventin 2005). However, despite this reduced complexity, individual recognition has been observed in Rockhopper Penguins (Eudyptes chrysocome; Searby and Jouventin 2005), Adélie Penguins (Pygoscelis adeliae; Speirs and Davis 1991), Gentoo Penguins (P. papua; Speirs and Davis 1991, Jouventin and Aubin 2002), African Penguins (Spheniscus demersus; Seddon and van Heezik 1993, Favaro et al. 2016), Magellanic Penguins (S. magellanicus; Clark et al. 2006), and Macaroni Penguins (E. chrysolophus; Searby et al. 2004). These studies all indicate that while there may be differences in complexity between species, vocalizations play an important role in behavior across all penguins.
Gentoo Penguins are distributed widely across geographically isolated sub-Antarctic islands in the Atlantic, Indian, and Pacific oceans as well as the Antarctic Peninsula (Lynch 2013). In the Atlantic region of their range, the polar front creates a strong ecological boundary between populations in Argentina and the Falkland Islands and those in South Georgia, the South Sandwich Islands, the South Orkney Islands, and the Antarctic Peninsula (Figure 1). This geographic and ecological isolation, combined with high mate and colony fidelity (Lynch 2013), results in strong population genetic structure between regions (Levy et al. 2016). There are currently 2 described subspecies, originally based heavily on morphology (Stonehouse 1970) and now confirmed with genetics (de Dinechin et al. 2012, Levy et al. 2016). Pygoscelis papua papua lives above the polar front in the Falkland Islands, and P. p. ellsworthii on the Antarctic Peninsula and sub-Antarctic islands below the polar front (Levy et al. 2016). Gentoo Penguins have recently colonized Isla Martillo in the Beagle Channel in Argentina, though their subspecies designation is not yet known. De Dinechin et al. (2012) proposed a third subspecies for the sub-Antarctic islands above the polar front in the Indian and Pacific oceans.
The ecstatic display call is the most common contact call used by Gentoo Penguins and serves to attract and contact mates, though in some cases it is used in the absence of a mate and without obvious provocation; in these situations, its function remains unknown. Regardless, the ecstatic display call can easily be distinguished from the calls associated with pair bowing as well as from the calls of the other Pygoscelis species (Jouventin 1982). It is characterized by a series of repeated pairs of syllables, each comprising a long exhale followed by a short inhale; the number of syllables per call is highly variable (Figure 2). Prior to recent genetic evidence, several authors noted differences in Gentoo Penguins across broad ecoregions and included assessments of vocal similarity. Jouventin (1982) noted that although ecstatic display calls are similar between Macquarie Island, Kerguelen Islands, and Crozet Island, these calls differ from those heard in the Falkland Islands, South Orkney Islands, and South Georgia. Both Jouventin (1982) and de Dinechin et al. (2012) have suggested that ecstatic display calls might therefore be used as an indicator of geographic and reproductive isolation.
In order to more fully investigate vocalizations, we undertook a survey of Gentoo Penguin ecstatic display calls across the Antarctic Peninsula and South Shetland Islands, South Georgia, the Falkland Islands, and Argentina to address 2 main questions: (1) How do ecstatic display calls vary both within and between individuals, colonies, and regions? (2) Can ecstatic display calls be used to distinguish subspecies? These questions address a knowledge gap in both our basic understanding of vocalizations of Gentoo Penguins and how those vocalizations differ in a highly site-faithful bird with a broad geographic range.
Passive soundscape audio recordings were taken during the breeding season at 22 Gentoo Penguin colonies in the Antarctic Peninsula and South Shetland Islands, South Georgia, the Falkland Islands, and Argentina (Figure 1 and Table 1) using Song Meter SM2+ recorders (24,000 Hz sampling rate, stereo recordings). Recordings were taken with stationary units and were not targeted at specific individuals, and as such they recorded the ambient soundscape of the colony from which high-quality individual calls were selected. Audio recorders were placed 3–5 m from one or more small subgroups of nesting Gentoo Penguins within each colony and paired with either a video recorder (GoPro Hero3+) or a time-lapse camera (Brinno TL200) that were used in subsequent analysis to identify, where possible, the individuals associated with each vocalization. Birds were neither tagged nor marked but were identified by the location of the nest they were incubating. All recordings were from colonies during egg or chick incubation, such that only one parent was attending the nest during recordings and usually remained on the nest for the duration of the recording (approximately 2–4 hr). Because the highest-quality audio recordings were frequently from individuals not captured on video (e.g., nearby in the colony but not within the camera frame), not all of the recorded ecstatic display calls could be identified to individual.
Table 1. Sampling locations in the Antarctic Peninsula, South Georgia, the Falkland Islands, and Argentina, with colony size at time of sampling (number of breeding pairs), number of ecstatic calls used in the analysis, and number of those ecstatic calls that could be identified to individual.
Ecstatic display calls were analyzed in Raven sound analysis software (Bioacoustics Research Program 2014; window size = 625 samples, overlap = 65%, DFT size = 2,048 samples). Ecstatic display calls were identified within the recordings using a band-limited energy detector, selected on the basis of quality, and manually classified. We defined the ecstatic display call as any call that followed the pattern described in Jouventin (1982) with a repeated series of long, low-frequency exhale syllables and short, higher-frequency inhale syllables. Although the mutual display call is almost identical to the ecstatic display call (Jouventin 1982), only calls made by a single individual were selected, so it is highly unlikely that any mutual display calls were included in this analysis. Given that recordings were usually taken between mid-morning and late afternoon, pair exchange on the nest was unusual, further decreasing the likelihood of mutual display calls being included in the analysis. Duration, center frequency, 5% frequency, 95% frequency, peak frequency contour (PFC), PFC slope, and peak frequency inflection points were measured for each individual syllable as well as for the entire call (Figure 2 and Table 2). A total of 544 calls were analyzed from 14 colonies in the Antarctic Peninsula and South Shetland Islands (n = 359 calls), 5 colonies in South Georgia (n = 117 calls), 2 colonies in the Falkland Islands (n = 41 calls), and 1 colony in Argentina (n = 27 calls). Of those calls, 183 were identified to individual (Figure 1 and Table 1). Because the number of syllables was highly variable, we included in our analysis only measurements for the entire call and for the first 2 syllables (the first exhale and first inhale) of each call.
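As a rough illustration of what the reported spectrogram settings imply, the sketch below converts window size, overlap, and DFT size into time and frequency units, assuming the 24,000 Hz sampling rate reported for the recorders. The function name and the returned keys are hypothetical and are not part of Raven; the software may also round the hop to a whole number of samples, which this sketch does not do.

```python
def spectrogram_resolution(sample_rate_hz, window_samples, overlap_fraction, dft_samples):
    """Convert spectrogram settings into time and frequency units."""
    # Samples between the starts of successive analysis windows.
    hop_samples = window_samples * (1.0 - overlap_fraction)
    return {
        "window_ms": 1000.0 * window_samples / sample_rate_hz,
        "hop_ms": 1000.0 * hop_samples / sample_rate_hz,
        # Spacing of the frequency grid produced by the DFT.
        "bin_spacing_hz": sample_rate_hz / dft_samples,
        # Highest frequency representable at this sampling rate.
        "nyquist_hz": sample_rate_hz / 2.0,
    }


# Settings reported in the text: 625-sample window, 65% overlap, 2,048-sample DFT.
settings = spectrogram_resolution(24000, 625, 0.65, 2048)
for name, value in settings.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions each analysis window spans about 26 ms, successive windows start about 9 ms apart, and the frequency grid spacing is about 11.7 Hz.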
Table 2. Descriptions of spectrogram measurements used in our analysis (PFC = peak frequency contour). Measurements were chosen from a suite of measurements in Raven sound analysis software.
Ecstatic display calls were selected only if they could be isolated without any interference from other animal vocalizations (e.g., chicks, flying birds, elephant seals) or from other background noise. Given that there may be differences in acoustic environment between sites and especially between regions (e.g., rock and ice habitat on the Antarctic Peninsula and tussock grass habitat in South Georgia), background noise was filtered out from each selection made in Raven sound analysis software. The low-frequency filter was minimized for each site and ranged from 100 to 150 Hz. After analyzing background noise at select colonies from each region, we found that although the center frequency of background noise was highest in South Georgia and the Antarctic Peninsula, the 95% frequency of background noise was consistently below the 5% frequency of any ecstatic call measured, minimizing the possibility that background noise interfered with our analysis.
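The comparison between the 95% frequency of background noise and the 5% frequency of calls can be illustrated with a minimal sketch. It assumes that each quantile-based measure is the lowest frequency at which cumulative spectral power reaches the stated fraction of the total; the exact computation in Raven may differ. The function names and the toy spectra are invented for illustration and are not measured data.

```python
def quantile_frequency(freqs_hz, power, fraction):
    """Lowest frequency at which cumulative power reaches `fraction` of the total."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power spectrum must contain positive energy")
    running = 0.0
    for freq, p in zip(freqs_hz, power):
        running += p
        if running >= fraction * total:
            return freq
    return freqs_hz[-1]


def noise_below_call(noise_spectrum, call_spectrum):
    """True if the noise 95% frequency lies below the call 5% frequency."""
    noise_freqs, noise_power = noise_spectrum
    call_freqs, call_power = call_spectrum
    noise_95 = quantile_frequency(noise_freqs, noise_power, 0.95)
    call_05 = quantile_frequency(call_freqs, call_power, 0.05)
    return noise_95 < call_05


# Invented toy spectra on a shared 100 Hz grid (not values from the study).
freqs = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
call_power = [0, 0, 2, 4, 10, 4, 1, 0, 0, 0]
noise_power = [10, 1, 0, 0, 0, 0, 0, 0, 0, 0]

call_05 = quantile_frequency(freqs, call_power, 0.05)
call_center = quantile_frequency(freqs, call_power, 0.50)
call_95 = quantile_frequency(freqs, call_power, 0.95)
separated = noise_below_call((freqs, noise_power), (freqs, call_power))
```

In this sketch the center frequency is treated as the 50% quantile, the same construction as the 5% and 95% measures.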
Call measurement data were standardized and then visualized with principal component analysis (PCA). A nested random-effects analysis of variance (ANOVA) on the first principal component was used to partition variation between individuals, colonies, regions, and subspecies. We first ran the nested ANOVA on only those calls (n = 183) from individuals that could be identified. We then repeated the analysis on the entire dataset (n = 544), using 2 different assumptions about the identity of unknown individuals (thus covering the range of possible pseudoreplication among unidentified calls). In the first scenario all unidentified calls within each colony were considered to be from the same individual, and in the second scenario all unidentified calls within each colony were considered to be from unique individuals.
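The idea of partitioning variation among nested levels can be sketched with method-of-moments variance components. This is a simplified illustration, not the procedure used in the study: it assumes a balanced design (equal numbers of individuals per colony and calls per individual), includes only two nested levels (colony, and individual within colony), and omits region and subspecies. All names and scores are hypothetical.

```python
from statistics import mean


def nested_variance_shares(data):
    """Percent of variance at each level for balanced data.

    data: {colony: {individual: [scores]}}, e.g. PC1 scores per call.
    """
    a = len(data)
    b_sizes = {len(inds) for inds in data.values()}
    n_sizes = {len(calls) for inds in data.values() for calls in inds.values()}
    if len(b_sizes) != 1 or len(n_sizes) != 1:
        raise ValueError("this sketch requires a balanced design")
    b, n = b_sizes.pop(), n_sizes.pop()
    if min(a, b, n) < 2:
        raise ValueError("need at least 2 units at every level")

    ind_means = {c: {i: mean(calls) for i, calls in inds.items()}
                 for c, inds in data.items()}
    col_means = {c: mean(m.values()) for c, m in ind_means.items()}
    grand = mean(col_means.values())

    # Sums of squares for each level of the hierarchy.
    ss_col = b * n * sum((m - grand) ** 2 for m in col_means.values())
    ss_ind = n * sum((im - col_means[c]) ** 2
                     for c, m in ind_means.items() for im in m.values())
    ss_err = sum((y - ind_means[c][i]) ** 2
                 for c, inds in data.items()
                 for i, calls in inds.items() for y in calls)

    ms_col = ss_col / (a - 1)
    ms_ind = ss_ind / (a * (b - 1))
    ms_err = ss_err / (a * b * (n - 1))

    # Expected-mean-square solutions; negative estimates are truncated at zero.
    var_err = ms_err
    var_ind = max((ms_ind - ms_err) / n, 0.0)
    var_col = max((ms_col - ms_ind) / (b * n), 0.0)
    total = var_col + var_ind + var_err
    if total == 0:
        raise ValueError("no variation in the data")
    return {"colony": 100 * var_col / total,
            "individual": 100 * var_ind / total,
            "residual": 100 * var_err / total}


# Hypothetical scores: 2 colonies x 2 individuals x 2 calls.
toy = {
    "colony_a": {"a1": [1.0, 1.2], "a2": [2.0, 2.2]},
    "colony_b": {"b1": [4.0, 4.2], "b2": [5.0, 5.2]},
}
shares = nested_variance_shares(toy)
```

With unbalanced data such as those described above, a mixed-model fit would normally replace these closed-form estimates; the balanced case is shown only because the arithmetic is transparent.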
Because the first principal component captured only a portion of the variation among calls, we also used a nonparametric permutation test (n = 5,000 permutations) on the multivariate analysis of variance (MANOVA) F-statistic to quantify the effect of colony and region on the suite of measurements for the entire call, syllable 1, and syllable 2. Permutations at the region level maintained the colony identity of each call but permuted the region associated with each colony. With only 2 subspecies and 4 regions, we did not have enough power to detect a statistically significant effect of subspecies through permutation of the subspecies–region relationship, so differences associated with subspecies were examined by permuting the subspecies associated with each colony instead.
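The restricted permutation scheme described above, in which each call keeps its colony but the region assigned to each colony is shuffled, can be sketched as follows. To stay within the Python standard library, the sketch substitutes a univariate one-way ANOVA F-statistic for the MANOVA F-statistic used in the study. The function names, colony labels, and values are hypothetical.

```python
import random
from statistics import mean


def f_statistic(values, groups):
    """One-way ANOVA F statistic for values grouped by label."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    grand_mean = mean(values)
    ss_between = 0.0
    ss_within = 0.0
    for members in by_group.values():
        group_mean = mean(members)
        ss_between += len(members) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in members)
    df_between = len(by_group) - 1
    df_within = len(values) - len(by_group)
    return (ss_between / df_between) / (ss_within / df_within)


def region_permutation_test(values, call_colony, colony_region, n_perm=5000, seed=1):
    """Permute region labels across colonies; calls stay with their colony."""
    rng = random.Random(seed)
    colony_ids = sorted(colony_region)
    labels = [colony_region[c] for c in colony_ids]
    observed = f_statistic(values, [colony_region[c] for c in call_colony])
    at_least_as_large = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        permuted = dict(zip(colony_ids, shuffled))
        if f_statistic(values, [permuted[c] for c in call_colony]) >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical data: 6 colonies, 3 calls each, 2 regions.
colony_region = {"c1": "north", "c2": "north", "c3": "north",
                 "c4": "south", "c5": "south", "c6": "south"}
call_colony = [c for c in sorted(colony_region) for _ in range(3)]
values = [700, 720, 710, 690, 705, 715, 730, 725, 700,
          900, 880, 910, 905, 890, 915, 870, 895, 920]

observed_f, p_value = region_permutation_test(values, call_colony, colony_region,
                                              n_perm=2000)
```

Even though the two toy regions are perfectly separated, the estimated P value stays near 0.1, because 6 colonies split 3 and 3 allow only 20 distinct label assignments. This mirrors the limited power noted above when few groups are available to permute.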
As a third approach to investigating differences among calls, we trained a random forest (RF) machine learning algorithm (R package “h2o”; Aiello et al. 2016; sample rate = 0.8, number of trees = 5,000) on a known subset of calls using the suite of measurements for the entire call, syllable 1, and syllable 2, and then classified calls to which the algorithm was naive. To address the disparity in sample sizes between categories, a random subsample of calls (n = 82 for each subspecies, n = 123 for each region) was used in the RF analysis.
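The balancing step described above, drawing the same number of calls from every category before training, can be sketched with the standard library alone. This shows only the subsampling, not the random forest itself; the function name, class labels, and counts are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_subsample(labels, n_per_class, seed=0):
    """Return sorted indices with n_per_class randomly chosen items per label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(indices)} items")
        # Sampling without replacement, so no call is used twice.
        chosen.extend(rng.sample(indices, n_per_class))
    return sorted(chosen)


# Hypothetical, unequal class sizes.
labels = ["common"] * 50 + ["rare"] * 8
subset = balanced_subsample(labels, n_per_class=8, seed=42)
subset_counts = Counter(labels[i] for i in subset)
```

The per-class sample size cannot exceed the size of the smallest category, so the rarest category determines how much data the classifier sees.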
Although the existence of unidentified individuals may raise concerns regarding pseudoreplication for the MANOVA and RF analyses, we recorded few repeat calls from the same individual among the calls that could be identified to individual, and it is reasonable to assume that repeated calls would occur at a similarly low rate among unidentified individuals. For relevant statistical methods, tests with P values <0.05 are considered strong evidence against the null hypothesis and are referred to as statistically significant. Samples of audio recordings for each site have been deposited in Dryad (DOI: 10.5061/dryad.rm228); videos are available upon request from the authors.
Ecstatic display calls were characterized by wide variation with respect to several measures of frequency and duration. Calls ranged from 2 to 15 syllables and from 0.8 to 5.3 s (mean = 2.66 s) in duration and had center frequencies that ranged from 117 to 2,203 Hz (mean = 770 Hz) in syllable 1 and from 117 to 3,023 Hz (mean = 858 Hz) in syllable 2. The 5% frequency (a measure of the 5% quantile of power within the spectrogram) ranged from 105 to 668 Hz (mean = 225 Hz) in syllable 1 and from 106 to 891 Hz (mean = 235 Hz) in syllable 2, indicating that spectral power was concentrated in the low frequencies for both syllable types.
We found significant variation both within and between colonies, and although comparisons of select colonies within the PCA showed differences in colony- or region-specific ellipse area and location, there was no clear pattern (Figure 3 and Table 3) or linear relationship between single variables and latitude (e.g., 5% frequency, P = 0.09; center frequency, P = 0.93). We did find a slight negative trend for change in center frequency (P < 0.001) and a slight positive trend for change in 5% frequency (P < 0.001) when compared to inter-colony distance—although, given the considerable variation in these measures of similarity, it is not clear whether these trends are biologically significant (Figure 4).
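A comparison of acoustic differences with inter-colony distance requires a distance for every pair of colonies. The sketch below computes great-circle distances with the haversine formula and pairs each with the absolute difference in a colony-level measurement. The text does not state how distances were calculated, so the spherical-Earth model, the function names, and the coordinates and frequencies shown are all assumptions made for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of the spherical model


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def pairwise_differences(colonies):
    """colonies: {name: (lat, lon, value)} -> list of (distance_km, |value difference|)."""
    pairs = []
    for (_, (lat1, lon1, v1)), (_, (lat2, lon2, v2)) in combinations(
            sorted(colonies.items()), 2):
        pairs.append((great_circle_km(lat1, lon1, lat2, lon2), abs(v1 - v2)))
    return pairs


# Invented colonies: (latitude, longitude, mean center frequency in Hz).
colonies = {
    "colony_a": (-64.8, -63.5, 760.0),
    "colony_b": (-62.6, -59.9, 800.0),
    "colony_c": (-54.0, -37.0, 745.0),
}
pairs = pairwise_differences(colonies)
quarter_turn_km = great_circle_km(0.0, 0.0, 0.0, 90.0)
```

Each resulting pair could then be used as one point in a regression of acoustic difference on distance, with the caveat that pairs sharing a colony are not independent observations.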
Table 3. Variable loadings for each acoustic measurement for the first 5 principal components (PC1–PC5) in the principal component analysis, with the percentage of variation explained by each principal component in parentheses (PFC = peak frequency contour).
Using a 3-factor random-effects nested ANOVA on the first principal component (PC1) for the subset of identified individuals, we attributed a large amount of the variation to differences among colonies (30.20%) and individuals within colonies (35.80%), but no significant variation was associated with region or subspecies. When we used the entire dataset, which includes calls from individuals of unknown identity, the results were robust to our treatment of these unknown individuals. We found similar results whether we classified all unidentified calls as coming from unique individuals (colonies: 39.18%, individuals: 21.78%) or classified all unidentified calls as coming from the same individual within each site (colonies: 32.13%, individuals: 21.57%), indicating that unknown identifications are unlikely to skew our analyses.
Consistent with our nested ANOVA analysis, the nonparametric permutation test on the MANOVA F-statistic for the suite of measurements revealed highly significant differences between colonies (F = 3.72, P < 0.001), but no significant difference between regions (F = 3.81, P = 0.47) or subspecies (F = 3.81, P = 0.43).
The RF algorithm was able to classify unknown calls into correct colonies better than an untrained random classification (30.0% vs. 5.4% accuracy), consistent with genuine differences between colonies, but error rates in classification remained high. At the regional level, it could correctly classify calls from the Antarctic Peninsula (class error = 14.0%) but performed poorly for other regions (mean per-class error = 40.3%). While the RF algorithm did correctly classify calls into subspecies (mean per-class error = 20.6%), the ANOVA and MANOVA results suggest that this classification may be due to differences between colonies (which are nested within subspecies) rather than true differences between subspecies. All analyses consistently ranked various measures of frequency rather than those related to duration as the most important variables for classification (Table 4).
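The per-class and mean per-class error rates reported above can be illustrated with a short sketch, which assumes that mean per-class error is the unweighted average of the error rates of the individual classes. The function names and the labels are hypothetical and are not output from the study.

```python
from collections import Counter


def per_class_error(true_labels, predicted_labels):
    """Fraction of items in each true class that were misclassified."""
    totals = Counter(true_labels)
    wrong = Counter(t for t, p in zip(true_labels, predicted_labels) if t != p)
    return {label: wrong[label] / totals[label] for label in sorted(totals)}


def mean_per_class_error(true_labels, predicted_labels):
    """Unweighted average of the per-class error rates."""
    errors = per_class_error(true_labels, predicted_labels)
    return sum(errors.values()) / len(errors)


# Hypothetical labels with unequal class sizes.
true_labels = ["x"] * 8 + ["y"] * 2
predicted = ["x"] * 8 + ["x", "y"]

errors = per_class_error(true_labels, predicted)
mean_error = mean_per_class_error(true_labels, predicted)
overall_accuracy = sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)
```

In this toy case overall accuracy is 0.9 while the mean per-class error is 0.25, which shows why per-class rates are more informative than overall accuracy when one category dominates the sample.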
Table 4. The 5 most important variables from random forest machine learning for subspecies classification.
We found a high degree of between-individual variation in ecstatic display calls within Gentoo Penguin breeding colonies. Even with this large within-colony variation, we found significant differences between colonies, which can be attributed primarily to frequency parameters of the ecstatic display call. Long-term geographic and reproductive isolation in this highly site-faithful species may have resulted in differentiated vocal traits between breeding colonies. These colony-specific vocalizations may drift over time and may be mostly independent of the characteristics of other colonies.
Based on the RF variable importance values, we found that Gentoo Penguin ecstatic display calls are most easily differentiated on frequency-related variables, even though the duration of calls, in terms of both temporal length and number of syllables, is highly variable. This is consistent with previous work by Jouventin and Aubin (2002) that found frequency to be the key variable for individual recognition between Pygoscelis spp. chicks and their parents, and that changes in pitch of as little as 25 Hz may affect the ability of a chick to recognize its parents. As such, the frequency differences of >100 Hz that we observed between colonies are likely to be biologically meaningful in terms of penguin behavior.
In addition to the variation between colonies, we found a large amount of variation between individuals within the same colony. It may be beneficial for an individual to be differentiated from others in the colony if this differentiation allows for mate recognition, though high colony fidelity suggests there may be little benefit to differentiated vocalizations beyond the immediate geographic area of the breeding colony. The independent origin of each colony's vocal portfolio results in variation but shows no discernible geographic pattern in ecstatic display calls across the Gentoo Penguin range. Geographic variation likely arises by slow drift over time between colonies, whereas within-colony variation is more likely to reflect an active process, occurring on faster time scales, that exploits what appears to be a relatively distinctive individual trait.
These findings are important, considering the vocal differentiation between the Indo-Pacific sub-Antarctic islands and the Atlantic sub-Antarctic islands described by de Dinechin et al. (2012) and Jouventin (1982). The present study is the most comprehensive analysis of geographic variation in Gentoo Penguin ecstatic calls to date and provides a finer geographic scale at which to examine vocal differentiation. Genetic data from de Dinechin et al. (2012) and Levy et al. (2016) show the Falkland Islands populations as divergent clades from the Antarctic Peninsula, South Georgia, and the South Orkney Islands. The polar front provides a strong ecological barrier that is likely to maintain this separation and may have led to drift of ecstatic display calls over a long period of geographic isolation. Although the RF algorithm was able to successfully classify subspecies, given the nonsignificant findings in both the ANOVA and MANOVA permutation analyses, we suspect that differences between subspecies may be difficult to discern and may stem from inter-colony differences rather than robust differences between the 2 subspecies. As such, we suggest caution in inferring subspecies based on recorded vocalizations of individuals.
Although our results suggest that colony-level variation complicates classification of subspecies, the ability to differentiate subspecies vocal characteristics would have interesting implications for determining the origin of new colonies. The population at Isla Martillo in Argentina is relatively new, and it was suspected that these penguins were related to the Falkland Islands populations (Raya Rey personal communication). Surprisingly, the RF algorithm classifies them as P. papua ellsworthii when it is trained on data that exclude the Argentina population, and those calls differed significantly from those of all other regions in post hoc Dunn tests for frequency variables of the entire call as well as both syllable 1 and syllable 2. However, given the challenges we have identified in determining subspecies designations through acoustic analyses alone, genetic analyses will be necessary to determine the origin of the Isla Martillo population.
Future investigation into the degree of plasticity and the role of genetics in vocal characteristics may help disentangle how these processes play out on behavioral, ecological, and evolutionary time scales. Playback experiments may expand our understanding of individual recognition and may also help us determine how individuals become differentiated from their neighbors and whether that process happens continuously or during a set phase of development. Understanding vocal characteristics of Gentoo Penguins and how those traits vary between individuals and regions may give us a better understanding of behavioral ecology and how individual interactions shape ecological processes such as the assembly and establishment of new colonies.
We thank N. Bender, A. Borowicz, C. Foley, C. O'Leary, P. McDowall, M. Schrimpf, and C. Youngflesh for help in data collection; L. Cooper, J. Enzmann, E. Muchnick, and M. Pandey for help in data analysis; and J. A. Clark for advice and consultation. We thank the reviewers for their comments on the manuscript. We also thank Oceanites Inc., One Ocean Expeditions, and Cheesemans' Ecological Safaris for travel and field support.
Funding statement: Funding was provided by Oceanites Inc. and the Tinker Foundation. No funders had input into the content of the manuscript or require approval of the manuscript before submission or publication.
Ethics statement: This research did not require handling any birds and was conducted under Stony Brook University IACUC no. 237420, Antarctic Conservation Act no. ACA 2014-024, and South Georgia & The South Sandwich Islands Regulated Activity Permit no. 2016/035.
Author contributions: M.A.L. collected and analyzed the data. M.A.L. and H.J.L. formulated the questions and wrote the manuscript.