Species distribution modeling (SDM) is a booming area of research that has had an exponential increase in use and development in recent years. We performed a search of scientific literature and found 5,533 documents published from 1993 to 2018 using SDM, representing a global network of 4,329 collaborating institutions from 155 countries, with Brazil and Mexico among the 10 most prolific countries globally. National Autonomous University of Mexico, Chinese Academy of Sciences, University of Kansas, and U.S. Geological Survey are the most prolific institutions worldwide. Latin American institutions (n = 556) participated in 1,000 (18% of global productivity) documents published in collaboration with 591 institutions outside Latin American countries, from which the National Autonomous University of Mexico, Federal University of Goiás, Institute of Ecology A.C., National Scientific and Technical Research Council in Argentina, University of São Paulo, and University of Brasilia were the most productive. From this body of literature, the most frequently modeled taxonomic groups were Chordata and Insecta, and the most common realms of application were conservation planning and management, climate change, species conservation, epidemiology, evolutionary biology, and biological invasions. From the 36 modeling methods identified to generate SDMs, MaxEnt is used in 73.5% of the papers, followed by Genetic Algorithm for Rule-Set Prediction (GARP) with 18.7%, and just 7.4% of the papers compared between 3 and 10 modeling methods. In Latin American countries, productivity in SDM research could be improved as the network of collaborations diversifies and connects with other productive countries (such as the United Kingdom, China, Spain, Germany, Australia, and France). Scientific collaboration between Latin American countries should be increased, as the most prolific countries (Brazil, Mexico, Argentina, and Colombia) share less than 10% of their productivity.
Some of the main challenges for SDM development in Latin America include bridging the gaps from (a) software use to research productivity and (b) translation to decision-making. To address these challenges, we propose to strengthen communities of practice where modelers, species experts, and decision-makers come together to discuss and develop SDM to shift and enhance current paradigms on how science and decision-making are linked.
Introduction
Species distribution modeling (SDM) is one of the most recent and robust tools used in ecology, biogeography, and macroecology (Soberón, Osorio-Olivera, & Peterson, 2017). SDM is a method of assessing areas that provide suitable abiotic environments (often, climate) for a given species, using the relationship between observed points of occurrence and environmental variables, to generate a spatial prediction of regions within which environmental conditions are suitable for species survival and population growth. Previous reviews of SDM have focused on their use to predict the distribution of invasive species (Barbosa, Schneck, & Melo, 2012), to determine MaxEnt model usage in wildlife research (Baldwin, 2009), and to inform conservation practitioners in decision-making (Villero, Pla, Camps, Ruiz-Olmo, & Brotons, 2017), among others. Cayuela et al. (2009) pointed out that 39% of the papers they reviewed on SDM were mainly focused on the development of new methodologies and the evaluation of performance of different methods, while the other 61% of the studies applied SDM in contexts such as species conservation, biological invasions, climate change, autecology, and biogeography.
The state of knowledge of biodiversity at the species level tends to be lower in megadiverse countries (Meyer, Kreft, Guralnick, & Jetz, 2015), in which the so-called Wallacean shortfall is evident (i.e., the incomplete knowledge of species distributions; Lomolino, 2004). Latin America harbors 58.3% of the world’s megadiverse countries (United Nations Environmental Program/Convention on Biological Diversity, 2016), and in this region, a very high proportion of endemic species is especially problematic for overall biodiversity knowledge, as data on geographical distribution are scarce for some highly rare species that occur in specialized microhabitats (Urbina-Cardona & Loyola, 2008). While uncertainty in species distributions is largely unknown for Latin America (but see modeling methods to estimate uncertainty in Diniz-Filho et al., 2009; Loyola, Lemes, Faleiro, Trindade-Filho, & Machado, 2012; Sales, Neves, De Marco, & Loyola, 2017), countries in the region are increasing their economic dependence on extraction activities, profoundly impacting their ecosystems (Rosales, 2008; Villarroya, Barros, & Kiesecker, 2014). As pointed out by Villero et al. (2017), SDM should inform decision-making for unknown and rare species, but the absence of information should be viewed as an opportunity for collaboration between institutions rather than as a weakness of the region. Networks of scientific collaboration could pave the way to improving knowledge not just of species’ distributions but also of habitat requirements, tolerance to disturbance, and population dynamics, enabling robust forecasting of the impacts of environmental change on species’ conservation status (Franklin, 2010).
In this regard, it is crucial to evaluate and understand publication patterns to identify possible taxonomic and ecosystem bias, as well as opportunities for collaboration both within the region and with institutions from other regions, to accelerate the generation of new knowledge on SDM and to reduce the Wallacean shortfall for Latin American countries. In recent years, an exponential increase in online biodiversity data (e.g., GBIF.org), environmental data (Worldclim v1: Hijmans, Cameron, Parra, Jones, & Jarvis, 2005; CliMond: Kriticos et al., 2012; Microclim: Kearney, Isaac, & Porter, 2014; Worldclim v2: Fick & Hijmans, 2017; Chelsa: Karger et al., 2017; Envirem: Title & Bemmels, 2017; MERRAclim: Vega, Pertierra, & Olalla-Tárraga, 2017), and open software to run models (R programming language) has resulted in increased productivity of SDMs, with several new modeling methods developed (Figure 1(a)). Thus, it is essential to have an updated review of SDM usage from Latin American institutions.
Here, we aim to (a) identify the spatial and temporal pattern of publications using SDM from Latin American countries, from a global perspective; (b) identify the network of collaborations at both national and institutional level that resulted in SDM publications from Latin American institutions in the scientific literature; and (c) characterize SDM research conducted by Latin American institutions according to taxonomic groups, realm of application, geographic region, ecosystem type, and modeling methods used. We then expand upon our results to identify the main challenges for SDM development in Latin America and discuss ways to overcome them.
Methods
One of the authors (H.M.-D.) did a structured literature search on Web of Science (WoS) and SCOPUS for scientific literature up to December 31, 2018, using a query string for each database (see online Appendix A). The metadata for each document was compiled and standardized using tech mining techniques (Porter & Cunningham, 2004) and structured into a database. For data processing, we used Vantagepoint version 10.0 (Porter & Cunningham, 2004), a text mining software that allows incorporating and standardizing large volumes of information from multiple online research sources, such as SCOPUS and WoS, to analyze and visualize information to find patterns and relationships (Search Technology, 2018). With Vantagepoint, we deleted duplicates (there was 69% duplicity between WoS and SCOPUS) and obtained 13,388 unique documents. The title of each document was read by the first author (N.U.-C.) excluding 7,109 documents in thematic areas outside disciplines of interest (online Appendix B). From the resulting 6,279 documents, N.U.-C. read the title, abstract, and keywords sections and excluded 746 documents that did not employ SDM (online Appendix B). The final database of 5,533 documents represents the global studies on SDM included in qualitative synthesis (online Appendix B).
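The duplicate removal and the two screening steps described above can be sketched in a few lines. This is only an illustration: the matching rule (same DOI or same normalized title) and the record fields `doi` and `title` are assumptions made for the example, not a description of how Vantagepoint matches records. The counts in the second half are the ones reported in the text.

```python
def normalize_title(title):
    """Lowercase and keep only letters and digits, so formatting differences vanish."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_records(*sources):
    """Merge record lists, dropping any record whose DOI or title was already seen.

    Assumed matching rule for illustration only (hypothetical fields 'doi', 'title').
    """
    seen, unique = set(), []
    for records in sources:
        for rec in records:
            keys = {(rec.get("doi") or "").lower(), normalize_title(rec["title"])}
            keys.discard("")  # a missing DOI must not match other missing DOIs
            if keys & seen:
                continue
            seen |= keys
            unique.append(rec)
    return unique


# Toy records (invented) standing in for WoS and SCOPUS exports.
wos = [{"doi": "10.1/a", "title": "Modeling Species Ranges"},
       {"doi": "", "title": "Niche models of frogs"}]
scopus = [{"doi": "10.1/A", "title": "Modeling species ranges."},
          {"doi": "10.1/b", "title": "Niche Models of Frogs"},
          {"doi": "10.1/c", "title": "A third study"}]
merged = merge_records(wos, scopus)

# Screening flow using the counts reported in the text.
unique_documents = 13_388
after_title_screening = unique_documents - 7_109        # excluded by title
final_corpus = after_title_screening - 746               # excluded after abstract/keywords
print(len(merged), after_title_screening, final_corpus)
```

With the toy records, only the third SCOPUS entry is new, so three records survive; the screening arithmetic reproduces the 6,279 and 5,533 documents given above.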
From this compilation of global literature, we obtained the country and institutional affiliation of each coauthor. We then normalized the country and institution names, filtered Latin American institutions, and created co-occurrence matrices (online Appendix C). The Latin American database was composed of 1,000 documents with the participation of at least one coauthor from a Latin American institution. One of the authors (H.M.-D.) extracted 4,481 words from the title, abstract, and keywords, then N.U.-C. identified 1,661 pertinent terms, with a frequency of occurrence of more than 3 times, and conducted a more in-depth qualitative analysis, classifying each keyword into four different categories: taxonomic group, realm of application, geographic region, and ecosystem type. Each category was visualized through a foam tree (Carrot Search FoamTree, 2019) in which the available space is divided into polygons of different color and size, depending on their class and frequency, respectively.
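A co-occurrence matrix of the kind mentioned above can be built by counting, for every pair of countries, the documents on which both appear. The sketch below is a minimal illustration with invented toy documents; the function name and the input layout (one list of affiliation countries per document) are assumptions, and the actual matrices are those in online Appendix C.

```python
from collections import Counter
from itertools import combinations


def country_cooccurrence(documents):
    """Count, for each unordered pair of countries, the documents they share."""
    pair_counts = Counter()
    for countries in documents:
        # A country counts once per document, however many coauthors it contributes.
        for a, b in combinations(sorted(set(countries)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy input: each inner list holds the affiliation countries of one document.
toy_documents = [
    ["Mexico", "United States", "Mexico"],
    ["Brazil", "Argentina"],
    ["Mexico", "United States", "Brazil"],
]
pairs = country_cooccurrence(toy_documents)
print(pairs[("Mexico", "United States")])
```

Sorting each country set before pairing gives every pair a single canonical key, so the counter behaves like the upper triangle of a symmetric matrix. The same function works for institutions if institution names are passed instead of countries.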
To identify the modeling methods used in each document to model species distributions, N.U.-C. read the title and abstract sections of each document from the Latin American database. In addition, J.V.-T. classified the different modeling methods into three categories: profile methods (i.e., methods that use only presence data to estimate geographic distributions; this includes envelope methods such as BIOCLIM, distance methods such as DOMAIN, and multivariate methods such as ecological niche factor analysis); statistical methods; and machine learning methods. A total of 687 documents reported the modeling method in title and abstract sections (68.6% of the whole Latin American database) and were migrated to VOSviewer (van Eck & Waltman, 2010). VOSviewer builds and visualizes bibliometric maps of keywords based on co-occurrence data, using modularity-based clustering and mapping techniques that minimize a weighted sum of squared Euclidean distances between all pairs of items (categories) through an optimization process (van Eck & Waltman, 2013).
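To make the mapping criterion concrete, the quantity being minimized (a weighted sum of squared Euclidean distances over all pairs of items) can be written as a small function. This is a sketch of the objective only, not of VOSviewer itself: the similarity values and coordinates are invented, and the normalization of similarities and the constraint that keeps the layout from collapsing to a single point are not reproduced here.

```python
def weighted_squared_distance(similarity, coords):
    """Sum of s_ij * ||x_i - x_j||^2 over all unordered pairs i < j."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            squared = sum((p - q) ** 2 for p, q in zip(coords[i], coords[j]))
            total += similarity[i][j] * squared
    return total


# Toy example: items 0 and 1 co-occur often, item 2 rarely co-occurs with either.
similarity = [[0, 3, 1],
              [3, 0, 1],
              [1, 1, 0]]
layout_a = [(0, 0), (1, 0), (3, 0)]  # strongly related items placed close together
layout_b = [(0, 0), (3, 0), (1, 0)]  # strongly related items placed far apart
print(weighted_squared_distance(similarity, layout_a),
      weighted_squared_distance(similarity, layout_b))
```

Both toy layouts contain the same set of pairwise distances, so they differ only in which items sit close together; the layout that places the frequently co-occurring pair side by side has the lower value, which is the behavior the mapping relies on.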
Based on the body of global and Latin American documents, we built and visualized collaboration networks at both national and institutional levels in which node-based centrality (number of links that each country or institution has) was applied to measure the importance of a node within a network (Bonacich, 1987; Freeman, 1977). We visualized the number of papers published by country in a map, using the COM complement Power Pivot for Excel—Office 2016.
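Degree centrality as defined above (the number of links each country or institution has) reduces to counting distinct neighbors in the collaboration network. The following sketch uses an invented edge list and a hypothetical function name; the text does not say which tool computed the centrality values, so this only illustrates the measure.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return, for each node, the number of distinct nodes it is linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(linked) for node, linked in neighbors.items()}


# Toy collaboration edges (invented); repeated pairs do not inflate the degree.
toy_edges = [
    ("Mexico", "United States"),
    ("Brazil", "United States"),
    ("Brazil", "Argentina"),
    ("Mexico", "Brazil"),
    ("Brazil", "Mexico"),
]
degrees = degree_centrality(toy_edges)
hubs = sorted(node for node, degree in degrees.items() if degree >= 3)
print(degrees, hubs)
```

Because neighbors are stored in sets, the degree counts partner countries rather than shared papers, which is a separate measure from the edge weights (number of co-published documents) reported in the Results.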
We also used data from the American Museum of Natural History on the number of downloads of MaxEnt SDM software between December 12, 2016, and June 8, 2018 (178 days after the new open source version of MaxEnt became available; Phillips, Anderson, Dudik, Schapire, & Blair, 2017), to compare patterns of MaxEnt software downloads with SDM research publication patterns by country.
Results
Global Network of SDM Research
The global network of collaboration was composed of 155 countries, of which the 10 most productive, in descending order of productivity, were United States, United Kingdom, China, Spain, Germany, Australia, France, Brazil, Mexico, and Canada (Figure 1(b)). The United States led global scientific production by a wide margin, with 1,863 papers (Figure 1(b)). Further, the 15 countries that represented the nodes with the highest centrality degree (with collaborations with more than 30 countries) were, in descending order, United States, Germany, United Kingdom, France, Spain, Australia, Canada, Italy, Netherlands, Switzerland, Brazil, China, Belgium, Denmark, and South Africa, indicating their roles as collaboration hubs (Figure 1(c)). Worldwide, the most important collaboration networks occurred between Portugal and Spain (n = 84 published papers in collaboration), United Kingdom and Spain (n = 69), Germany and United Kingdom (n = 60), as well as between United States and United Kingdom (n = 144), Australia (n = 135), Mexico (n = 112), China (n = 95), Canada (n = 94), Spain (n = 89), Brazil (n = 86), France (n = 86), and Germany (n = 85; Figure 1(c)). From this network of collaboration, the most productive Latin American countries were Brazil, Mexico, Argentina, Colombia, Chile, and Ecuador, with 356, 350, 145, 103, 57, and 39 papers, respectively (Figure 1(b, c)).
Latin American Networks and Their Articulation With the World
The Latin American collaboration network was composed of 26 countries that mainly interact with 60 countries worldwide, from which the most productive collaborations (represented by links known as edges) are with United States, Spain, United Kingdom, Germany, and Australia (Figure 2(a, b)). The strongest network of collaboration was evidenced between United States with Mexico (n = 112 collaborations), Brazil (n = 86), Colombia (n = 44), Chile (n = 21), and Ecuador (n = 21); followed by the interactions of Brazil with United Kingdom (n = 23) and Germany (n = 21); as well as between Spain and Mexico (n = 28), Brazil (n = 15), Argentina (n = 11), Ecuador (n = 11), Colombia (n = 8), and Chile (n = 8; Figure 2(b)). Further, the 10 countries that represented the nodes with the highest centrality degree (with collaborations with more than 30 countries) between Latin America and the rest of the world are, in descending order, United States, Brazil, Australia, United Kingdom, Mexico, Colombia, Spain, Germany, France, and Canada (Figure 2(a, b)).
The 10 most prolific Latin American countries are Brazil (n = 356), Mexico (n = 350), Argentina (n = 145), Colombia (n = 103), Chile (n = 57), Ecuador (n = 39), Costa Rica (n = 20), Peru (n = 20), Uruguay (n = 19), and Bolivia (n = 18; Figure 2(a, b)). However, when the number of published documents is considered relative to each country's population density (population/km2), a new ranking pattern appears. In descending order, the 10 most prolific countries by this measure are Brazil, Argentina, Mexico, Colombia, Chile, Bolivia, Uruguay, Peru, Ecuador, and Venezuela. In contrast, when the number of published documents is considered relative to each country's population, the ranking changes again: Uruguay, Costa Rica, Argentina, Chile, Mexico, Guyana, Panama, Ecuador, Colombia, and Suriname.
The most robust collaboration edges between Latin American countries were evidenced between Brazil with Argentina (n = 18), Mexico (n = 17), and Colombia (n = 11) and between Mexico with Colombia (n = 17), Ecuador (n = 10), and Argentina (n = 8); these collaborations represent just between 2.2% and 10.7% of the country’s publications. From Central American institutions, Costa Rica, Panama, and Nicaragua were the most prolific with 60, 27, and 11 published documents, respectively, followed by Honduras, Guatemala, and El Salvador (with 7, 6, and 2 publications, respectively). From the Caribbean islands, Cuba and Puerto Rico were the most prolific with 16 and 5 publications, respectively, followed by Jamaica, Antigua and Barbuda, Dominican Republic, and Haiti (with 3, 3, 2, and 1 publications, respectively; Figure 2(a)).
From an institutional perspective, the global collaboration network was composed of 4,329 institutions, of which 51 published more than 34 papers each (mean 61.6), shaping a major component of the network (Figure 3(a)). The top 10 institutions included two from Latin America and were, in descending order of production, National Autonomous University of Mexico (UNAM), Chinese Academy of Sciences, University of Kansas, U.S. Geological Survey (USGS), Consejo Superior de Investigaciones Científicas, Federal University of Goiás (Brazil), University of Melbourne, University of Porto, U.S. Forest Service, and Colorado State University. UNAM was both the most prolific institution, with 199 published papers, and the one with the highest centrality degree among institutions worldwide, collaborating with 225 other institutions (38.2% from Latin American countries) in the network. In contrast, the second most prolific institution, the Chinese Academy of Sciences, collaborated with 17 institutions but published 70% of its documents in collaboration with a single institution (University of Chinese Academy of Sciences). There was no collaboration between UNAM and the Chinese Academy of Sciences (Figure 3(a)).
University of Kansas and USGS (USA) are in the third and fourth level of productivity, with 161 and 125 published documents, respectively. Within the major component of the global network, University of Kansas has collaborated with 28 institutions of which five are from Latin America (UNAM, Institute of Ecology A.C.—INECOL, University of São Paulo (USP), Federal University of Goiás, and University of Brasilia), while USGS collaborated with just 18 institutions showing no interaction with Latin American institutions (Figure 3(a)).
The Latin American network was composed of 556 Latin American institutions that collaborate with 519 other institutions outside Latin American countries (online Appendix C). A total of 45 Latin American institutions and 13 institutions from outside the region published more than 10 papers each (mean 22.8), shaping the major component of the network (Figure 3(b)). Of those, the top institution is UNAM, mainly collaborating with University of Kansas, INECOL, and University of Texas at Austin (17%, 10.5%, and 6.5% of UNAM publications, respectively). The second most productive Latin American institution was Federal University of Goiás, collaborating with 24 institutions, mainly with State University of Goias, Consejo Superior de Investigaciones Científicas, Federal University of Paraná, UNAM, and National Institute of Amazonian Research (between 5 and 7 shared papers). The other four most prolific Latin American institutions were INECOL (55 papers published with 17 institutions), National Scientific and Technical Research Council in Argentina (CONICET; 52 papers published with 19 institutions), USP (51 papers published with 28 institutions), and University of Brasilia (36 papers published with 20 institutions; online Appendix C).
Thematic Characterization of Latin American Research on SDM
From the characterization of keywords reported among the 1,000 SDM papers published with collaboration by Latin American institutions, the main patterns were as follows:
Taxonomic group: From the 25 taxonomic groups of species modeled (from Kingdom to Class or Division), the most frequent were Chordata (45%), Insecta (17%), and Magnoliophyta (7%; Figure 4(a)).
Realm of application: From 30 types of thematic areas of research, 70% of the keywords were related to studies on the following topics: Conservation planning and management, Climate change, Species conservation, Epidemiology, Evolutionary biology, and Biological invasions. Each category contributes between 4% and 13% of the categorized words (Figure 4(b)).
Geographic region of the SDM: From the 25 types of geographic regions, the most frequent were South America and Mesoamerica with contributions of 37% and 28% categorized words, respectively (online Appendix D). The most frequent countries were Mexico (29%), Brazil (14%), Argentina (6%), and Colombia (5%; online Appendix D).
Ecosystem type of the SDM: From the 17 types of ecosystems, the most frequent were rainforest, mountain ecosystems, and tropical dry forest with contributions between 14% and 29% of the categorized words (online Appendix E).
Modeling Methods Used to Develop SDM in Latin American Research
From the detailed reading (title, abstract, and keywords) of the 1,000 papers in which Latin American institutions collaborated, we identified 673 papers (excluding 14 review papers that did not conduct any SDM modeling and were excluded from the bibliometric mapping) that used 36 methods (in three categories) to model species distributions (see online Appendix F). The first modeling method to model distributions in which Latin American institutions participated, was GARP (Sánchez-Cordero & Martínez-Meyer, 2000; Soberón, Golubov, & Sarukhán, 2001).
It is well recognized that there is no guarantee that a single modeling method will be optimal to model species distributions for all data sets and geographical realms (Mateo, Felicísimo, & Muñoz, 2011). However, it was only after 2006 that researchers began to compare SDM with more than two modeling methods (generalized additive models, GARP, BIOCLIM, generalized linear models (GLM), MaxEnt, DOMAIN, FloraMap, Weights of evidence, Mahalanobis Distance, and Random Forests, among others; online Appendix G). Until 2018, only 50 articles (7.4%) compared between 3 and 10 modeling methods, of which only 6 compared more than 7 methods (Escobar, Qiao, Cabello, & Peterson, 2018; Ochoa-Ochoa, Flores-Villela, & Bezaury-Creel, 2016; Queiroz et al., 2013; Reiss, Cunze, Konig, Neumann, & Kroncke, 2011; Tessarolo, Rangel, Araújo, & Hortal, 2014; Tognelli, Roig-Juñent, Marvaldi, Flores, & Lobo, 2009; online Appendix F, G).
We found that 73.5% of the analyses used MaxEnt, followed by GARP (18.7%), GLM (6.4%), BIOCLIM (4.6%), Random Forests (5%), and generalized additive models (3.1%; Figure 5(a)). The most common combinations of analyses are MaxEnt models with GARP (n = 47), BIOCLIM (n = 22), GLM (n = 17), and Random Forests (n = 16; Figure 5(a), online Appendix F, G). Our results highlight MaxEnt as the most commonly used software to model the distribution of species from Latin American institutions. This finding is further supported by MaxEnt download data showing that between December 12, 2016, and June 8, 2018, 178 days after the new open source version of MaxEnt became available (Phillips et al., 2017), 27,472 downloads were made from 156 countries, of which 19.7% were made in the United States, 9.2% in Mexico, 7.5% in Brazil, 6% in China, and 5.5% in Colombia (Figure 5(b); data from the American Museum of Natural History). Latin American countries made 8,861 downloads, 32.3% of total downloads, during this period. Also, after controlling for the population size of countries, 5 Latin American countries emerged among the top 10 countries with the highest number of downloads per million people during this period: Costa Rica, French Guiana, Colombia, Ecuador, and Chile.
Discussion
According to our results, research on SDM started in the early 1990s and has since grown to 5,533 published papers and 35 modeling tools used as of 2018 (Figure 1(a)). We propose the following strategies to boost the generation of new knowledge on SDM, to reduce the Wallacean shortfall, and to address the great challenge of environmental degradation within Latin American countries.
Interaction Opportunities Visualized From Institutional Collaboration Patterns
We conclude that the United States is the country with the highest scientific production in SDM, as it is for other scientific areas, as shown in Table 1 (SCImago, 2019). The United States is also important for collaborations between countries. However, our study presents a different per-country publication pattern from the one reported by Barbosa et al. (2012) for SDM of invasive species (Table 1).
Table 1.
Top Five Most Prolific Countries in Different Subject Areas and Categories of Knowledge.
Countries in Latin America collaborate in 18% of the global production of SDM academic publications including the development of new modeling tools (Figure 1(a)). Brazil and Mexico are among the top 10 countries worldwide and become a reference at the international level, especially among other Latin American countries. This correlates with the results of the SCImago Journal & Country Rank, a portal that provides scientific indicators to assess scientific domains, where at a global level Brazil is in the top 15 and Mexico in the top 29 for academic production in all subject areas category (Table 1; SCImago, 2019).
Within the Latin American region, the most productive institution on SDM is UNAM, which is also number one worldwide, followed by Federal University of Goiás, USP, and University of Brasilia; these institutions are ranked by SCImago (2019) in Latin America in the 3rd, 45th, 1st, and 25th positions, respectively.
The geographic scope of the SDM research in Latin America coincides with the four most scientifically productive countries: Mexico, Brazil, Argentina, and Colombia (Figure 2(a, b); online Appendix D). Although the collaboration network in Latin America is complex and involves 12 countries, research in the region might be reducing the Wallacean shortfall only in those countries. It would be desirable for future collaboration to reduce the Wallacean shortfall not only in the most productive countries but also in other, less studied geographic areas.
The number of collaborating institutions differs sharply between the two most productive institutions worldwide. While UNAM collaborated with 225 other institutions, the Chinese Academy of Sciences collaborated with only 17. This suggests that, in contrast to China, collaboration is a crucial issue for ranking universities in Latin America. The Chinese Academy of Sciences has the second highest number of publications on SDM globally. Based on its research performance, innovation outputs, and societal impact, the Chinese Academy of Sciences ranks first among global institutions as measured by their web visibility (SCImago, 2019). However, in Latin America, this institution collaborates only with Guatemala and Chile (online Appendix C).
The implementation of collaboration strategies between Latin American institutions and those outside Latin America should continue to be enhanced through fellowships, capacity building, sabbatical years between researchers, or shared advisors of postgraduate students. Latin American collaboration networks will tend to grow as more researchers return to their country of origin, maintaining collaboration networks created from postgraduate studies and fellowships, and beginning new collaborations with other Latin American countries.
Closing the Gap From Software User to Coauthor: A MaxEnt Study Case
MaxEnt is the most commonly used software to model the distribution of species from Latin American institutions (Figure 5(a)). The first nine documents published by Latin American institutions using MaxEnt software date from 2007 to 2008 (online Appendix F, G), between 1 and 2 years after the release of the first version of the software (Phillips, Anderson, & Schapire, 2006; Figure 1(a)). MaxEnt users regard it as one of the tools requiring the fewest actions to implement an analysis, and it ranks at the top among modeling tools in terms of usefulness, learning curve, system capabilities, and overall user satisfaction (Ahmed et al., 2015). As mentioned earlier, the release of the new open source version of MaxEnt (Phillips et al., 2017) was followed by an enormous number of downloads (Figure 5(b)), 32.3% of which took place in Latin America.
Also, half of the top 10 countries downloading the software (when controlling for population size) were in Latin America. So, if Latin America is indeed powerful within the global community of MaxEnt users, why do Latin American authors and institutions represent only 18% of the global scientific literature on SDM? Looking deeper into download trends, we can examine the 13,155 or 48% of global downloads that included optional institution information. Of this number, 3,385 or 38.2% of total downloads from Latin America (8,861) included institutional information. Of those, the vast majority (2,944 or 87%) of MaxEnt downloads were from academic institutions (universities or university centers/institutes). Downloads from government institutions including research institutes, agencies, national park system administrations, and ministries comprised 384 or 11%, nongovernmental organizations (NGOs) including conservation organizations, private not-for-profit research institutes, and foundations accounted for 102 or 3%. Only 5 downloads or 0.1% were from for-profit companies including consultants.
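The percentages quoted for the download data follow directly from the reported counts, and a short check makes the denominators explicit. The counts below are copied from the text; the helper name `share` and the dictionary labels are ours, and the sector shares assume the 3,385 Latin American downloads with institutional information as the denominator, which is how the text appears to compute them.

```python
def share(part, whole, digits=1):
    """Percentage of `whole` represented by `part`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


total_downloads = 27_472          # all MaxEnt downloads in the period
latin_america = 8_861             # downloads from Latin American countries
with_institution_global = 13_155  # downloads that gave institution information
with_institution_la = 3_385       # the same, restricted to Latin America

# Sector counts as printed in the text (labels are shorthand, not the authors' terms).
by_sector = {"academic": 2_944, "government": 384, "ngo": 102, "for_profit": 5}

la_share = share(latin_america, total_downloads)
institution_share_global = share(with_institution_global, total_downloads)
institution_share_la = share(with_institution_la, latin_america)
sector_shares = {name: share(count, with_institution_la)
                 for name, count in by_sector.items()}
print(la_share, institution_share_global, institution_share_la, sector_shares)
```

Keeping the denominator as an explicit argument avoids mixing the three different bases used in this paragraph (all downloads, Latin American downloads, and Latin American downloads with institutional information).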
As mentioned earlier, the most prolific Latin American institutions in global scientific literature were universities (online Appendix C). The software download information also reflects this trend, but the number of MaxEnt downloads from government institutions is also notable, which is not as evident in the observed scientific literature trends. We might assume that if we examined beyond the scientific literature to include white papers and government reports, we might find additional written contributions from these institutions and probably several examples of translations of SDM to decision-making related to biodiversity conservation at NGOs and government agencies (see some examples later). We can also probably assume that there is a great deal of SDM development and research ongoing at universities that is not being published in the global scientific literature, or that is being used for training only. This research may be available instead as student theses or reports in university databases. There may also be further challenges to publishing results such as costs, language, and other factors.
Implementing Best Practices and Integrating SDM Into Monitoring Biodiversity Change
Strengthening of collaborative networks between academic institutions and institutions outside of academia in Latin America would further enhance the translation of SDM research to biodiversity conservation decision-making (as discussed later) and would likely increase the visibility of ongoing and future research using SDM in the global scientific literature. Open source SDM training opportunities and online networks in the Spanish language (such as the Colombian initiative BioModelos—see later—and more informally, the Facebook group “Modelado de Nichos Ecológicos y Distribuciones Geográficas”) are a robust and well-appropriated strategy to facilitate online training and the formation of communities of practice across researchers and practitioners from different sectors and different Latin American countries (Cuervo-Robayo et al., 2017).
We identified that some Latin American countries have great potential to become world leaders in the area of SDM, to the extent that the collaboration between institutions increases and diversifies. As SDM research areas increase in Latin America, researchers must also adopt good practices: input data should be open access, code should be shared with complete documentation, and prediction layers should be made available (Breiman, 2001; Guisan & Thuiller, 2005; Humphries & Huettmann, 2018). Latin American countries are moving toward an open science culture: the number of open access mandates adopted has grown from 1 in 2005 to 54 in 2018 (ROARMAP http://roarmap.eprints.org/, last visit on April 18, 2019).
Megadiverse countries in Latin America have important reserves of mining resources such as gold, nickel, chromium, and copper, among others (USGS, 2018). These resources attract greater pressure from big companies that, together with oil and timber extraction, push governments to prioritize extractive activities over the preservation of biodiversity (Loyola, 2014; Villarroya et al., 2014). Under such scenarios, SDM could provide critical information to make a difference in the decision-making process by improving the understanding and monitoring of biodiversity, as well as by focusing spatial conservation strategies and their effective implementation by responsible authorities (Scheldeman & van Zonneveld, 2010); for example, by combining SDMs with other key information as part of a systematic conservation planning framework (e.g., as in Marxan: Ball, Possingham, & Watts, 2009; or Zonation: Moilanen et al., 2005). The training of new generations of biologists and ecologists in computational techniques needs to be enhanced so that machine learning methods can play an essential role in understanding whole ecosystems through holistic modeling (Humphries & Huettmann, 2018).
New technologies can easily be misused when they are not correctly applied, as explicitly seen for SDM (Guillera-Arroita et al., 2015; Humphries & Huettmann, 2018), and model quality should be evaluated (Araújo et al., 2019). SDMs are particularly prone to problems arising from a mismatch between data type and intended purpose. Hence, for conservation practice it is critical to know whether the output from an SDM will be appropriate for the intended application (Guillera-Arroita et al., 2015). The majority of the SDMs produced in Latin America come from presence-background data, meaning that their outputs represent in most cases relative likelihoods of observation, with subsequent limitations in their applications. As such, Latin America needs to move toward the collection of more informative survey data of the occupancy-detection type so that better inference of species distribution can be made (Guillera-Arroita et al., 2015).
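To illustrate why occupancy-detection data are more informative, the sketch below fits a basic single-season occupancy model to repeat-visit detection histories and compares the result with the naive proportion of sites where the species was seen. It is an illustration under stated assumptions, not a method used in the studies reviewed here: occupancy (`psi`) and detection probability (`p`) are assumed constant across sites and visits, the function names and the example histories are hypothetical, and a coarse grid search stands in for a proper optimizer.

```python
import math
from typing import List, Sequence, Tuple

History = Sequence[int]  # one site: 1 = detected, 0 = not detected, per visit


def naive_occupancy(histories: Sequence[History]) -> float:
    """Proportion of sites with at least one detection (ignores missed detections)."""
    return sum(1 for h in histories if any(h)) / len(histories)


def log_likelihood(histories: Sequence[History], psi: float, p: float) -> float:
    """Log-likelihood of a single-season occupancy model with constant psi and p."""
    total = 0.0
    for h in histories:
        visits = len(h)
        detections = sum(h)
        # Site is occupied and produced exactly this detection history.
        occupied = psi * (p ** detections) * ((1.0 - p) ** (visits - detections))
        # Site is unoccupied: only possible if the species was never detected.
        unoccupied = (1.0 - psi) if detections == 0 else 0.0
        total += math.log(occupied + unoccupied)
    return total


def fit_occupancy(histories: Sequence[History], steps: int = 99) -> Tuple[float, float]:
    """Grid-search maximum likelihood estimates (psi_hat, p_hat) on (0, 1)."""
    grid: List[float] = [i / (steps + 1) for i in range(1, steps + 1)]
    best_ll, best_psi, best_p = -math.inf, grid[0], grid[0]
    for psi in grid:
        for p in grid:
            ll = log_likelihood(histories, psi, p)
            if ll > best_ll:
                best_ll, best_psi, best_p = ll, psi, p
    return best_psi, best_p


if __name__ == "__main__":
    # Hypothetical survey: 10 sites, 3 visits each, species seen at 5 sites.
    example = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]] + [[0, 0, 0]] * 5
    psi_hat, p_hat = fit_occupancy(example)
    print("naive occupancy:", naive_occupancy(example))
    print("estimated occupancy:", psi_hat, "detection probability:", p_hat)
```

Because some occupied sites yield no detections, the detection-corrected estimate of occupancy exceeds the naive proportion whenever detection is imperfect. Presence-background data contain no repeat-visit information, so this correction cannot be made from them.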
SDMs in Latin America have been used in multiple realms of application but with a considerable bias toward vertebrates and insects, and in three main ecosystems (rainforests, tropical dry forests, and mountain ecosystems). Researchers in Latin America have explored multiple applications for SDM; we encourage this diversity of applications to continue in an interdisciplinary manner. It is also important to note that one of the most threatened ecosystems worldwide (tropical dry forest: Hoekstra, Boucher, Ricketts, & Roberts, 2005) is also one of the best represented in SDM research (online Appendix E). Also, the most critical global change drivers, namely land use change, biological invasions, and climate change (Sala et al., 2000), are among the most studied themes from an SDM perspective (Figure 4(b)). These results indeed show the research community's awareness of the importance of SDM for conservation in the region.
For research results to have a greater influence on the decision-making process, the importance of SDM also needs to be understood by decision-makers. There is no clear sustainable future in some megadiverse neotropical countries in which biodiversity is not a priority for central governments that are committed to leading economies (such as China and the US), while local poverty increases in some regions, in which the illegal drug business and big companies exert local power over land decisions (Brocket, 1990; Huettmann, 2015). As noted by Humphries and Huettmann (2018), SDM studies are rarely linked to effective conservation management, as they are not referred to in most policy or legal decisions, and they widely lack a reflective component that advances ethical and societal questions. How can we move to translate scientific results into effective conservation management? We propose that in Latin America, SDM must be placed in a broader context between monitoring data and specific management and conservation contexts. New biodiversity monitoring frameworks may provide robust tools to articulate user needs with scientific outputs. Biodiversity Observation Networks could be developed in Latin America to facilitate the transition of explicit knowledge into tacit knowledge, addressing societal and economic needs, and thus increase the influence of research on conservation (Navarro et al., 2017).
Building Communities of Practice Around SDM: The Case of BioModelos in Colombia
SDM ideally must rely on sufficient occurrence data as well as knowledge regarding species’ environmentally limiting factors and dispersal limitations (Anderson, 2012; Araújo & Peterson, 2012). Occurrence data are increasingly available in open access platforms such as the Global Biodiversity Information Facility (Meyer et al., 2015), albeit with questionable quality (Anderson et al., 2016). Knowledge on species ecology and evolution, in contrast, remains insufficient even though it is crucial for the development of biologically meaningful models (Diniz-Filho, Loyola, Raia, Mooers, & Bini, 2013). To remedy this shortfall, the Colombian initiative BioModelos brings together modelers with experts who can inform model development and assess the reliability of models (Velásquez-Tibatá et al., 2019). We present this initiative here as an example of collaborative knowledge building for improving SDM.
In BioModelos, experts are arranged into groups according to thematic areas of interest, and each group has a defined set of species of interest. Experts contribute through an online platform (biomodelos.humboldt.org.co) to any of the following activities for each species in their group: (a) occurrence data cleaning; (b) delineation of accessible area; (c) identification of suitable land covers; (d) identification of areas of model over/underprediction; (e) selection of suitability thresholds to create binary models; and (f) qualitative evaluation of model accuracy. A core team develops and edits models according to expert inputs. Once this collaborative modeling process is completed, models become available for visualization and download in standard GIS formats along with metadata documenting the modeling process on the BioModelos webpage. Importantly, the metadata acknowledges the contribution of everyone involved in model development.
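Activity (e), the selection of a suitability threshold to create a binary model, can be sketched as follows. This is a minimal illustration and not the BioModelos implementation: it assumes a continuous suitability surface stored as a nested list and uses a percentile-of-presences rule, which is one common convention among many. The names `presence_threshold` and `binarize` are hypothetical.

```python
from typing import List, Sequence


def presence_threshold(presence_scores: Sequence[float], omission: float = 0.10) -> float:
    """Suitability value that omits roughly the `omission` fraction of presences.

    omission=0.0 gives the minimum suitability observed at any presence record.
    """
    if not presence_scores:
        raise ValueError("at least one presence score is required")
    if not 0.0 <= omission < 1.0:
        raise ValueError("omission must be in [0, 1)")
    ranked = sorted(presence_scores)
    n_omitted = int(omission * len(ranked))  # lowest-scoring presences left out
    return ranked[n_omitted]


def binarize(suitability: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Convert a continuous suitability grid into a presence (1) / absence (0) map."""
    return [[1 if cell >= threshold else 0 for cell in row] for row in suitability]


if __name__ == "__main__":
    # Hypothetical suitability values extracted at cleaned occurrence records.
    scores = [0.9, 0.4, 0.7, 0.2, 0.8, 0.6, 0.5, 0.95, 0.3, 0.85]
    threshold = presence_threshold(scores, omission=0.10)
    print(binarize([[0.1, 0.35], [0.3, 0.9]], threshold))
```

A stricter threshold reduces the predicted area and therefore commission errors, at the cost of omitting some known presences. This trade-off is one reason expert judgment is sought at this step.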
Thus far, 475 experts have joined at least one of the 20 expert groups in BioModelos, where they contribute to the development of models for 980 species (Velásquez-Tibatá et al., 2019). These models have been particularly crucial for the Colombian government in the evaluation of species extinction risk (Renjifo, Amaya-Villareal, Burbano-Girón, & Velásquez-Tibatá, 2016) and the development of national conservation action plans (e.g., cycads and primates: http://www.minambiente.gov.co/index.php/bosques-biodiversidad-y-servicios-ecosistematicos/fauna-y-flora/programas-de-conservacion#documentos). In those assessments and plans, qualitative assessments of the biological realism of spatial predictions by qualified experts are essential to minimize costly commission errors (Guisan et al., 2013; Lobo, Jiménez-Valverde, & Hortal, 2010) that cannot be measured without bias using presence-only data (Lobo, Jiménez-Valverde, & Real, 2008). Beyond these governmental uses, model users also report using the models for basic research (33%), educational activities (32%), and varied activities such as environmental impact assessments, bioprospecting, and applied research (Velásquez-Tibatá et al., 2019).
The implementation of BioModelos in Colombia shows that it is possible to go a long way toward addressing data incompleteness and quality issues by involving experts in model development. By using transparent workflows and making data freely available, this approach contributes to diminishing the duplication of efforts in data cleaning and modeling, thereby accelerating the use of modeling outcomes in basic and applied research. Building SDM in a collaborative way is the first step to create communities of practice that discuss and overcome the different caveats of SDM. A next step is to integrate stakeholders from governmental institutions, NGOs, and productive sectors to develop derived products from SDM that can be directly incorporated into decision-making.
Implications for Conservation
SDM is used to support decision-making in Latin America; however, we must increase the influence of SDM-derived products on decision-making. Boundary institutions between science and academia, such as CONABIO in Mexico (Koleff & Urquiza-Hass, 2011; Ochoa-Ochoa, Urbina-Cardona, Flores-Villela, Vázquez, & Bezaury-Creel, 2009; Urbina-Cardona & Flores-Villela, 2010), or Conservation International and Instituto Alexander von Humboldt in Colombia (Conservación Internacional-Colombia & Secretaría Distrital de Ambiente, 2010; Londoño-Murcia, Zárate, & Ruiz-Agudelo, 2011a; Londoño-Murcia et al., 2011b), have invested in the development of SDM as a critical input for (sub)national assessments of biodiversity patterns, conservation gap analysis, and conservation planning, as well as for the development of official cartography such as the National Ecosystem Map of Colombia and its use in biodiversity offsets (Ministerio de Ambiente y Desarrollo Sostenible, 2018) and species extinction risk assessments (Renjifo et al., 2016).
Non-governmental and other institutions in Latin America have used SDM for supporting species conservation programs and land-use planning. This is the case in Brazil, for example, in which the Ministry of Environment partners with the Chico Mendes Institute for Biodiversity Conservation to use SDMs to guide and propose actions to monitor and protect threatened species (in species recovery plans: Ferraz et al., 2012) and to identify priority areas for conservation and sustainable use of Brazil’s biodiversity (e.g., Ministério do Meio Ambiente, 2007). Use of SDM in Latin America has focused on the conservation of native species, but there are other areas such as control of invasive species (e.g., Sales et al., 2017), agriculture, and valuation of ecosystem services (Manhães et al., 2018) or ecological restoration (Zwiener et al., 2017), where SDM is still mainly an academic interest.
Although widely recognized, the potential of SDM for guiding field surveys (Raxworthy et al., 2003) so that data gaps are reduced has not yet been fully realized in Latin America to guide national monitoring or inventory schemes, perhaps due to persistent data limitations. In many cases, field work is done opportunistically, which raises some questions: Is the use of SDM to guide field expeditions limited by a lack of understanding, a gap in communication of results between academia and government, or a lack of confidence because of inherent uncertainties and bias? Alternatively, is there a deeper reason related to traditional institutional cultures? We argue that all of these issues operate simultaneously.
To use data science products more often in decision-making, both scientists and practitioners may need to shift paradigms. Scientists should be comfortable and transparent when communicating the uncertainty and bias of their products, and not be afraid of delivering the information as the best knowledge that they have available. They need to be aware that decisions will not wait for perfect science; many decisions will be taken with or without scientific evidence, and as such may benefit from the best information available on species distributions at the time of the decision-making process. On the other hand, practitioners should understand that science is a process and derived products are not perfect. As proposed by Redford, Groves, Medellín, and Robinson (2012), scientists and decision-makers should build knowledge together and exchange information on how science works and how real-life decisions are taken so that suitable products are generated on time to fit user needs. In this sense, we propose that communities of practice around SDM could integrate different visions from academia and government, increasing the credibility and use of models in decision-making scenarios.
Final Remarks
The results of this paper provide clear guidelines for initiatives such as BioBridge that aim to enhance the technical cooperation between parties of the Convention on Biological Diversity, and the International Union of Biological Sciences, where one of the objectives is to initiate, facilitate, and coordinate research, capacity building, and other scientific activities that involve international and interdisciplinary cooperation.
Although the scientific community is aware of data gaps, such as the Wallacean or Linnaean shortfalls (Lomolino, 2004), scientists on their own will not be able to address these gaps, as doing so requires a shared effort from governments and politicians to invest in data acquisition, management, and integration at national levels. The implementation of information infrastructures, long-term monitoring, and citizen science data is a priority in Latin America. Biodiversity and Ecological Informatics research fields need further support, for example, through new, targeted graduate academic programs in Latin America so that research and collaboration networks can evolve. Upscaling results from scientific journals to decision-makers is an urgent need.
We hope that addressing data gaps and using modeling techniques to generate information that supports decision-making will have a place in the CBD post-2020 agenda. Strategic partnerships within Latin American countries between academia and governments can also help to establish and scale up the use of SDM in decision-making, given its high potential.
Acknowledgments
We thank P. Ersts and P. Galante for assistance with MaxEnt download data and R. Anderson for helpful discussion. Prof. F. Huettmann and three anonymous reviewers provided helpful suggestions that improved this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: We thank John Nope and Sergio Cuellar from Direction of Innovation, Vicerectory of Research, Pontificia Universidad Javeriana. R. L.'s research is funded by CNPq (grant 306694/2018–2). This article is a contribution of the INCT in Ecology, Evolution and Biodiversity Conservation funded by MCTIC/CNPq/FAPEG (grant 465610/2014–5).