The Geopolitics of Metadata: Knowing Panama Through the Biodiversity Heritage Library

In the first part of this blog series, I explained a portion of the analyses I performed during my time as an intern for the Biodiversity Heritage Library (July-August 2021). These analyses revolved around metadata patterns in BHL’s collection that highlight shortcomings in the diversification of the Library’s catalogue. In that post, by focusing on comparative metadata and the case of BHL México, I argued that an outreach plan that included the establishment of global partnerships between BHL and institutions in the Global South was a solid strategy to diversify the Library’s collections. The second portion of the analyses I performed during my internship, which I present here, supports the same argument. It deals primarily with patterns of association and representation in the subject lists of BHL’s materials and with the specific case of Central America and Panama.

The goal of the second part of my internship was thus to identify semantic patterns in subject lists that highlight the diversification—or lack thereof—in materials about Latin American biodiversity contained in BHL. To this end, I began my analyses with the extraction of the subject lists of those materials. In order to perform this extraction, and following, once more, the methodology employed by Chris Freeland (Freeland, steps 1–3), I cross-referenced BHL’s data file on items and that on subjects to match each title ID to the subjects it includes.[i] Finally, from the resulting dataset, I extracted the records of materials that included subjects related to Latin America to build six subsets:[ii]

  1. West Indies (WI): This subset contains all materials (228 title IDs) that include the subject/term West Indies in their subject list.
  2. Central-Latin-South America (CLS): This subset contains all materials (710 title IDs) that include at least one of these subjects/terms in their subject list: Central America, Latin America, South America. Of these three, South America is considerably more frequent at 528 occurrences,[iii] more than three times the occurrences of either of the other subjects (168 for Central America and 104 for Latin America).
  3. Central American Countries (CAC): This subset contains all materials (485 title IDs) that include at least one Central American country in their subject list: Belize, Guatemala, Honduras, Nicaragua, Costa Rica, Panama, El Salvador. Out of these countries, Panama is by far the most frequent subject (240 occurrences).
  4. South American Countries (except Brazil) (SAC): This subset contains all materials (1221 title IDs) that include at least one South American country (except Brazil) in their subject list: Argentina, Bolivia, Colombia, Chile, Ecuador, Guyana, (French) Guiana, Paraguay, Peru, Surinam/Suriname, Uruguay, Venezuela. Of these subjects, Peru, Chile, and Argentina are the most frequent, with over 230 occurrences each, followed by Colombia (193) and Ecuador (154). The remaining countries occur fewer than 100 times each, with Uruguay being the least frequent (25 occurrences).
  5. Brazil (BR):[iv] This subset contains all materials (543 title IDs) that include the subject/term Brazil in their subject list.
  6. Mexico (MEX): This subset contains all materials (917 title IDs) that include the subject/term Mexico in their subject list.
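The matching and subsetting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow actually used for the internship: the rows are invented sample data, the names `rows`, `SUBSET_TERMS`, `subjects_by_title`, and `build_subset` are hypothetical, and exact, case-insensitive matching of a whole subject string is an assumption about how a subject "includes" a term.

```python
from collections import defaultdict

# Hypothetical rows standing in for a (title ID, subject) table.
# The layout of BHL's actual data files is not described in this post.
rows = [
    (1, "Panama"),
    (1, "Canal Zone"),
    (2, "Costa Rica"),
    (2, "Birds"),
    (3, "Brazil"),
    (4, "West Indies"),
    (4, "Panama"),
]

# Terms that define two of the six subsets listed above.
SUBSET_TERMS = {
    "WI": {"West Indies"},
    "CAC": {"Belize", "Guatemala", "Honduras", "Nicaragua",
            "Costa Rica", "Panama", "El Salvador"},
}


def subjects_by_title(pairs):
    """Group subjects under the title ID they belong to."""
    grouped = defaultdict(set)
    for title_id, subject in pairs:
        grouped[title_id].add(subject.strip())
    return dict(grouped)


def build_subset(grouped, terms):
    """Keep titles whose subject list contains at least one of the terms."""
    wanted = {term.casefold() for term in terms}
    return {
        title_id: subjects
        for title_id, subjects in grouped.items()
        if any(subject.casefold() in wanted for subject in subjects)
    }


grouped = subjects_by_title(rows)
subsets = {name: build_subset(grouped, terms)
           for name, terms in SUBSET_TERMS.items()}

# Title IDs that fall into each subset.
print({name: sorted(subset) for name, subset in subsets.items()})
```

A title can belong to more than one subset, as title 4 does here, which is consistent with the subsets being defined by subject terms rather than being mutually exclusive.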

As I explained in my previous blog post, the geopolitics of publication and housing of materials in BHL’s catalogue reflect an unequal and still colonial relationship between the Global South and North. In addition to this, I argue that a lack of diversification in terms of knowledge production results in unequal representations and valorization of the Global South and North that perpetuate colonial dynamics in biodiversity-related knowledge production. To test this thesis through subject lists, I generated frequency lists, co-occurrence networks, and hierarchical clusters using KH Coder 3 for each of the previously mentioned subsets.[v] By taking a closer look at the patterns of subject lists in the subsets described above and revealed by these text analyses, it becomes clearer how a colonial niche of biodiversity-related publications can lead to biased and colonial representations of biodiversity and communities from the Global South. That is the case, for instance, of materials in BHL that focus on the biodiversity of Central America and, specifically, Panama.
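For readers unfamiliar with these text analyses, the following Python sketch shows the general idea behind a frequency list and a co-occurrence measure. It does not reproduce KH Coder's internal procedure. The subject lists are invented, and the Jaccard coefficient is used here as one common similarity measure for co-occurrence, which is an assumption rather than a setting reported in this post.

```python
from collections import Counter
from itertools import combinations

# Hypothetical subject lists, one per title ID (not real BHL records).
subject_lists = [
    {"Panama", "Canal Zone", "Birds"},
    {"Panama", "Canal Zone"},
    {"Panama", "Field notes"},
    {"Costa Rica", "Birds"},
]

# Frequency list: in how many subject lists each term appears.
term_freq = Counter(term for subjects in subject_lists for term in subjects)

# Pair counts: in how many subject lists two terms appear together.
# Sorting each list first gives every pair a single canonical order.
pair_freq = Counter(
    pair
    for subjects in subject_lists
    for pair in combinations(sorted(subjects), 2)
)


def jaccard(a, b):
    """Lists containing both terms divided by lists containing either."""
    both = pair_freq[tuple(sorted((a, b)))]
    either = term_freq[a] + term_freq[b] - both
    return both / either if either else 0.0


print(term_freq["Panama"])
print(round(jaccard("Panama", "Canal Zone"), 3))
```

In a co-occurrence network, terms become nodes and pairs with a sufficiently high score become edges, so two terms that almost always appear in the same subject lists end up closely connected.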

As previously mentioned, in the CAC subset, Panama is the most frequent subject with 240 occurrences, followed by the compound field note with 188 and Costa Rica with 99 (Figure 1). These numbers suggest that Panama is the most studied Central American country in BHL’s materials. Furthermore, when looking at the subject lists in which it appears, it is possible to note that it is often associated with the Canal Zone and the United States, terms that are significantly frequent in this subset as well (Figure 2).

List of subject terms and rate of frequency in data set

Figure 1 The 20 most frequent words in the CAC subset. Generated on KH Coder 3 in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

 

List of subject terms with Panama occurrences highlighted in red

Figure 2 Sample of subject lists that include the word Panama. Generated on KH Coder 3 in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

These initial observations about word frequencies and associations in the CAC subset led to the hypothesis that the frequency of Panama in this subset is due to the biological (Smithsonian Tropical Research Institute, ‘Why Is the Smithsonian in Panama?’) but also economic importance of the country and the Central American region, especially given its strong association with US-related subjects. For instance, in the co-occurrence network and hierarchical clustering of the CAC subset, it is possible to identify a strong relationship between Panama and the United States. The most intricate subgraph in the co-occurrence network is built around Panama and includes nonhuman species and biodiversity-related topics (mammalogy, ethology, Animalbehavior, Ornithology) alongside US-related subjects and institutions (UnitedStates, NationalMuseum), as well as the compound Canal Zone (Figure 3). Given these observations, and to further test this hypothesis, I repeated the extraction and selection process to create a subset specifically for this country, the PAN subset.[vi] The goal of this additional subset was to verify whether these US-centric interests and findings around Panama also manifest in the rest of the metadata of records containing the subject Panama.

Color coded co-occurrence network map of subjects in a data set

Figure 3 Co-occurrence network of the CAC subset. The overlapping of subgraphs 01 and 06 shows the strong association between Panama and the United States. Generated on KH Coder 3 in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

As I explained in the first part of this blog series, mapping the places of publication of subject-based subsets of materials in BHL can help visualize the relationships between the Global South and North as both objects and subjects of knowledge production. When mapping the PAN subset, however, I noticed that about 68% of the materials (125 out of 185 title IDs) had no identified place of publication. When looking more closely at each of these materials, I found that almost all of them[vii] were handwritten field notes. Therefore, I examined each of these titles to identify the origin of the field notes. While their place of publication cannot be identified because these materials are not publications per se, identifying the affiliation of the field work and expeditions that originated them helped illuminate more clearly the geopolitics of biodiversity-related knowledge production about Panama. For instance, out of the 125 materials with no identified place of publication, 76 (60.8%) are field notes resulting from US-based expeditions to Central America,[viii] meaning that such expeditions constitute the predominant point of origin of materials in the PAN subset[ix] (Figure 4). Looking at these details for each of the records with no identified place of publication in the PAN subset is particularly important because the hierarchical clustering for the CAC subset also shows a significantly strong connection between the subjects Panama and field note(s) (Figure 5). Given that the majority of field notes with no identified place of publication in the metadata are the product of US-based expeditions to Central America, this connection further emphasizes the determining role that US interests play in the presence and representation of Panama in BHL.

Bar chart listing number of titles by place of publication in a data set

Figure 4 Number of titles per place of publication in the PAN subset (with cleaned data for missing fields). Generated on Tableau in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

Hierarchical cluster of subject terms in a data set

Figure 5 Hierarchical clusters for the CAC subset. Cluster 8 (pink) shows a strong association between Panama and field note(s). Generated on KH Coder 3 in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

Of the remaining materials, seven (5.6%) are affiliated with the Smithsonian Environmental Research Center (with its headquarters in Chesapeake Bay) and the Smithsonian Migratory Bird Center (in Washington, D.C.). The latter is particularly interesting, as it could contribute, at least in part, to the importance of the subject ornithology in connecting the subject Panama with the subgraph built around the term United States in the CAC subset (Figure 3, subgraph 06). Finally, the remaining 41 materials with no identified place of publication are affiliated with the Smithsonian Tropical Research Institute (STRI). What is particularly notable about the STRI is that it is located in Ancón, Panama (Smithsonian Tropical Research Institute, ‘Why Is the Smithsonian in Panama?’). This means that even if the STRI is affiliated with the Smithsonian and, therefore, the United States, the knowledge production about Panama sponsored by the Institute could be seen as an epistemic collaboration between researchers from Panama, the US, and beyond (Smithsonian Tropical Research Institute, Biological and Cultural Diversity of the Tropics 4). Nevertheless, even when the place of publication of STRI-affiliated documents is taken to be Ancón, the knowledge production about Panama and her biodiversity as housed in BHL continues to be greatly dominated by the United States (Figure 6).

Global map showing places of publication in a data set

Figure 6 Map showing places of publication (density per number of records) in the PAN subset (with cleaned data for missing fields). Generated on Tableau in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

Other than places of publication showing a predominance of the United States as the main producer of the knowledge about Panama available in BHL, the year of these publications[x] is particularly important to further evidence the impact of US politics and interests in the production of these materials, especially given the history of the US construction and administration of the Panama Canal throughout the 20th Century. After its beginnings as a French project in the late 19th Century, the construction of the Panama Canal was initiated by the US in 1904 and concluded in 1914 (Autoridad del Canal de Panamá). The concession of this region was a result of US participation in “the separatist movement in Colombia [after which] the separatists achieved in breaking the province of Panama from Colombia [and] the United States was awarded with what was to become the Panama Canal Zone” (‘Panama Canal Zone in World War II’). The US control of the Canal continued until 1979, when it was transferred to “the Panama Canal Commission, a joint agency of the United States and the Republic of Panama” (Padelford). It was not until 1999 that the Canal’s administration was left exclusively in the hands of Panama (Autoridad del Canal de Panamá).

Line chart indicating number of titles per year in data set

Figure 7 Distribution per year of titles in the PAN subset (with cleaned data for missing fields). Generated on Tableau in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

Mirroring this historical context, 161 out of the 185 records in the PAN subset (87%) were published between the years 1903 and 1996, that is, during the US administration of the Panama Canal, with the peak occurring in the late fifties (Figure 7). Furthermore, out of the 16 materials in the PAN subset published prior to the 20th Century, only six are affiliated with the US: four published in New York and two being the product of US-based expeditions. In contrast, the other 74 of the 76 materials that are the product of US-based expeditions (as previously explained) were published between 1910 and 1983, meaning that 97.4% of these expeditions occurred during US control of the Panama Canal (Figure 8). Moreover, the STRI itself, in spite of its global collaboration outlook, is “closely tied to the construction of the Panama Canal” as it resulted from “Smithsonian scientists and naturalists across the United States urg[ing] U.S. President Theodore Roosevelt to support a biological expedition to take an inventory of the future Canal Zone’s flora and fauna” (Smithsonian Tropical Research Institute, ‘Why Is the Smithsonian in Panama?’). Therefore, given these patterns of publication across time, alongside the importance of US expeditions and the STRI in the production of the knowledge contained in BHL and the semantic associations and word frequencies of subject lists in the PAN subset, it is possible to conclude that biodiversity-epistemologies about Panama in the Library are greatly subsumed under US history, politics, and economic interests.

Timeline listing year and place of publication

Figure 8 Chronological distribution per place of publication of titles in the PAN subset (with cleaned data for missing fields). Generated on Tableau in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

In addition to these categories of metadata showing the dominance of US-centric and Global-North-centric epistemologies and interests in BHL’s materials about Panama, it is essential to consider the language of the materials, another fundamental category in terms of the diversification not only of representation but also of access and audiences, as I have argued elsewhere.[xi] In this regard, out of the 185 records in the PAN subset, only seven (3.78%) are in languages other than English,[xii] of which only three (1.62%) are in Spanish, the official language of Panama (Zajícová 185) (Figure 9).

Color coded bar chart depicting number of titles per language in a data set

Figure 9 Number of titles per language in the PAN subset (with cleaned data for missing fields). Generated on Tableau in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.
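The percentages quoted in this post follow directly from the record counts given in the text. As a quick arithmetic check, a minimal Python sketch (the helper name `share` and the dictionary labels are mine, used only for illustration) recomputes them:

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# (part, whole) pairs taken from the counts reported in the text.
counts = {
    "no identified place of publication, of the PAN subset": (125, 185),
    "US-based expedition field notes, of those 125": (76, 125),
    "published 1903-1996, of the PAN subset": (161, 185),
    "US-based expedition materials dated 1910-1983": (74, 76),
    "not in English, of the PAN subset": (7, 185),
    "in Spanish, of the PAN subset": (3, 185),
}

for label, (part, whole) in counts.items():
    print(f"{label}: {share(part, whole)}%")
```

Each figure is a share of a different total (185 records in the PAN subset, 125 records without a place of publication, or 76 expedition materials), so the percentages should not be added together.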

The three materials in the PAN subset that are written in Spanish are James Zetek’s Los moluscos de la República de Panamá (1918), the Smithsonian Tropical Research Institute’s Panamá, puente biológico: Las Charlas Smithsonian del Mes (2001),[xiii] and Rafael Tobías Marquís Oropeza’s Algunas palmeras industriales de la flora istmeña (1908). The first two, despite being published in Panama, are affiliated with the US through their authors and holding institutions. These two materials are held by the Smithsonian Libraries, with the STRI also being the author of the second one. Likewise, James Zetek (1886-1959) was a US entomologist who worked for the US Department of Agriculture and was in charge of research expeditions and international relations with Central America, especially the Canal Zone in Panama, during both world wars (Snyder et al. 1230–31). This record thus further evidences the importance of the history of the Canal and US interests in the materials in the PAN subset. In contrast, the third record is a unique example that counteracts US-centric tendencies. Marquís Oropeza was a Venezuelan scientific philosopher and agronomist who acted as the first Director of the Museo Nacional de Panamá (Moreno 169–70). Therefore, even though this work is held by the New York Botanical Garden, Marquís Oropeza’s book is the only record in the PAN subset (out of 185) that is written in Spanish and where the knowledge production took place in Panama and independently from the US and Europe. Moreover, this work highlights biodiversity-related collaboration within the Global South, further acting as an example of epistemic agency in non-hegemonic spaces.

Nevertheless, the evident shortcoming of the record by Marquís Oropeza is its rarity, which, given the observations made through the analysis of the CAC and PAN subsets explained so far, turns it into a one-case exception. The goal of the diversification of BHL’s collection should then be to make these cases the rule and not the anomaly, an objective that could be achieved through a more thorough incorporation of non-hegemonic materials and the establishment of more BHL nodes across the Global South. In this sense, the findings explained throughout this blog post reveal the need for BHL to strengthen bonds with institutions in the Global South, not only to achieve a truly global repository but to diversify representation, a particularly pressing matter in topics related to colonial and (neo)imperial issues, such as the case of Panama. The overwhelming predominance of US-centric narratives and perspectives in the knowledge production about Panama and her biodiversity in BHL’s catalogue reveals the urgent need to diversify, decentralize, and decolonize bio-diverse epistemologies. As I argued in my previous blog post, a strong collaboration with institutions in the Global South, as exemplified by BHL México, can lead to more diverse collections, not only in terms of language but also of semantic associations. In this regard, for example, when comparing the co-occurrence networks of the CAC and MEX subsets, it is possible to note that the MEX subset presents an arguably more diversified network of subjects (Figure 10). For instance, it shows a wider variety of both nonhuman species and groups and of geographical regions. Furthermore, while subjects and locations related to the United States are still present (subgraphs 01, 03, and 08), their connection to the subject Mexico itself is weaker than in the case of the CAC subset and the subject Panama (Figure 3).

Thus, given that a large number of materials in the MEX subset are the product of BHL México, I continue to argue that a more solid, profound, meaningful, and critical strategy for the establishment of BHL partnerships throughout the Global South can lead to a truly global, open, decolonial, and bio-diverse BHL, deeply in line with the Library’s goals and mission.

Color coded co-occurrence network map of subjects in a data set

Figure 10 Co-occurrence network of the MEX subset. Generated on KH Coder 3 in September 2021. Data from https://www.biodiversitylibrary.org/data as of July 1st, 2021.

 

[i] This process included all materials in BHL—regardless of subject—as of July 1st, 2021. The full list of title IDs and subjects can be found in BHL’s GitHub repository.

[ii] Tables for these subsets can be found in BHL’s GitHub repository. Due to the length constraints and nature of this blog post, not all subsets are discussed in detail. However, the files are publicly available, and the author can be contacted at any point with any further questions or inquiries.

[iii] Word frequency tables for each subset can be found in BHL’s GitHub repository.

[iv] Both Brazil and Mexico were treated as separate subsets given that a large portion of materials about Brazilian and Mexican biodiversity in BHL comes from the BHL SciELO and BHL México projects respectively. Thus, considering these materials as separate subsets can illuminate the impact of global collaboration in the diversification of the Library’s collection.

[v] All graphs for each subset can be found in BHL’s GitHub repository.

[vi] All files (subject lists, word frequency, co-occurrence network, hierarchical clustering) for this subset can be found in BHL’s GitHub repository.

[vii] The one exception is Dr. William Mark Whitten’s Ph.D. thesis (1985) at the University of Florida.

[viii] The affiliations of these expeditions vary but are mostly associated with academic and research institutions throughout the United States, such as the New York Botanical Garden and Harvard University, and government agencies, such as the US Navy and the US Department of Agriculture.

[ix] Tables for the PAN subset with both raw data and data cleaned for missing fields (places of publication, year, and language) can be found in BHL’s GitHub repository. In the file with cleaned data, previously missing fields are highlighted in yellow.

[x] Missing fields of year of publication in the PAN subset (see previous note) were fixed by looking into other metadata categories (such as the start date for periodicals and the publication details) and verified with the material itself.

[xi] I discussed the importance of multilingual representation in the Library for the diversification of access in more detail in a previous blog post and in my presentation for BHL Day 2021.

[xii] Missing fields of language in the PAN subset (see note ix) were mostly for albums and field notes. These fields were fixed by considering the language of notes, comments, introductions, and/or descriptions/captions for images or photographs, which were all in English.

[xiii] This is the one material affiliated with the STRI in the PAN subset that is not written in English, meaning that almost all materials affiliated with the STRI are in English, which, again, hinders the collaborative nature of their research endeavours.

Works cited

Autoridad del Canal de Panamá. ‘Reseña Histórica del Canal de Panamá’. Canal de Panamá, https://micanaldepanama.com/historia-del-canal/resena-historica-del-canal-de-panama/. Accessed 5 Oct. 2021.

Freeland, Chris. ‘BHL Poster for AETFAT2010’. ChrisFreeland, 19 Apr. 2010, http://blog.chrisfreeland.com/2010/04/.

Moreno, Hiram A. ‘Tras las elusivas huellas de Rafael T. Marquís Oropeza. El primer Director del Museo Nacional de Panamá’. Canto Rodado, vol. 10, 2015, pp. 163–75.

Padelford, Norman J. ‘Panama Canal’. Encyclopedia Britannica, 2005, https://www.britannica.com/topic/Panama-Canal.

‘Panama Canal Zone in World War II’. WW2DB – World War II Database, https://ww2db.com/country/panama_canal_zone. Accessed 5 Oct. 2021.

Smithsonian Tropical Research Institute. Biological and Cultural Diversity of the Tropics: Extraordinary Opportunities and Challenges in the Midst of the Anthropocene. Strategic Plan 2019-2024. Smithsonian Tropical Research Institute, 2019, https://stri.si.edu/sites/default/files/stri_strategic_plan_aug_2019.pdf.

—. ‘Why Is the Smithsonian in Panama?’ Smithsonian Tropical Research Institute, Smithsonian Tropical Research Institute, 19 Dec. 2016, https://stri.si.edu/why-panama.

Snyder, Thomas, et al. ‘James Zetek. 1886-1959’. Journal of Economic Entomology, vol. 52, no. 6, 1959, pp. 1230–32, https://doi.org/10.1093/jee/52.6.1230.

Zajícová, Lenka. ‘Lenguas Indígenas En La Legislación de Los Países Hispanoamericanos’. Onomázein. Revista de Lingüística, Filología y Traducción, 2017, pp. 171–203, https://doi.org/10.7764/onomazein.amerindias.10.


Lidia Ponce de la Vega is a Ph.D. Candidate in Hispanic Studies at McGill University. She holds an Honours Bachelor of Arts (Gabino Barreda Medal) in Hispanic Language and Literature from the National Autonomous University of Mexico, and a Master of Arts in Hispanic Studies from McGill University. In her research, she explores topics of digital archives, archival practices, and decolonisation of online epistemologies in their intersection with ecocriticism and interspecies relationships.