The Topic Overview tab contains a visual representation of the topic model used for this study. A topic model is a statistical model that uses machine learning to discover topics in a collection of documents and to estimate the probability that each document relates to each topic. The model generates these topics on the basis of word co-occurrence. After running the model, we sorted the topics into four categories: general research topics (using NSF categories), specific subfield topics, water budget topics, and method topics. These topics allow us to interpret which fields of water science have the most and the least comprehensive research, or in other words, which are the “bright” and “blind” spots of water science in Latin America.
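
As a rough illustration of how topics can emerge from word co-occurrence, the following is a minimal sketch of latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. This is an assumption made for illustration only: the description above does not say which topic-modeling algorithm the study used. The function name fit_lda, the toy documents, and the parameter values are all hypothetical.

```python
import random


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Illustrative only; not the study's actual model or settings.
    Returns (doc_topic, topic_word, vocab). Every row of doc_topic and
    topic_word is a probability distribution that sums to 1.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: document-topic, topic-word, and tokens per topic.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * n_words for _ in range(n_topics)]
    n_k = [0] * n_topics

    # Start by assigning every token to a random topic.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove this token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Resample: favor topics that are common in this document
                # and that already contain this word (co-occurrence signal).
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][v] + beta)
                    / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed estimates of P(topic | document) and P(word | topic).
    doc_topic = [
        [(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[t][v] + beta) / (n_k[t] + n_words * beta)
         for v in range(n_words)]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


# Hypothetical toy corpus: four tiny "abstracts", already tokenized.
docs = [
    "groundwater aquifer recharge aquifer wells".split(),
    "aquifer groundwater wells recharge".split(),
    "satellite imagery remote sensing satellite".split(),
    "remote sensing imagery satellite".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs)

# Show the three most probable words for each topic.
for t, row in enumerate(topic_word):
    top_words = [w for _, w in sorted(zip(row, vocab), reverse=True)[:3]]
    print(t, top_words)
```

Each row of doc_topic gives the estimated probability that a document relates to each topic, which is the kind of quantity the visualization summarizes. A real corpus would also need preprocessing (stop-word removal, stemming, and so on) that this sketch omits.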

Important note: the Spanish and Portuguese topic models rely on far smaller corpora than the English topic model below. Because of this, the Spanish and Portuguese models are less comprehensive, so the other visualizations on the platform rely on the topics generated from the English corpus.

Topic Overview

Topic Labels

This searchable table contains labeling information for individual topics in the model. Each topic label corresponds to a specific topic shown above. In total, there are 5 NSF general research topics (e.g. physical sciences) and 43 NSF specific topics representing subfields of research (e.g. geochemistry). Each topic is labeled with a number, theme, National Science Foundation (NSF) Specific topic, and NSF General topic, and has a brief description based on spatial scale, water budget (e.g. reservoirs), or methods (e.g. remote sensing). Irrelevant topics (“noise”) are unlabeled.

Topic Number | Topic Label | Theme | NSF Specific | NSF General | Description

Document to Topic

This tab displays the estimated number of articles in the corpus related to each topic, broken down by country. Select a topic from the drop-down menu below to see the predicted distribution of articles on that topic across countries.
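
One way such an estimate could be computed is to sum each article's probability for the selected topic within each predicted country. The sketch below assumes that approach; the description above does not state how the platform derives its counts, and the records, probabilities, and function name expected_counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (predicted country, topic probabilities per article).
articles = [
    ("Brazil", [0.75, 0.25, 0.0]),
    ("Brazil", [0.25, 0.5, 0.25]),
    ("Chile", [0.5, 0.25, 0.25]),
]


def expected_counts(articles, topic):
    """Sum the probability of `topic` over the articles of each country."""
    totals = defaultdict(float)
    for country, probs in articles:
        totals[country] += probs[topic]
    return dict(totals)


counts = expected_counts(articles, topic=0)
print(counts)  # {'Brazil': 1.0, 'Chile': 0.5}
```

Summing probabilities counts an article fractionally toward every topic it touches. An alternative would be to count each article once under its single top topic, which gives whole-number counts but ignores secondary topics.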


Article Listings

This searchable table contains information about the articles used in the model. With it, users can construct queries to find information about authors, publishers, and, in some cases, specific geographic features or areas. Articles whose top topic is irrelevant (“noise”) have no topic label.

Author(s) | Title | Year | Source | DOI | Predicted Country | Top Topic | Topic Label