Main aim of the study
The main objective of this analysis is to characterize and classify the research areas of the Citizen Science projects harvested in the CS Track project database.
This study extends the earlier analysis published as “What are the predominant research areas in citizen science projects?”
Period addressed by the study
The following analysis was conducted using data retrieved from the CS Track database on 2022/09/15. At that point, the database contained 4949 CS project records (including both English and non-English descriptions).
The CS Track database has been built up gradually, and 4949 projects have been documented. Figure 1 shows an overview of the database, with the main data on website countries, project description languages, number of platforms, distribution of website types and distribution of website countries.
Moreover, it should be noted that for some projects the related descriptions were derived from a number of platforms, not from a single one. Hence, when conducting this analysis, multiple platform assignments were taken into account as described in the example below.
- Wp2 ID (Platform ID): [“9” “89”]
- Project Title: Fossilfinder
- Note: In the above example, the description of the “Fossilfinder” project was retrieved from both platform 9 and platform 89 (composite assignment of platforms). In any further analysis at the platform level, “Fossilfinder” was therefore counted under both platform 9 and platform 89.
In total, there were 94 projects whose descriptions were retrieved from more than one platform.
Moreover, there were 5 CS projects in the database without a platform assignment, namely: 1) Community Based System Dynamics (CBSD); 2) You + ME Registry and Biobank; 3) STEM+A@Astronomy; 4) SOCIÉTÉ FRANÇAISE POUR L’ÉTUDE ET LA PROTECTION DES MAMMIFÈRES (SFEPM); 5) Where? Where? Wedgie! Those projects were excluded from the following analysis.
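The handling of composite platform assignments described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the record layout and the field names `title` and `platform_ids` are assumptions, and only “Fossilfinder” with platforms 9 and 89 comes from the text.

```python
from collections import defaultdict

# Hypothetical records; only "Fossilfinder" and its platform IDs come from the text.
projects = [
    {"title": "Fossilfinder", "platform_ids": ["9", "89"]},
    {"title": "Example project A", "platform_ids": ["9"]},
    {"title": "Example project B", "platform_ids": []},  # no platform assignment
]


def group_by_platform(records):
    """Count each project once under every platform it was retrieved from.

    Projects with an empty platform list never enter the result,
    which mirrors their exclusion from the platform-level analysis.
    """
    by_platform = defaultdict(list)
    for record in records:
        for platform_id in record["platform_ids"]:
            by_platform[platform_id].append(record["title"])
    return dict(by_platform)


by_platform = group_by_platform(projects)

# Projects whose description was retrieved from more than one platform.
multi_platform = [r["title"] for r in projects if len(r["platform_ids"]) > 1]
```

With this approach a project such as “Fossilfinder” appears under both of its platforms, so platform-level totals can exceed the number of distinct projects.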
Research Methods applied
CS projects are classified into the following 5 main research areas, using the algorithm for classifying CS information into research areas that was proposed in the context of WP3:
- Arts & Humanities
- Life Sciences & Biomedicine
- Physical Sciences
- Social Sciences
- Technology
Each of the 5 main research areas consists of a number of related sub-research areas. The complete taxonomy is available at https://images.webofknowledge.com/images/help/WOS/hp_research_areas_easca.html
In the CS Track database, the research area assignment for each project is recorded as shown in these examples:
- Project Title: Penn State Astrobiology Citizen Science Project
- Project Description: [“We want to study the biogeography of microorganisms by taking water samples from domestic water heaters. Participants will acquire a water sample from their kitchen tap and answer 20 questions. The process will take ~30 minutes. We are recruiting 2-3 households per state. By looking at the genetic differences from isolates of similar microbes from across the globe, researchers are currently trying to understand the degree to which populations of microbes are isolated and whether this isolation suggests an allopatric speciation model for prokaryotes. We are still looking for participants in: AL, AK, DE, DC, KS, KY, ME, MA, NH, NM, ND, RI, SC, SD, TN, VT.” “Sign up to participate: http://www.scienceforcitizens.net/PSARC”]
- Research Areas: [“Physical Sciences, Water Resources, 0.6778448864490314”]
- Interpretation: In the above example, the Penn State Astrobiology project has been assigned a single main research area, “Physical Sciences”, and the sub-research area “Water Resources”. The similarity score for this assignment is approximately 0.68.
- Project Title: Great Lakes Worm Watch
- Project Description: [“The Great Lakes Worm Watch needs citizen scientists to conduct earthworm surveys in forests and other habitats anywhere in North America.” “The project website provides instructions and data sheets for conducting your own earthworm, habitat, and soil surveys in the “Conduct your Own Surveys” section: http://greatlakeswormwatch.org/team/conduct.html” “If you feel you need more help in designing a study, you can contact the project coordinators with particular questions at:”]
- Research Areas: [“Social Sciences, Archaeology, 0.35307771700172963” “Life Sciences & Biomedicine, Limnology, 0.2888119357350136”]
- Interpretation: In the above example, the Great Lakes Worm Watch project has been assigned two research areas and sub-research areas. The assignment “Social Sciences” [main research area] and “Archaeology” [sub-research area] received a similarity score of 0.35, higher than the 0.29 received by the other assignment, “Life Sciences & Biomedicine” [main research area] and “Limnology” [sub-research area].
- Note: In the following sections, the research area allocation results consider only the assignment with the highest similarity score.
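The rule of keeping only the highest-scoring assignment can be illustrated with a short sketch. It assumes each entry is a single string of the form “main area, sub-research area, score”, as in the examples above; the function names are hypothetical and are not taken from the project's notebook.

```python
def parse_assignment(entry):
    """Split 'main area, sub-area, score' into a (main, sub, score) tuple.

    The main area is taken up to the first comma and the score after the
    last one, so a sub-area name that itself contains a comma survives.
    """
    main_area, rest = entry.split(", ", 1)
    sub_area, score = rest.rsplit(", ", 1)
    return main_area, sub_area, float(score)


def best_assignment(entries):
    """Return the assignment with the highest similarity score, or None."""
    if not entries:
        return None
    return max((parse_assignment(e) for e in entries), key=lambda a: a[2])


# The two assignments recorded for the Great Lakes Worm Watch example.
worm_watch = [
    "Social Sciences, Archaeology, 0.35307771700172963",
    "Life Sciences & Biomedicine, Limnology, 0.2888119357350136",
]
best = best_assignment(worm_watch)
```

Here `best` holds the “Social Sciences” / “Archaeology” assignment, matching the interpretation given for this project.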
The research area classification results are reported considering the following three questions:
- What is the distribution of research areas at the project level, considering the 5 main research areas listed?
- In each research area, what is the most common sub-research area?
- What is the distribution of research areas at the platform level?
The data was preprocessed in order to answer these questions. In total, 100 records had missing values in the “Research Areas” field, so the following analysis ultimately considered 4849 records.
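The preprocessing step amounts to dropping records without a research area assignment. A minimal sketch, assuming a hypothetical field name `research_areas` and toy records:

```python
def drop_missing_research_areas(records):
    """Keep only records whose research-area list is present and non-empty."""
    return [r for r in records if r.get("research_areas")]


# Toy records (hypothetical) covering the two ways a value can be missing.
records = [
    {"title": "Project A", "research_areas": ["Physical Sciences, Water Resources, 0.68"]},
    {"title": "Project B", "research_areas": []},
    {"title": "Project C", "research_areas": None},
]
kept = drop_missing_research_areas(records)

# Record counts reported in the text: 4949 in total, 100 without research areas.
remaining_in_study = 4949 - 100
```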
Summary of results/findings
What is the distribution of research areas at the project level, considering the 5 main research areas listed?
Figure 2 below shows the research area assignment for the 4849 projects considered.
As Figure 2 shows, the majority of projects have been assigned to the “Life Sciences & Biomedicine” category (2492 projects), followed by the “Technology” category (965 projects) and the “Social Sciences” category (671 projects). Fewer projects have been assigned to the “Physical Sciences” category (434 projects) and the “Arts & Humanities” category (257 projects). A further 30 projects have not been assigned to any of the 5 main research areas in the dataset analysed (and were indicated using ).
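As a consistency check, the counts reported above can be summed and converted to shares of the 4849 records. The numbers below are taken from the text; the dictionary key “Unassigned” is a label chosen here for illustration only.

```python
# Project counts per main research area, as reported in the text.
counts = {
    "Life Sciences & Biomedicine": 2492,
    "Technology": 965,
    "Social Sciences": 671,
    "Physical Sciences": 434,
    "Arts & Humanities": 257,
    "Unassigned": 30,  # illustrative label for projects without a main area
}

# The categories should add up to the number of records analysed.
total = sum(counts.values())

# Share of each category in percent, rounded to one decimal place.
shares = {area: round(100 * n / total, 1) for area, n in counts.items()}
```

The total comes to 4849, which agrees with the number of records retained after preprocessing.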
In each research area, what is the most common sub-research area?
To answer this second question, we took the highest-scoring assignment of each project (see the second example above) and counted the sub-research areas within each of the 5 main research areas. Because each research area has a large number of sub-research areas, this section reports only the top 3 sub-research areas for each research area.
As Table 1 shows, in the “Life Sciences & Biomedicine” research area the most common sub-research area is “Biodiversity & Conservation” (682 projects). In “Technology”, most projects relate to “Remote Sensing” (393 projects), and in “Social Sciences” the most common sub-research area is “Education & Educational Research” (121 projects). In “Physical Sciences”, the most common sub-research area is “Water Resources” (165 projects), and in “Arts & Humanities” it is “History & Philosophy of Science” (122 projects).
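A top-3 ranking of this kind can be computed with a counter per main research area. The sketch below uses a handful of made-up (main area, sub-research area) pairs, one per project; it illustrates the counting step only and does not reproduce the figures in Table 1.

```python
from collections import Counter, defaultdict


def top_sub_areas(assignments, n=3):
    """Return the n most frequent sub-areas for every main research area.

    `assignments` holds one (main_area, sub_area) pair per project,
    i.e. the highest-scoring assignment of that project.
    """
    per_area = defaultdict(Counter)
    for main_area, sub_area in assignments:
        per_area[main_area][sub_area] += 1
    return {area: counter.most_common(n) for area, counter in per_area.items()}


# Hypothetical toy data, not the CS Track records.
toy_assignments = [
    ("Technology", "Remote Sensing"),
    ("Technology", "Remote Sensing"),
    ("Technology", "Computer Science"),
    ("Physical Sciences", "Water Resources"),
]
ranking = top_sub_areas(toy_assignments)
```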
What is the distribution of research areas at the CS platform level?
In the following, we report the research area allocation at the platform level. In total, the CS project descriptions were derived from 59 CS platforms, so we report the results for a selected list of 5 platforms, shown in Table 2. The criteria for the selection were:
- CS platforms that allow European citizens to participate online
- CS platforms that cover Europe as a whole
- CS platforms for specific European countries
- CS platforms for specific European regions
- CS platforms that are actively involved in the promotion of CS (to measure this, we explored how often they update their content)
|Name of the Platform||Platform URL|
|EU citizen science||https://eu-citizen.science/projects|
|Citizen Science Vlaanderen||https://www.scivil.be/en/projects|
|Ciencia Ciudadana España||https://ciencia-ciudadana.es/proyecto-cc/|
In the following, we present the results of the research area assignment to CS projects for the 5 platforms listed in Table 2. As Table 3 shows, all 5 platforms include a high number of projects related to the Life Sciences & Biomedicine category. In general, the platforms include fewer projects related to the Physical Sciences and Arts & Humanities categories.
|Name of the platform||No. of projects related to Life Sciences & Biomedicine||No. of projects related to Technology||No. of projects related to Social Sciences||No. of projects related to Physical Sciences||No. of projects related to Arts & Humanities|
|EU citizen science||90||34||24||12||12|
|Citizen Science Vlaanderen||10||4||2||2||1|
|Ciencia Ciudadana España||44||66||46||12||16|
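The counts in Table 3 can be turned into per-platform percentages as sketched below. Only the rows given in the table are used, and each share is computed relative to the sum of the five listed categories for that platform, which is an assumption made here because per-platform totals are not stated.

```python
AREAS = [
    "Life Sciences & Biomedicine",
    "Technology",
    "Social Sciences",
    "Physical Sciences",
    "Arts & Humanities",
]

# Project counts per platform, in the column order of Table 3.
table3 = {
    "EU citizen science": [90, 34, 24, 12, 12],
    "Citizen Science Vlaanderen": [10, 4, 2, 2, 1],
    "Ciencia Ciudadana España": [44, 66, 46, 12, 16],
}


def platform_shares(row):
    """Convert one row of counts into percentages of the row total."""
    row_total = sum(row)
    return {area: round(100 * n / row_total, 1) for area, n in zip(AREAS, row)}


shares_by_platform = {name: platform_shares(row) for name, row in table3.items()}
```

Expressing the rows as percentages makes platforms of very different sizes comparable, for example EU citizen science and Citizen Science Vlaanderen.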
All the results presented can be downloaded from these open repositories:
- Zenodo URL: https://zenodo.org/record/7310341#.Y2zhgXaZNPY
- Link to GitHub: https://github.com/CS-Track-Code/project-categorization/blob/main/research_areas_assignment.ipynb