SeedLearn
SeedLearn is an interdisciplinary project developing AI tools to identify tropical tree seedlings using field images, botanical knowledge, and ecological data.
Overview
The seedling stage is one of the most difficult life stages at which to identify tropical trees, yet seedlings are critical for understanding forest regeneration, biodiversity, and restoration outcomes. SeedLearn addresses this challenge by combining expert-curated ecological datasets with multimodal artificial intelligence approaches.
Capparidastrum frondosum (Capparaceae)

Research Team
SeedLearn is an interdisciplinary collaboration integrating ecology, artificial intelligence, and computer vision.
- Nohemi Huanca-Nunez — tropical forest ecology and integration of ecological knowledge into AI systems
- Liza Comita — tropical forest ecology and long-term forest datasets
- Helene Muller-Landau — forest ecology and trait data integration
- Arman Cohan — computer vision and AI methods
- Holly Rushmeier — computer graphics and visual computing
- Mitch Horn — AI and data science development, modeling pipeline
- Kaili Liu — multimodal AI and knowledge integration
- Luke Browne — ecological data processing and data integration
Why This Matters
Tropical forests can contain over 300 tree species per hectare, and many seedlings look nearly identical despite belonging to different species with distinct ecological roles.
This creates a major bottleneck for:
- biodiversity monitoring
- forest restoration
- ecological research
Current AI systems are typically trained on internet images and are not designed for these highly complex and data-limited environments. SeedLearn aims to bridge this gap.
The Challenge

Examples of tropical tree seedlings included in the SeedLearn image dataset, from the families Acanthaceae, Melastomataceae, Fabaceae, and Rubiaceae.
Dataset
The project is built on a curated dataset of tropical seedling images collected through long-term ecological research.
- thousands of images of individual seedlings
- multiple images per individual
- broad taxonomic coverage across species, genera, and families
- expert-validated identifications
These data provide a unique foundation for developing and evaluating AI models in real-world ecological settings.
Current Progress
- curated and organized a large seedling image dataset
- developed initial AI modeling pipelines
- continuing model development and evaluation
Project Support
This project is supported by the 2025 Yale AI Seed Grant, which enabled the initial development of the SeedLearn pipeline for AI-based seedling identification.
Contact
If you are interested in collaboration, datasets, or applications of this work, please feel free to reach out.
Nohemi Huanca-Nunez
nohemi.huanca@yale.edu
