
Given a binomial species name, this function optionally retrieves metadata from Wikipedia (summary, taxonomy, image, color palette) and joins matching trait data from a TRY-format table or a user-provided trait table. Fuzzy matching is applied to both TRY and local tables so that minor spelling or naming mismatches still resolve to the intended species.

Usage

get_trait_data(
  species,
  remove_bg = FALSE,
  do_palette = TRUE,
  do_taxonomy = TRUE,
  do_summary = TRUE,
  do_image = TRUE,
  bg_thresh = 80,
  green_delta = 20,
  n_palette = 5,
  preview = FALSE,
  save_folder = NULL,
  use_try = FALSE,
  try_data = NULL,
  trait_species_col = "AccSpeciesName",
  local_trait_df = NULL,
  local_species_col = "species",
  max_dist = 1
)

Arguments

species

Character. Species name (binomial, e.g. "Acacia karroo").

remove_bg

Logical. Remove green/white backgrounds from the Wikipedia image? (default: FALSE)

do_palette, do_taxonomy, do_summary, do_image

Logical. Control which metadata to scrape (default: TRUE for all).

bg_thresh

Integer. Brightness threshold for white background removal (default: 80).

green_delta

Integer. Amount by which the green channel must exceed the red and blue channels for a pixel to be treated as green background (default: 20).

n_palette

Integer. Number of colors to extract for palette (default: 5).

preview

Logical. Show the image after processing? (default: FALSE)

save_folder

Character or NULL. If non-NULL, the processed image is saved as a PNG in this folder (default: NULL).

use_try

Logical. If TRUE, join plant traits using a TRY-format database/table (default: FALSE).

try_data

Character (path) or data.frame. Path to a TRY file, or a data frame containing TRY trait data (default: NULL).

trait_species_col

Name of species column in TRY trait table (default: "AccSpeciesName").

local_trait_df

Optional. Data.frame of local trait data (can be any species-trait table).

local_species_col

Name of species column in local trait table (default: "species").

max_dist

Numeric. Maximum distance for fuzzy join (Levenshtein/Jaro-Winkler; default: 1).
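
To illustrate what a distance cutoff such as max_dist = 1 means in practice, here is a minimal sketch of edit-distance matching in base R. It is not the package's implementation (which lists fuzzyjoin as an optional dependency); the helper closest_species() and the candidate list are hypothetical and exist only for this illustration.

```r
# Hypothetical helper, not part of the package: return the candidate name
# closest to `query`, or NA if no candidate is within `max_dist` edits.
closest_species <- function(query, candidates, max_dist = 1) {
  # utils::adist() computes generalized Levenshtein (edit) distances and
  # returns a matrix with one row per query string.
  d <- utils::adist(tolower(query), tolower(candidates))[1, ]
  best <- which.min(d)
  if (length(best) == 0 || d[best] > max_dist) {
    return(NA_character_)
  }
  candidates[best]
}

known <- c("Acacia karroo", "Acraea horta", "Protea cynaroides")

# A single dropped letter is one edit away, so it still matches.
hit <- closest_species("Acacia karoo", known)

# A different epithet is many edits away, so nothing is returned.
miss <- closest_species("Acacia nilotica", known)
```

With an edit-distance cutoff of 1, only single-character typos are tolerated; a larger value accepts more variation but raises the risk of matching a different species in the same genus.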

Value

A tibble (one row) with columns: species, optional metadata, and all trait columns found.
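
For a TRY-format table, a one-row result implies reshaping long trait records (one row per measurement) into one column per TraitName. The sketch below shows that reshaping in base R on a made-up table. The StdValue column name, the toy values, and the choice to average repeated measurements are assumptions for illustration; the function itself may select or aggregate values differently.

```r
# Toy long-format table; values are invented for illustration.
try_long <- data.frame(
  AccSpeciesName = c("Acacia karroo", "Acacia karroo",
                     "Acacia karroo", "Protea cynaroides"),
  TraitName      = c("Plant height", "Plant height",
                     "Seed mass", "Plant height"),
  StdValue       = c(4, 6, 50, 1.5)  # assumed name of the value column
)

# Keep the records for one species.
one_species <- try_long[try_long$AccSpeciesName == "Acacia karroo", ]

# Average repeated measurements per trait (an assumption, see above).
trait_means <- vapply(
  split(one_species$StdValue, one_species$TraitName),
  mean,
  numeric(1)
)

# One row: the species name plus one column per trait.
# check.names = FALSE keeps trait names that contain spaces.
wide <- as.data.frame(
  c(list(species = "Acacia karroo"), as.list(trait_means)),
  check.names = FALSE
)
wide
```

Keeping the original trait names as column names means columns must be accessed with backticks or double brackets, for example wide[["Plant height"]].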

Details

  • For TRY tables, TraitName is used for wide trait columns. For local tables, all columns except the species column are returned.

  • Fuzzy matching is used to allow for spelling or formatting mismatches.

  • Image-based color palette extraction uses simple k-means clustering; backgrounds can be removed using a color threshold.

  • Requires dplyr, purrr, and tibble. Optional, depending on the features used: fuzzyjoin, rvest, httr, stringr, jsonlite, magick, abind.

  • You can control which metadata are scraped for speed.
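
The palette step described above (threshold out background pixels, then cluster the rest with k-means) can be sketched in base R on synthetic pixel data. This is an illustration under stated assumptions, not the function's code: the make_pixels() helper, the pixel values, and the exact "green" rule (green channel exceeding both red and blue by more than green_delta) are hypothetical.

```r
set.seed(42)  # kmeans() uses random starting centers

# Hypothetical helper: n noisy pixels around a mean colour,
# one row per pixel, columns R, G, B on a 0-255 scale.
make_pixels <- function(n, rgb_mean) {
  noise <- matrix(round(rnorm(n * 3, sd = 3)), ncol = 3)
  px <- sweep(noise, 2, rgb_mean, "+")
  pmin(pmax(px, 0), 255)  # clamp to the valid channel range
}

px <- rbind(
  make_pixels(200, c(60, 150, 70)),   # green background
  make_pixels(100, c(230, 200, 40)),  # yellow flower
  make_pixels(100, c(120, 80, 50))    # brown bark
)
colnames(px) <- c("r", "g", "b")

# Assumed rule: a pixel is "green" when G exceeds both R and B
# by more than green_delta.
green_delta <- 20
is_green <- px[, "g"] - pmax(px[, "r"], px[, "b"]) > green_delta
foreground <- px[!is_green, , drop = FALSE]

# Cluster the remaining pixels; each cluster centre is one palette colour.
n_palette <- 2
km <- kmeans(foreground, centers = n_palette, nstart = 5)
palette <- rgb(km$centers[, "r"], km$centers[, "g"], km$centers[, "b"],
               maxColorValue = 255)
palette
```

Removing the background before clustering matters because k-means weights every pixel equally: a large uniform background would otherwise claim one or more of the n_palette slots.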

Examples

if (FALSE) { # \dontrun{
# Example using TRY table:
get_trait_data("Acacia karroo", use_try = TRUE, try_data = try_traits, trait_species_col = "SpeciesName")

# Example using local trait table:
get_trait_data("Acraea horta", local_trait_df = traits, local_species_col = "species")

# Scrape only metadata (no traits):
get_trait_data("Acacia karroo", use_try = FALSE)
} # }