Weighted averaging over different models based on different pseudo-absences datasets

Asked by Arnaud Boulenger on 07 February 2024 at 13:44

I'm working on species distribution modelling of the seagrass Posidonia oceanica in the Mediterranean Sea. I fitted several algorithms on ten randomly selected sets of pseudo-absences. For the MARS algorithm, I would like to build a weighted-average model from the models fitted on the ten pseudo-absence datasets (Posidonia.oceanica_PA1_RUN1_MARS...Posidonia.oceanica_PA10_RUN1_MARS).

How could I do that?

Here are the first steps of my code:

# Get data points
setwd("E:/Modélisation - Eric Goberville/PhD/Practical_work_2023/Results/Species_data")
points <- read.csv(file = "clean_Posidonia_oceanica.csv", header = TRUE)
points <- cbind(points[, 2:4], rep.int(1, nrow(points))) # Adds a column of 1s marking every record as a presence
colnames(points) <- c("Species", "Y", "X", "Response")
head(points)

# Get environmental variables
library(raster)
setwd("E:/Modélisation - Eric Goberville/PhD/Practical_work_2023/Results/Environment_data/Current")
current.envt <- stack(list.files(pattern = "\\.asc$")) # match only files ending in .asc
#("Current_BO_sstmin.asc","Current_BO2_lightbotmax_bdmax.asc","Current_BO2_lightbotmin_bdmax.asc","Current_BO2_salinitymax_bdmax.asc","Current_BO22_ph.asc") 
#setwd("C:/Users/ericg/OneDrive/Bureau/Practical_work_2023/Results/Environment_data/RCP26_2050")
#RCP26.envt <- rast(list.files(pattern = ".asc"))


# 1. Formatting the data --------------------------------------------------
# This package has functions for formatting the data specifically for this analysis. If you don't have true absence data, this package has several methods for producing background data, or pseudo-absences.
library(biomod2)
myRespName="Posidonia.oceanica"
bmData_1.5 <- BIOMOD_FormatingData(resp.var= points[,4],
                     expl.var= current.envt, # a matrix, data.frame, SpatialPointsDataFrame or RasterStack containing your explanatory variables that will be used to build your models
                     resp.xy = points[,3:2], # X and Y coordinates of resp.var
                     resp.name = myRespName,
                     PA.nb.rep = 10, # number of repetitions for pseudo-absence selection
                     PA.nb.absences = 1.5*nrow(points), # number of pseudo-absence selected for each repetition
                     PA.strategy = 'random', # strategy for selecting the Pseudo Absences
                     filter.raster = FALSE, # filtering when several points in the same raster cell
                     na.rm = TRUE)

bmData_1.5
library(tidyterra)
library(ggtext)
plot(bmData_1.5)

# Extract information about the presence and pseudo-absences datasets
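# Combine the coordinates, the pseudo-absence table (PA.table slot) and the response (data.species slot)
Records.model_1.5 <- data.frame(bmData_1.5@coord, bmData_1.5@PA.table, Presence = bmData_1.5@data.species)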
Presence.species_1.5 <- Records.model_1.5[which(Records.model_1.5$Presence == 1),]
dim(Presence.species_1.5)
PA.species_1.5 <- Records.model_1.5[which(is.na(Records.model_1.5$Presence)),]
dim(PA.species_1.5)

#setwd("E:/Modélisation - Eric Goberville/PhD/Practical_work_2023/Results/Species_data")
#write.table(Presence.species[,-ncol(Presence.species)], file ="Presence_model_Posidonia_oceanica.csv",row.names=FALSE, sep=",")

# 2. Defining Models Options using default options ------------------------

myBiomodOption <- BIOMOD_ModelingOptions()


# 3. Computing the models -------------------------------------------------

setwd("E:/Modélisation - Eric Goberville/PhD/Practical_work_2023/1.5_PA")
mySDMModel <- BIOMOD_Modeling(bm.format=bmData_1.5,
                              models= c('GLM', 'GBM', 'GAM', 'CTA', 'ANN', 'SRE', 'FDA', 'MARS', 'RF', 'MAXNET', 'XGBOOST'),
                              bm.options=myBiomodOption,
                              CV.nb.rep = 1, # number of cross-validation repetitions (evaluation runs)
                              CV.do.full.models = FALSE, # if TRUE, models are also calibrated and evaluated on the whole dataset
                              CV.perc = 0.7, # proportion of data used to calibrate the models; the rest is used for testing
                              var.import = 1, # number of permutations used to estimate variable importance
                              metric.eval = c('ROC','TSS','KAPPA'), # names of evaluation metrics
                              nb.cpu = 1, 
                              modeling.id = myRespName)


# When this step is over, have a look at some outputs and model evaluations
mySDMModel
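
To make the goal concrete, here is a minimal base R sketch of the weighted average being asked about. It does not use biomod2 and is not biomod2's own implementation. The prediction values, the TSS scores and the names `preds`, `tss`, `w` and `weighted_pred` are all made up for illustration. It assumes each model's weight is proportional to its evaluation score.

```r
# Toy example: weighted average of predictions from several pseudo-absence (PA) runs.
# All numbers are hypothetical; they are not biomod2 output.

# Rows = 4 sites, columns = predicted suitability (0-1) from 3 MARS models,
# each calibrated on a different PA dataset.
preds <- cbind(
  PA1 = c(0.90, 0.20, 0.60, 0.10),
  PA2 = c(0.80, 0.30, 0.50, 0.20),
  PA3 = c(0.70, 0.10, 0.70, 0.30)
)

# One evaluation score per model (assumed here to be TSS), used as its weight.
tss <- c(PA1 = 0.8, PA2 = 0.6, PA3 = 0.6)

# Normalise so the weights sum to 1 (weights proportional to the score).
w <- tss / sum(tss)

# Weighted mean per site: matrix product of the predictions and the weights.
weighted_pred <- as.vector(preds %*% w)
print(round(weighted_pred, 3))
```

With raster predictions the same arithmetic applies cell by cell. biomod2's BIOMOD_EnsembleModeling function also provides a weighted-mean ensemble. Whether it can group the models by algorithm across the PA datasets depends on the installed version, so that is worth checking in its help page.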
