Pairing Test and Control Plots by Euclidean Distance of a Vector in R


I am building a procedure to pair test and control plots objectively. The purpose is to regularly compare change within each test area to change within similar control areas over the course of a project's life. To do this, it is important that I start with control plots that are as similar to each test plot as possible. To judge potential match quality, I am using existing monitoring values from three time points (2013, 2018, and 2022). These values all share the same units. The goal is to choose the number of control plots (k) I want per test plot and then assign the k nearest control plots to each test plot by Euclidean distance across the three values, as accurately as possible and without replacement.

Some general rules:
1) The number of test plots will vary but will always be much smaller than the number of potential control plots.
2) The number of control plots per test plot will vary depending on .
3) Control plots cannot be shared between test plots.

Both the test and control data frames have this format:

plot mean.2013 mean.2018 mean.2022
G24 3396.375 4702.250 5616.500
H23 4018.909 5763.500 6521.682

I was initially pointed to the k nearest neighbor algorithm, but it seems like that may overshoot what I want to achieve. What is the best way to set this up?
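To make the requirement concrete, here is a minimal sketch of one possible approach in base R: build the test-by-control Euclidean distance matrix, then greedily take the smallest remaining distance until every test plot has k controls. The function name `match_controls`, the control plot IDs, and all control values are made up for illustration; only the two test rows come from the data shown above. This greedy pass respects the "no sharing" rule but is only a heuristic, and it is not guaranteed to minimise the total distance the way a proper assignment-problem solver would.

```r
# Test plots: the two rows shown in the question.
test <- data.frame(
  plot = c("G24", "H23"),
  mean.2013 = c(3396.375, 4018.909),
  mean.2018 = c(4702.250, 5763.500),
  mean.2022 = c(5616.500, 6521.682),
  stringsAsFactors = FALSE
)

# Control plots: hypothetical IDs and values, for illustration only.
control <- data.frame(
  plot = paste0("C", 1:6),
  mean.2013 = c(3400, 4000, 3500, 4100, 3300, 5000),
  mean.2018 = c(4700, 5800, 4600, 5700, 4800, 6000),
  mean.2022 = c(5600, 6500, 5700, 6600, 5500, 7000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: greedy k-to-1 matching without replacement.
match_controls <- function(test, control, k,
                           cols = c("mean.2013", "mean.2018", "mean.2022")) {
  nt <- nrow(test)
  nc <- nrow(control)
  stopifnot(k >= 1, nc >= k * nt)  # enough controls to go around

  # Euclidean distances: rows are test plots, columns are control plots.
  tm <- as.matrix(test[, cols, drop = FALSE])
  cm <- as.matrix(control[, cols, drop = FALSE])
  d <- as.matrix(dist(rbind(tm, cm)))[seq_len(nt), nt + seq_len(nc), drop = FALSE]

  n <- k * nt
  ti <- integer(n)   # index of the matched test plot
  cj <- integer(n)   # index of the matched control plot
  dd <- numeric(n)   # distance of the matched pair
  need <- rep(k, nt) # controls still required per test plot

  for (r in seq_len(n)) {
    ij <- arrayInd(which.min(d), dim(d))  # smallest remaining distance
    i <- ij[1, 1]
    j <- ij[1, 2]
    ti[r] <- i
    cj[r] <- j
    dd[r] <- d[i, j]
    d[, j] <- Inf                    # control j can no longer be used
    need[i] <- need[i] - 1
    if (need[i] == 0) d[i, ] <- Inf  # test plot i has all k controls
  }

  out <- data.frame(test = as.character(test$plot[ti]),
                    control = as.character(control$plot[cj]),
                    distance = dd,
                    stringsAsFactors = FALSE)
  out[order(out$test, out$distance), ]
}

pairs <- match_controls(test, control, k = 2)
print(pairs)
```

Changing `k` changes the number of controls per test plot. If the three years should not contribute equally, the columns could be rescaled before calling `dist()`, though that is a design choice the question leaves open.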
