How to make four combinations of points with the same ID in a data frame in R?


I am currently trying to make squares (polygons) from a data frame in R. To do that (according to this guide), I need a data frame with 4 paired points per square, given as lon-lat coordinates.

Using this example:

sample_df <- data.frame(id = c(1,2),
                        t = c('2020-01-01','2020-01-01'),
                        intensity = c(1.3,0.6),
                        x1 = c(113.75,114.00),
                        x2 = c(114.00,114.25),
                        y1 = c(8.75,8.75),
                        y2 = c(9.00,9.00))
id  t           intensity  x1      x2      y1    y2
1   2020-01-01  1.3        113.75  114.00  8.75  9.00
2   2020-01-01  0.6        114.00  114.25  8.75  9.00

What I would like to achieve is a data frame that keeps the id, t, and intensity columns and repeats each row four times, once for each corner: x1 paired with y1, x2 paired with y1, x2 paired with y2, and x1 paired with y2, stored as lon and lat columns.

The expected output would be a data frame looking something like this:

id  t           intensity  lon     lat
1   2020-01-01  1.3        113.75  8.75
1   2020-01-01  1.3        114.00  8.75
1   2020-01-01  1.3        114.00  9.00
1   2020-01-01  1.3        113.75  9.00
2   2020-01-01  0.6        114.00  8.75
2   2020-01-01  0.6        114.25  8.75
2   2020-01-01  0.6        114.25  9.00
2   2020-01-01  0.6        114.00  9.00

I am currently stuck, but I have been playing around with the mutate() function of the dplyr package and melt() from reshape2.

I would be greatly thankful for your inputs.


There are 2 best solutions below

thelatemail On BEST ANSWER

I believe this takes two reshapes/pivots:

library(tidyr)
library(dplyr)
sample_df %>%
    pivot_longer(c("x1","x2"), names_to=NULL, values_to="lon") %>%
    pivot_longer(c("y1","y2"), names_to=NULL, values_to="lat")

## A tibble: 8 × 5
#     id t          intensity   lon   lat
#  <dbl> <chr>          <dbl> <dbl> <dbl>
#1     1 2020-01-01       1.3  114.  8.75
#2     1 2020-01-01       1.3  114.  9   
#3     1 2020-01-01       1.3  114   8.75
#4     1 2020-01-01       1.3  114   9   
#5     2 2020-01-01       0.6  114   8.75
#6     2 2020-01-01       0.6  114   9   
#7     2 2020-01-01       0.6  114.  8.75
#8     2 2020-01-01       0.6  114.  9   

The lon variable is right. The printing method for tibbles rounds to three significant digits by default, however, so 113.75 and 114.25 both show as 114. while 114.00 shows as 114.
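One way to check that only the display is rounded, and not the stored values, is to look at the column directly or to print with more digits. The following is a minimal sketch, assuming the `pillar.sigfig` option of the pillar package (which controls tibble printing) and the `sample_df` from the question; `res` is just an illustrative name.

```r
library(tidyr)
library(dplyr)

sample_df <- data.frame(id = c(1, 2),
                        t = c('2020-01-01', '2020-01-01'),
                        intensity = c(1.3, 0.6),
                        x1 = c(113.75, 114.00),
                        x2 = c(114.00, 114.25),
                        y1 = c(8.75, 8.75),
                        y2 = c(9.00, 9.00))

res <- sample_df %>%
    pivot_longer(c("x1", "x2"), names_to = NULL, values_to = "lon") %>%
    pivot_longer(c("y1", "y2"), names_to = NULL, values_to = "lat")

# Inspect the stored values directly; they are not rounded
unique(res$lon)

# Option 1: print as a plain data frame, which does not use tibble formatting
as.data.frame(res)

# Option 2: ask tibbles to display more significant digits (the default is 3)
options(pillar.sigfig = 6)
res
```

Converting with as.data.frame() is probably the least intrusive choice, since changing `pillar.sigfig` affects every tibble printed later in the session.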

Baraliuh On

I would unnest a nested tibble (which is generally a great approach for building combinations), with one list element of corner coordinates per row:

library(tidyr)
x1 <- c(113.75, 114.00)
x2 <- c(114.00, 114.25)
y1 <- c(8.75, 8.75)
y2 <- c(9.00, 9.00)
tibble(
  id = c(1, 2),
  t = c('2020-01-01', '2020-01-01'),
  intensity = c(1.3, 0.6),
  # one list element per row; corners in the order
  # (x1, y1), (x2, y1), (x2, y2), (x1, y2)
  long = Map(function(a, b) c(a, b, b, a), x1, x2),
  lat = Map(function(a, b) c(a, a, b, b), y1, y2)
) %>% 
  unnest(c(long, lat))
#> # A tibble: 8 × 5
#>      id t          intensity  long   lat
#>   <dbl> <chr>          <dbl> <dbl> <dbl>
#> 1     1 2020-01-01       1.3  114.  8.75
#> 2     1 2020-01-01       1.3  114   8.75
#> 3     1 2020-01-01       1.3  114   9   
#> 4     1 2020-01-01       1.3  114.  9   
#> 5     2 2020-01-01       0.6  114   8.75
#> 6     2 2020-01-01       0.6  114.  8.75
#> 7     2 2020-01-01       0.6  114.  9   
#> 8     2 2020-01-01       0.6  114   9

Created on 2023-05-04 with reprex v2.0.2
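If the corner order matters, as it usually does when the points are later joined into a polygon outline, note that the double pivot returns the corners as (x1, y1), (x1, y2), (x2, y1), (x2, y2), while the question lists them going around the square. A dependency-free way to get the question's order is to list the corners explicitly in base R. This is only a sketch under the assumption that the input has the columns shown in the question; `corners` is an illustrative name, not something from either answer.

```r
sample_df <- data.frame(id = c(1, 2),
                        t = c('2020-01-01', '2020-01-01'),
                        intensity = c(1.3, 0.6),
                        x1 = c(113.75, 114.00),
                        x2 = c(114.00, 114.25),
                        y1 = c(8.75, 8.75),
                        y2 = c(9.00, 9.00))

# Repeat every input row four times, once per corner
corners <- sample_df[rep(seq_len(nrow(sample_df)), each = 4),
                     c("id", "t", "intensity")]

# Pick the coordinates in the order (x1, y1), (x2, y1), (x2, y2), (x1, y2).
# t() puts one input row per matrix column, so as.vector() reads the
# four corners of row 1 first, then the four corners of row 2, and so on.
corners$lon <- as.vector(t(as.matrix(sample_df[, c("x1", "x2", "x2", "x1")])))
corners$lat <- as.vector(t(as.matrix(sample_df[, c("y1", "y1", "y2", "y2")])))

rownames(corners) <- NULL
corners
```

Printing a plain data frame also sidesteps the three-significant-digit display of tibbles, so values such as 113.75 and 114.25 are shown in full.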