The data is Citi Bike NYC trip data from January 2019 to December 2019, which can be viewed here: https://s3.amazonaws.com/tripdata/index.html

You do not need to download the entire dataset; a single month is enough to reproduce the problem.

The following is a sample of some of the columns of the data frame (`ridedata_clean`):

| start.station.latitude | start.station.longitude | end.station.latitude | end.station.longitude | usertype |
|---|---|---|---|---|
| 40.77897 | -73.97375 | 40.78822 | -73.97042 | Subscriber |
| 40.75187 | -73.97771 | 40.74780 | -73.97344 | Customer |

The following is the code:

    library(dplyr)  # %>%, filter(), group_by(), summarise()
    library(ggmap)  # get_stamenmap(), ggmap(); also attaches ggplot2

    # Count trips per start/end station pair and user type,
    # dropping trips whose start and end coordinates match,
    # then keep only routes with more than 250 trips.
    coordinates_table <- ridedata_clean %>%
      filter(start.station.latitude != end.station.latitude &
               start.station.longitude != end.station.longitude) %>%
      group_by(start.station.latitude, start.station.longitude,
               end.station.latitude, end.station.longitude, usertype) %>%
      summarise(total = n(), .groups = "drop") %>%
      filter(total > 250)

    Subscriber <- coordinates_table %>% filter(usertype == "Subscriber")
    Customer <- coordinates_table %>% filter(usertype == "Customer")

    # Bounding box and base map
    nyc_bb <- c(left = -74.04, bottom = 40.93, right = -73.78, top = 40.78)
    nyc_stamen <- get_stamenmap(bbox = nyc_bb, zoom = 12, maptype = "toner")

    # Draw one curved arrow per popular Customer route on top of the map
    ggmap(nyc_stamen, darken = c(0.8, "white")) +
      geom_curve(Customer,
                 mapping = aes(x = start.station.longitude, y = start.station.latitude,
                               xend = end.station.longitude, yend = end.station.latitude,
                               alpha = total, color = usertype),
                 size = 0.5, curvature = 0.2,
                 arrow = arrow(length = unit(0.2, "cm"), ends = "first", type = "closed")) +
      coord_cartesian() +
      labs(title = "most popular routes by Customers",
           x = NULL, y = NULL,
           caption = "Data by Citi Bikes and Map by ggmap ") +
      theme(legend.position = "none")


Running the code above produces the following message and error:

    Coordinate system already present. Adding new coordinate system, which will replace the existing one.
    Error in grid.Call.graphics(C_raster, x$raster, x$x, x$y, x$width, x$height, : Empty raster
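
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: in `nyc_bb` the `bottom` latitude (40.93) is larger than the `top` latitude (40.78), so the box has no positive height, which could plausibly lead to an empty raster. The sketch below uses only base R to test the orientation of the box. `bbox_is_valid` is a hypothetical helper written for this post and is not part of ggmap.

```r
# Bounding box exactly as defined in the question.
nyc_bb <- c(left = -74.04, bottom = 40.93, right = -73.78, top = 40.78)

# Hypothetical helper (not a ggmap function): TRUE only when the box is
# oriented correctly, i.e. left < right and bottom < top.
bbox_is_valid <- function(bb) {
  unname(bb["left"] < bb["right"] && bb["bottom"] < bb["top"])
}

bbox_is_valid(nyc_bb)  # FALSE, because bottom (40.93) is above top (40.78)
```

If the check returns `FALSE`, the latitude values probably need correcting before the map is requested. Simply swapping them may not be enough: the sample rows above have latitudes of roughly 40.75 to 40.79, so the intended `bottom` value may have been something else entirely. The first line of the output ("Coordinate system already present...") is only a message triggered by adding `coord_cartesian()` on top of the coordinate system that `ggmap()` already sets, so it is probably not the cause of the failure.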