Hello, I'm new to the fascinating world of R. I have not been able to skip URLs that do not exist. How can I handle them so that they are not reported as an error? Thanks for your help.
title: "error"
author: "FJSG"
date: "27/6/2020"
output: html_document
knitr::opts_chunk$set(echo = TRUE)
library(xml2)
library(rvest)
library(tidyverse)
library(lubridate)
zora_core <- read_html("https://zora.medium.com/the-zora-music-canon-5a29296c6112")
Los_100 <- data.frame(
  album = html_nodes(zora_core, "h1:not(#96c9)") %>%
    html_text() %>%
    str_trim(side = "both"),
  interprete = html_nodes(zora_core, "strong em , p#73e0 strong") %>%
    html_text() %>%
    str_remove_all("^by") %>%
    str_extract("[a-zA-Z].+(?=[(])") %>%
    str_trim(side = "both"),
  año = html_nodes(zora_core, "strong em , p#73e0 strong") %>%
    html_text() %>%
    str_extract("([[:digit:]]){4}"),
  liga = paste0(
    "https://en.wikipedia.org/wiki/",
    html_nodes(zora_core, "strong em , p#73e0 strong") %>%
      html_text() %>%
      str_remove_all("^by") %>%
      str_extract("[a-zA-Z].+(?=[(])") %>%
      str_trim(side = "both") %>%
      str_replace_all(" ", "_")
  )
)
carga <- function(url) {
  perfil_raw <- read_html(url)
  data.frame(
    interprete = html_node(perfil_raw, "h1#firstHeading") %>%
      html_text() %>%
      str_trim(side = "both")
  )
}
lista <- Los_100$liga[1:16]
# The URL at position 16 does not exist. How can I avoid the error?
datos_personales <- map_df(lista, carga)
It's useful to learn about error handling in R, but when working with HTTP requests it becomes essential.
In your case, it is best to wrap carga in a tryCatch. This runs an expression that you pass as the first argument, and if an error is thrown, it is caught and passed to the second argument of tryCatch, which is a function.

If an error is thrown, we need to return a data frame with a single column called interprete so that map_df can bind it together with the other results.

Apart from error handling, I think your code is very good for someone just beginning in R. It achieves a lot in a few lines of code and is perfectly readable. Good work!
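For reference, the tryCatch approach described above might look roughly like the sketch below. The code that originally accompanied this answer is not shown here, so this is only one possible version: the wrapper name carga_segura is hypothetical, and returning NA for a page that cannot be read is an assumption about what the fallback value should be.

```r
library(rvest)    # read_html(), html_node(), html_text()
library(stringr)  # str_trim()
library(purrr)    # map_df()

# carga() as in the question: read a page and return its first heading.
carga <- function(url) {
  perfil_raw <- read_html(url)
  data.frame(
    interprete = html_node(perfil_raw, "h1#firstHeading") %>%
      html_text() %>%
      str_trim(side = "both")
  )
}

# Hypothetical wrapper: same result as carga() when the page loads,
# otherwise a one-row data frame with the same column name.
carga_segura <- function(url) {
  tryCatch(
    carga(url),
    # Assumption: a page that cannot be read is recorded as a missing value.
    error = function(e) data.frame(interprete = NA_character_)
  )
}

# Usage with the vector from the question:
# datos_personales <- map_df(lista, carga_segura)
```

Because the fallback has the same column name and type as a successful result, map_df can stack all rows without complaint, and the missing pages show up as NA in interprete instead of stopping the loop.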