How to extract the string between another string and a number in R?


I have a vector:

a <- c("{.package:base73","for.package:base476",":.package:base9","print.package:graphics834")

and I need to extract the string between package: and the first number.

So the output should be: c("base", "base", "base", "graphics")

I tried this, but it doesn't work as I want:

b <- "{.package:base71"
gsub(".*package:(.+)[1-9]+", "\\1", b)

Edit: I want to use base R only.


Answer by Derf (best answer)

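A minimal change to the OP's code: delete everything up to and including package:, and delete all digits. No capture group is needed, so the replacement is the empty string:

gsub(".*package:|[0-9]+", "", a)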

#[1] "base"     "base"     "base"     "graphics"
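
For context, the original attempt fails because .+ is greedy: it swallows digits as well and leaves only the last one for [1-9]+. A minimal sketch of that behaviour, together with a capture-group variant that stops at the first digit (the variable names attempt and fixed are illustrative only, not from the question):

```r
a <- c("{.package:base73", "for.package:base476", ":.package:base9", "print.package:graphics834")
b <- "{.package:base71"

# Greedy .+ backtracks only as far as needed, so the capture keeps the "7"
attempt <- gsub(".*package:(.+)[1-9]+", "\\1", b)
print(attempt)

# Capturing non-digits only ends the group at the first digit
fixed <- sub(".*package:([^0-9]+)[0-9]+.*", "\\1", a)
print(fixed)
```

Note that sub returns an element unchanged when the pattern does not match it, so this variant assumes every element contains package: followed by a name and a number.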

Answer by knitz3

Use a capture group ( ) that holds at least one character (.+), make it non-greedy with ?, and require a digit (\d) directly after it. The group argument of str_extract needs stringr 1.5.0 or later:

library(stringr)
str_extract(a, "package:(.+?)\\d", group = 1)
#[1] "base"     "base"     "base"     "graphics"
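
Since the question asks for base R only, the same lazy capture-group idea can probably be expressed without stringr. The sketch below assumes PCRE matching via perl = TRUE; the variable name lazy is illustrative:

```r
a <- c("{.package:base73", "for.package:base476", ":.package:base9", "print.package:graphics834")

# perl = TRUE selects PCRE; (.+?) is lazy, so the group ends before the first digit
lazy <- sub(".*package:(.+?)\\d.*", "\\1", a, perl = TRUE)
print(lazy)
```

One difference to keep in mind: str_extract returns NA for elements that do not match, whereas sub returns them unchanged.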

Answer by Wiktor Stribiżew

You can directly extract the non-digit characters after package: in base R using regmatches/regexec:

a <- c("{.package:base73","for.package:base476",":.package:base9","print.package:graphics834")
unlist(regmatches(a, regexec("package:\\K\\D+", a, perl=TRUE)))


Details:

  • package: - literal text
  • \K - match reset operator: drops the text matched so far from the returned match
  • \D+ - matches one or more non-digit chars.
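
The same pattern should also work with regexpr, which reports only the first match per element and so avoids the unlist step. This is a sketch, not part of the original answer; the extra "no match here" element and the names x, m and first are added here purely to show how non-matching input behaves:

```r
a <- c("{.package:base73", "for.package:base476", ":.package:base9", "print.package:graphics834")
x <- c(a, "no match here")  # illustrative element without "package:"

# regexpr records -1 for elements where the pattern does not match
m <- regexpr("package:\\K\\D+", x, perl = TRUE)

# regmatches keeps only the matched substrings and drops non-matching elements
first <- regmatches(x, m)
print(first)
```

Because non-matching elements are dropped, the result can be shorter than the input; check m == -1 first if positions need to line up with the original vector.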