How to extract a substring using a regex in R


I have a list of strings. Every entry looks like ENSG00001234.2, and I only need the part between "ENSG" and ".".

The result should be: 00001234

How can I use a regex for this in R?

Thank you!


2 Answers


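We can use sub, capturing the digits between "ENSG" and the dot and replacing the whole match with that capture group: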

sub("ENSG([0-9]+)\\..*", "\\1", str1)
#[1] "00001234"

Or use str_extract from stringr, with a lookbehind so that "ENSG" is required but not returned:

library(stringr)
str_extract(str1, "(?<=ENSG)[0-9]+")
#[1] "00001234"

Data:

str1 <- "ENSG00001234.2"
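Since the question mentions a list of strings, it may help to see that sub is vectorised and works on a whole character vector in one call. The following is only a sketch: the vector name ids and the second and third IDs are made up for illustration, and the ^ and $ anchors are an optional addition that makes the pattern match the entire string.

```r
# Hypothetical example IDs; only the first one comes from the question
ids <- c("ENSG00001234.2", "ENSG00005678.11", "ENSG00000042.1")

# sub() is applied to every element of the vector;
# "\\1" keeps only the digits captured between "ENSG" and the first dot
digits <- sub("^ENSG([0-9]+)\\..*$", "\\1", ids)
digits
#[1] "00001234" "00005678" "00000042"
```

Note that sub returns an entry unchanged when the pattern does not match it, so malformed IDs would pass through as they are.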


Without a regex you could use substr, provided every entry has the same fixed-width layout (digits in positions 5 to 12):

x <- c("ENSG00001234.2")
substr(x, 5, 12)
# [1] "00001234"
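If the number of digits can vary, the fixed positions 5 and 12 no longer hold. One possible variation, still without a regex, is to locate the first dot with regexpr(..., fixed = TRUE), which does literal matching, and cut just before it. This is a sketch under the assumption that every entry starts with the four characters "ENSG" and contains a dot; the second ID below is hypothetical.

```r
# The second ID is hypothetical and has one more digit than the first
x <- c("ENSG00001234.2", "ENSG000012345.10")

# Position of the first literal "." in each string (fixed = TRUE: no regex)
dot <- as.integer(regexpr(".", x, fixed = TRUE))

# Start after "ENSG" (position 5) and stop one character before the dot
res <- substr(x, 5, dot - 1)
res
# [1] "00001234"  "000012345"
```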
