Extract string before “|” [duplicate]

This question already has an answer here:

Remove part of string after “.”

3 answers

I have a data set wherein a column looks like this:

ABC|DEF|GHI,  

ABCD|EFG|HIJK,  

ABCDE|FGHI|JKL,  

DEF|GHIJ|KLM,  

GHI|JKLM|NO|PQRS,  

BCDE|FGHI|JKL

.... and so on

I need to extract the characters that appear before the first | symbol.

In Excel, we would use a combination of MID and SEARCH, or LEFT and SEARCH. R has substr().

The syntax is substr(x, <start>, <stop>).

In my case, start will always be 1. For stop, we need to search by |. How can we achieve this? Are there alternate ways to do this?

edited Aug 28 '16 at 18:29

lmo

32.1k93651

asked Jul 10 '16 at 12:20

Shounak Chakraborty

11114

marked as duplicate by akrun r
Users with the r badge can single-handedly close r questions as duplicates and reopen them as needed.

Jul 10 '16 at 22:14

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

?regexpr returns the index of the first match that can be used as your "stop" argument -- regexpr("|", x, fixed = TRUE) - 1

– alexis_laz
Jul 10 '16 at 12:42
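
The comment above can be sketched in base R as follows. This is a minimal illustration, not the commenter's own code: the vector x, the names pos, stop_at and first_field, and the extra "NOPIPE" element are assumptions added to show what happens when a string contains no "|" (regexpr returns -1 in that case, so the whole string is kept here).

```r
# Sample input; "NOPIPE" is a made-up element with no "|" in it.
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS", "NOPIPE")

# Position of the first literal "|" in each string (-1 when absent).
# as.integer() drops the match attributes that regexpr() attaches.
pos <- as.integer(regexpr("|", x, fixed = TRUE))

# Stop one character before the "|"; keep the full string if there is none.
stop_at <- ifelse(pos == -1L, nchar(x), pos - 1L)

first_field <- substr(x, 1, stop_at)
print(first_field)
```

Using fixed = TRUE treats "|" as a literal character, so no regular-expression escaping is needed.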

3 Answers

Another option is the word function from the stringr package:

library(stringr)

word(df1$V1, 1, sep = "\\|")

Data

df1 <- read.table(text = "ABC|DEF|GHI,  

ABCD|EFG|HIJK,  

ABCDE|FGHI|JKL,  

DEF|GHIJ|KLM,  

GHI|JKLM|NO|PQRS,  

BCDE|FGHI|JKL")

answered Jul 10 '16 at 18:11

user2100721

2,93911426

I especially like this package's ability to get, for example, the first two "words".

– Nova
Sep 24 '18 at 13:48


We can use sub

sub("\\|.*", "", str1)

#[1] "ABC"

Or with strsplit

strsplit(str1, "[|]")[[1]][1]

#[1] "ABC"

Update

If we use the data from @hrbrmstr

sub("\\|.*", "", df$V1)

#[1] "ABC"   "ABCD"  "ABCDE" "DEF"   "GHI"   "BCDE"

These are all base R methods. No external packages used.

data

str1 <- "ABC|DEF|GHI ABCD|EFG|HIJK ABCDE|FGHI|JKL DEF|GHIJ|KLM GHI|JKLM|NO|PQRS BCDE|FGHI|JKL"
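
The strsplit snippet above picks the first piece of a single string. For a whole column, one possible base R sketch is to split every element and take the first piece of each with vapply. The vector x and the names parts and first_field are assumptions made for illustration.

```r
# Sample input with one string per element (assumed for illustration).
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS")

# Split each string on a literal "|"; the result is a list of character vectors.
parts <- strsplit(x, "|", fixed = TRUE)

# Take the first piece of every split, guaranteeing a character result.
first_field <- vapply(parts, `[`, character(1), 1)
print(first_field)
```

vapply is used instead of sapply so that the return type is checked rather than guessed.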

edited Jul 10 '16 at 20:19

answered Jul 10 '16 at 12:21

akrun

421k13209284


With stringi:

library(stringi)



df <- read.table(text="ABC|DEF|GHI,1

ABCD|EFG|HIJK,2

ABCDE|FGHI|JKL,3  

DEF|GHIJ|KLM,4

GHI|JKLM|NO|PQRS,5

BCDE|FGHI|JKL,6", sep=",", header=FALSE, stringsAsFactors=FALSE)



stri_match_first_regex(df$V1, "(.*?)\\|")[,2]

## [1] "ABC"   "ABCD"  "ABCDE" "DEF"   "GHI"   "BCDE"
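
For comparison, a similar match-based extraction can be sketched in base R with regexpr and regmatches. This is an assumed alternative, not part of the answer above: it matches the leading run of characters that are not "|", so a string without any "|" is returned unchanged. The vector x and the name first_field are illustrative.

```r
# Sample input; "NOPIPE" is a made-up element with no "|" in it.
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS", "NOPIPE")

# "^[^|]*" matches everything from the start up to, but not including,
# the first "|". Inside a character class "|" needs no escaping.
first_field <- regmatches(x, regexpr("^[^|]*", x))
print(first_field)
```

Because the pattern can match an empty string, every element produces a match, so the result has the same length as x.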

answered Jul 10 '16 at 14:43

hrbrmstr

62.1k694155
