Extract string before “|” [duplicate]





This question already has an answer here:




  • Remove part of string after “.”

    3 answers




I have a data set wherein a column looks like this:



ABC|DEF|GHI,  
ABCD|EFG|HIJK,
ABCDE|FGHI|JKL,
DEF|GHIJ|KLM,
GHI|JKLM|NO|PQRS,
BCDE|FGHI|JKL


.... and so on



I need to extract the characters that appear before the first | symbol.



In Excel, we would combine MID with SEARCH, or LEFT with SEARCH. R has substr().

The syntax is substr(x, <start>, <stop>).

In my case, start will always be 1. For stop, we need to find the position of the first |. How can we achieve this? Are there alternative ways to do this?










marked as duplicate by akrun r
Jul 10 '16 at 22:14


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.



















  • See ?regexpr: it returns the index of the first match, which you can use to compute your "stop" argument: regexpr("|", x, fixed = TRUE) - 1

    – alexis_laz
    Jul 10 '16 at 12:42
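
The suggestion in this comment can be sketched in base R as follows. This is a minimal sketch, not a definitive implementation: the helper name before_first_pipe and the sample vector x are made up for illustration, and returning the whole string when no | is present is an assumption about the desired behaviour.

```r
# Sample values modelled on the column in the question (illustrative only).
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS", "NOPIPE")

# Hypothetical helper: text before the first "|" in each element.
before_first_pipe <- function(x) {
  # fixed = TRUE treats "|" literally instead of as regex alternation.
  pos <- as.integer(regexpr("|", x, fixed = TRUE))
  # regexpr() returns -1 when there is no match; keep the whole string then.
  ifelse(pos > 0, substr(x, 1, pos - 1), x)
}

before_first_pipe(x)
```

Without the ifelse() guard, an element with no | would give a negative stop and substr() would return an empty string.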


















Tags: r, extract, substr






edited Aug 28 '16 at 18:29 by lmo

asked Jul 10 '16 at 12:20 by Shounak Chakraborty
















3 Answers














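Another option is the word function from the stringr package:

library(stringr)
# sep is interpreted as a regular expression, so the pipe must be escaped
word(df1$V1, 1, sep = "\\|")

Data

df1 <- read.table(text = "ABC|DEF|GHI,
ABCD|EFG|HIJK,
ABCDE|FGHI|JKL,
DEF|GHIJ|KLM,
GHI|JKLM|NO|PQRS,
BCDE|FGHI|JKL", stringsAsFactors = FALSE)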





  • I especially like this package's ability to get, for example, the first two "words".

    – Nova
    Sep 24 '18 at 13:48



















We can use sub to remove everything from the first | onward:

sub("\\|.*", "", str1)
#[1] "ABC"

Or with strsplit, taking the first piece:

strsplit(str1, "[|]")[[1]][1]
#[1] "ABC"

Update

If we use the data from @hrbrmstr:

sub("\\|.*", "", df$V1)
#[1] "ABC" "ABCD" "ABCDE" "DEF" "GHI" "BCDE"

These are all base R methods; no external packages are needed.

data

str1 <- "ABC|DEF|GHI ABCD|EFG|HIJK ABCDE|FGHI|JKL DEF|GHIJ|KLM GHI|JKLM|NO|PQRS BCDE|FGHI|JKL"
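
Note that str1 is a single string, so the strsplit call in this answer returns only the first piece of that one string. For a whole column, one possible base R sketch takes the first piece of every element. The vector x and the helper name first_piece are hypothetical and only serve as illustration.

```r
# Illustrative values in the same shape as the question's column.
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS")

# Hypothetical helper: first "|"-separated piece of each element.
first_piece <- function(x) {
  # strsplit() returns a list with one character vector per element of x.
  parts <- strsplit(x, "|", fixed = TRUE)
  # Take the first piece of each; vapply() guarantees a character vector.
  vapply(parts, function(p) p[1], character(1))
}

first_piece(x)
```

Using fixed = TRUE avoids having to escape the pipe at all, at the cost of giving up regular-expression features.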



















With stringi:

library(stringi)

df <- read.table(text="ABC|DEF|GHI,1
ABCD|EFG|HIJK,2
ABCDE|FGHI|JKL,3
DEF|GHIJ|KLM,4
GHI|JKLM|NO|PQRS,5
BCDE|FGHI|JKL,6", sep=",", header=FALSE, stringsAsFactors=FALSE)

# lazily capture everything before the first |; column 2 holds the capture group
stri_match_first_regex(df$V1, "(.*?)\\|")[,2]
## [1] "ABC" "ABCD" "ABCDE" "DEF" "GHI" "BCDE"
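
The same match-and-extract idea can also be written in base R. The sketch below is an alternative to the stringi call, not a reproduction of it: it uses the pattern ^[^|]* (a run of non-pipe characters at the start of the string), and the vector x is made up for illustration.

```r
# Illustrative values in the same shape as the question's column.
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "GHI|JKLM|NO|PQRS")

# regexpr() locates the leading run of non-"|" characters in each element,
# and regmatches() extracts the matched text.
m <- regexpr("^[^|]*", x)
first_field <- regmatches(x, m)

first_field
```

Because this pattern does not require a | to be present, a string without one is returned whole.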





stringr answer: answered Jul 10 '16 at 18:11 by user2100721













sub/strsplit answer: answered Jul 10 '16 at 12:21 by akrun, edited Jul 10 '16 at 20:19























stringi answer: answered Jul 10 '16 at 14:43 by hrbrmstr














