R - splitting a column of varied string length in a data frame into multiple columns of just one character



























I have a data frame like this:



Name  S1  S2 S3 Symbol
n_12 2.3 6.1  0 A
n_13 3.4 3.7  0 ACM
n_14 1.3 1.0  0 BN
n_23 2.0 4.1  0 NOPXY


And I am looking to split the last column, Symbol, into multiple columns, each with one character or nothing.



Name  S1  S2 S3 Sy1 Sy2 Sy3 Sy4 Sy5
n_12 2.3 6.1  0 A
n_13 3.4 3.7  0 A   C   M
n_14 1.3 1.0  0 B   N
n_23 2.0 4.1  0 N   O   P   X   Y


Thank you for any and all help with this.










      r split






edited Oct 16 '18 at 23:16 by divibisan










asked Oct 16 '18 at 22:56 by learning5023
























          3 Answers
One way to do this is with tidyr::separate, which splits a single string column into multiple columns of substrings.



df
  Name  S1  S2 S3 Symbol
1 n_12 2.3 6.1  0      A
2 n_13 3.4 3.7  0    ACM
3 n_14 1.3 1.0  0     BN
4 n_23 2.0 4.1  0  NOPXY


The sep= argument for separate accepts either a regex or a numeric vector of positions in the string to split at. Since we want to split after every character, we give it a numeric sequence from 1 to one less than the length of the longest string (we don't need to split after the last character). The length of the longest string is calculated with max(nchar(.$Symbol)). Thanks to Rich Scriven for pointing out that nchar is vectorized and so doesn't need to be called with sapply.

We then make a character vector with the names of the columns to split Symbol into. In your case, we can just paste 'Sy' onto a sequence from 1 to the length of the longest string to get c('Sy1', 'Sy2', ...).



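library(magrittr)  # provides the %>% pipe

df %>%
  tidyr::separate(Symbol,
                  # split after every character except the last
                  sep = seq_len(max(nchar(.$Symbol)) - 1),
                  # one new column per character of the longest string
                  into = paste0('Sy', seq_len(max(nchar(.$Symbol)))))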

  Name  S1  S2 S3 Sy1 Sy2 Sy3 Sy4 Sy5
1 n_12 2.3 6.1  0   A
2 n_13 3.4 3.7  0   A   C   M
3 n_14 1.3 1.0  0   B   N
4 n_23 2.0 4.1  0   N   O   P   X   Y
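As a quick check of the two arguments passed to separate above, the following base R sketch evaluates them for the example Symbol values. The variable names (symbols, longest, split_positions, new_names) are illustrative and not part of the original answer.

```r
# Example Symbol values from the question
symbols <- c("A", "ACM", "BN", "NOPXY")

# Length of the longest string
longest <- max(nchar(symbols))

# Positions to split after: every character except the last
split_positions <- seq_len(longest - 1)

# One column name per character of the longest string
new_names <- paste0("Sy", seq_len(longest))

print(split_positions)
print(new_names)
```

For this data, split_positions runs from 1 to 4 and new_names from Sy1 to Sy5, which is what sep and into receive in the call above.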




              If you get the following error:



              Error in nchar(.$Symbol) : 'nchar()' requires a character vector


then df$Symbol is probably a factor rather than a character vector (the default when creating or loading a data.frame in R versions before 4.0.0).

You can either pass stringsAsFactors = FALSE to read.table or data.frame to keep the Symbol variable from being converted to a factor, or convert it back to character.

Tidyverse option (which can be inserted into the pipe just before the call to tidyr::separate):



              df <- df %>%
              dplyr::mutate(Symbol = as.character(Symbol))


              or with base R:



              df$Symbol <- as.character(df$Symbol)
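The following sketch reproduces the situation on a toy data frame, assuming the column ended up as a factor (forced here with stringsAsFactors = TRUE). The toy data and the msg variable are made up for illustration.

```r
# Toy data frame whose Symbol column is deliberately created as a factor
toy <- data.frame(Symbol = c("A", "ACM"), stringsAsFactors = TRUE)
print(is.factor(toy$Symbol))

# nchar() refuses factors; capture the error message instead of stopping
msg <- tryCatch(nchar(toy$Symbol), error = function(e) conditionMessage(e))
print(msg)

# After converting to character, nchar() works as expected
toy$Symbol <- as.character(toy$Symbol)
print(nchar(toy$Symbol))
```

Checking is.factor(df$Symbol) before calling separate is a cheap way to tell whether the conversion is needed.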






edited Oct 17 '18 at 16:02

answered Oct 16 '18 at 23:15 by divibisan













• That's because Symbol is of type factor, not character. You should use stringsAsFactors = FALSE when loading/creating the data.frame, or convert it with dplyr::mutate(Symbol = as.character(Symbol)) or df$Symbol <- as.character(df$Symbol)

                – divibisan
                Oct 17 '18 at 15:58













              • Ah, I see- that works - thank you so much!

                – learning5023
                Oct 17 '18 at 16:03



















                  Here's a base R version using strcapture:



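ns <- max(nchar(dat$Symbol))  # length of the longest Symbol
cbind(
  dat,
  strcapture(
    # one capture group per character position
    paste(rep("(.)", ns), collapse = ""),
    # pad shorter strings with spaces so the pattern always matches
    format(dat$Symbol, width = ns),
    # prototype: ns character columns named Sy1, Sy2, ...
    proto = setNames(rep(list(""), ns), paste0("Sy", 1:ns))
  )
)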
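To make the pieces of that call easier to inspect, here is a sketch that builds them one at a time on a stand-in dat containing only the Symbol column. The names pattern, padded and pieces are illustrative, and the trimws step is an optional addition that is not in the answer.

```r
# Stand-in for the question's data, keeping Symbol as character
dat <- data.frame(Symbol = c("A", "ACM", "BN", "NOPXY"),
                  stringsAsFactors = FALSE)
ns <- max(nchar(dat$Symbol))

# One capture group per character position
pattern <- paste(rep("(.)", ns), collapse = "")

# Shorter strings are right-padded with spaces to a common width
padded <- format(dat$Symbol, width = ns)

# Same strcapture call as above, kept separate so the result can be inspected
pieces <- strcapture(pattern, padded,
                     proto = setNames(rep(list(""), ns), paste0("Sy", 1:ns)))

# The padding shows up as " " cells; trimws() turns those into ""
pieces[] <- lapply(pieces, trimws)
print(pieces)
```

Note that trimws would also blank out a symbol that is itself a space, so this step only makes sense when the symbols are never whitespace.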


A late base R addition using substring, which is vectorized over the input strings as well as the start and end positions of each substring:



dat[paste0("Sy", seq(ns))] <- matrix(
  substring(rep(dat$Symbol, each = ns), seq(ns), seq(ns)),
  ncol = ns, byrow = TRUE
)

#   Name  S1  S2 S3 Symbol Sy1 Sy2 Sy3 Sy4 Sy5
# 1 n_12 2.3 6.1  0      A   A
# 2 n_13 3.4 3.7  0    ACM   A   C   M
# 3 n_14 1.3 1.0  0     BN   B   N
# 4 n_23 2.0 4.1  0  NOPXY   N   O   P   X   Y
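The vectorization that the substring version relies on can be seen in isolation. This is a small sketch on the example values; symbols and chars are illustrative names.

```r
# A single string with vectors of start and end positions:
# each (start, end) pair yields one substring
print(substring("NOPXY", 1:5, 1:5))

# Positions past the end of a shorter string give ""
print(substring("BN", 1:5, 1:5))

# Repeating each string ns times and filling by row gives one row per string
symbols <- c("A", "ACM", "BN", "NOPXY")
ns <- max(nchar(symbols))
chars <- matrix(substring(rep(symbols, each = ns), seq_len(ns), seq_len(ns)),
                ncol = ns, byrow = TRUE)
print(chars)
```

Because missing positions come back as empty strings rather than NA, no padding or NA handling is needed with this approach.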






edited Jan 2 at 1:11

answered Oct 16 '18 at 23:42 by thelatemail























Here's a base R approach using brute force:



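# Split each Symbol into a vector of single characters
string <- strsplit(df$Symbol, "")
ind <- max(lengths(string))  # length of the longest Symbol
out <- data.frame(df, do.call(rbind, lapply(string, function(x) {
  if (length(x) != ind) {
    # indexing past the end of x pads the row with NA
    c(x[1:length(x)], x[(length(x) + 1):ind])
  } else {
    x
  }
})))
# Rename the automatically generated X1, X2, ... columns to Sy1, Sy2, ...
names(out) <- sub("^X(\\d+)$", "Sy\\1", names(out))
print(out, na.print = "")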

  Name  S1  S2 S3 Symbol Sy1 Sy2 Sy3 Sy4 Sy5
1 n_12 2.3 6.1  0      A   A
2 n_13 3.4 3.7  0    ACM   A   C   M
3 n_14 1.3 1.0  0     BN   B   N
4 n_23 2.0 4.1  0  NOPXY   N   O   P   X   Y
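The NA padding step can also be written more compactly with the `length<-` replacement function. This is an alternative sketch, not the answer's own code; symbols, chars, width and padded are illustrative names.

```r
# Example Symbol values from the question
symbols <- c("A", "ACM", "BN", "NOPXY")

# Split into single characters and find the longest result
chars <- strsplit(symbols, "")
width <- max(lengths(chars))

# `length<-` extends each vector to `width`, filling with NA;
# sapply() returns one column per string, so transpose to one row per string
padded <- t(sapply(chars, `length<-`, width))
colnames(padded) <- paste0("Sy", seq_len(width))
print(padded, na.print = "")
```

The resulting character matrix could then be attached to the data frame with cbind or data.frame, as in the code above.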






answered Oct 16 '18 at 23:56 by Jilber Urbina





























