Split regex to extract Strings of contiguous characters












Is there a regex that would work with String.split() to break a String into runs of contiguous identical characters, i.e. split wherever the next character is different from the previous character?



Here's the test case:



    String regex = "your answer here";
    String[] parts = "aaabbcddeee".split(regex);
    System.out.println(Arrays.toString(parts));


Expected output:



[aaa, bb, c, dd, eee]


Although the test case has letters only as input, this is for clarity only; input characters may be any character.





Please do not provide "work-arounds" involving loops or other techniques.



The question is to find the right regex for the code as shown above, i.e. using only split() and no other method calls. It is not a question about finding code that will "do the job".










      java regex split






      edited Nov 28 '12 at 2:05







      Bohemian

















      asked Nov 28 '12 at 1:46









      Bohemian
























          1 Answer
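          It is entirely possible to write the regex for splitting in one step:



          "(?<=(.))(?!\\1)"


          Since you want to split between every group of identical characters, we just need to find the boundary between two groups. I do this with a positive look-behind that captures the previous character, followed by a negative look-ahead with a back-reference that checks that the next character is not that same character. The backslash is doubled because this is a Java string literal; the regex itself is (?<=(.))(?!\1).



          As you can see, the regex is zero-width (it consists of only two look-around assertions), so no character is consumed by the split. The position at the very end of the input also matches, but split() discards trailing empty strings by default, so the result is unaffected.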
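For anyone who wants to try this out, here is a minimal sketch built around the test case from the question. The class name `ContiguousSplit`, the helper `splitRuns`, and the `DOTALL_REGEX` variant are my own illustrative additions, not part of the original answer. The `(?s)` variant is there because `.` does not match line terminators by default, which may matter given that the input "may be any character".

```java
import java.util.Arrays;

class ContiguousSplit {
    // Zero-width boundary: the look-behind captures the previous character,
    // the negative look-ahead rejects positions where the same character follows.
    static final String REGEX = "(?<=(.))(?!\\1)";

    // Assumption for illustration: same regex with DOTALL enabled so that
    // '.' also matches line terminators such as '\n'.
    static final String DOTALL_REGEX = "(?s)" + REGEX;

    // Hypothetical helper that only wraps String.split().
    static String[] splitRuns(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // Test case from the question.
        System.out.println(Arrays.toString(splitRuns("aaabbcddeee", REGEX)));
        // prints [aaa, bb, c, dd, eee]

        // Non-letter input works the same way.
        System.out.println(Arrays.toString(splitRuns("112!!", REGEX)));
        // prints [11, 2, !!]

        // With line terminators in the input, the DOTALL variant yields 3 parts.
        System.out.println(splitRuns("aa\n\nbb", DOTALL_REGEX).length);
        // prints 3
    }
}
```

Without `(?s)`, the look-behind cannot capture a newline, so runs of line terminators would not be separated from the text that follows them.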






          • +1 Success.

            – nickb
            Nov 28 '12 at 2:18








          • +1 Nice one, I'll make a note of this for the future.

            – arshajii
            Nov 28 '12 at 2:20











          • Ooh, nice! :) +1 more

            – Amadan
            Nov 28 '12 at 2:28











          • +1 Good job. Not being sour grapes, but I tried that except I coded the look ahead as (?!=\1) thinking != like the java boolean comparison. Doh!

            – Bohemian
            Nov 28 '12 at 7:42











          • In .NET, the character within the group, i.e. (.), is also included in the result. I wonder why that's not the case with Java.

            – Anirudha
            Nov 28 '12 at 8:17
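To make the `(?!=\1)` slip mentioned in the comments concrete, here is a small sketch comparing the two patterns. The class name `LookaheadTypo` and its constants are hypothetical names used only for illustration. In `(?!=\1)` the `=` is a literal character inside the negative look-ahead, so the assertion only fails when the text ahead is `=` followed by the captured character. On the test input that never happens, so every position after the first character becomes a split point.

```java
import java.util.Arrays;

class LookaheadTypo {
    // Pattern from the answer: the next character must differ from the previous one.
    static final String CORRECT = "(?<=(.))(?!\\1)";

    // Mistyped variant: "not followed by '=' and then the previous character".
    static final String TYPO = "(?<=(.))(?!=\\1)";

    public static void main(String[] args) {
        String input = "aaabbcddeee";

        System.out.println(Arrays.toString(input.split(CORRECT)));
        // prints [aaa, bb, c, dd, eee]

        System.out.println(Arrays.toString(input.split(TYPO)));
        // prints [a, a, a, b, b, c, d, d, e, e, e]
    }
}
```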













          answered Nov 28 '12 at 2:17









          nhahtdh







