How to strip comments from PHP/HTML source (with sed/awk/grep, etc.)?

I would like to remove comments from PHP source files using a shell script or AppleScript. (I'm using CodeKit hooks on OS X.)

I tried to do it with sed and grep, but the modern regex syntax I'm used to from PHP and JavaScript doesn't work there.

https://regex101.com/r/awpFe0/1/

The demo above should be enough to show what I want to do.

I need a version that works on the command line.

I found this, but I can't get exactly the result I want:
https://stackoverflow.com/a/13062682/6320082

I've been working on it for two days. Please help me.

Example contents:

Test string // test comment
Test string// test comment
echo 1;//test comment
http://domain.ltd // test comment
$wrong_slashes = "http://domain.ltd/asd//asd//asd";
function test() { // test comment
Test string /* test comment // test comment */
Test string //test /* test */
Test string /* test comment /* */ */
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
/**
multi-line comments
*/
<script src="//cdn.com/site.js">
$pattern = "([//])";
$pattern = '([//])';
<!-- html comments -->
<div>test</div> <!-- html comments -->
<!-- html comments --> <div>test</div>

Tags: regex, bash, sed, command-line, grep

asked Nov 20 '18 at 23:31 by Mert A.

          2 Answers

If Perl is an option, try something like:

perl -e '
while (<>) {
    $str .= $_;   # slurps all lines into a string for multi-line matching
}

1 while $str =~ s#/\*(?!.*/\*).*?\*/##sg;
    # removes C-style /* comments */ repeatedly, so nested ones are handled too
$str =~ s#(?<!:)//[^\x27"]*?$##mg;
    # removes // comments not preceded by ":" and not followed by a quote on the same line
$str =~ s#<!--.*?-->##sg;
    # removes <!-- html comments -->
print $str;
' example.txt

which outputs:

Test string
Test string
echo 1;
http://domain.ltd
$wrong_slashes = "http://domain.ltd/asd//asd//asd";
function test() {
Test string
Test string
Test string
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

<script src="//cdn.com/site.js">
$pattern = "([//])";
$pattern = '([//])';

<div>test</div>
<div>test</div>

Please note that the script above is tuned to the given sample, and it is easy to construct code that it mishandles. Writing a flawless regex to match comments is close to impossible without a real parser for the language.
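To illustrate the point about parsers, here is a minimal sketch of a character-by-character scanner, written in Python rather than the Perl used above. It is not PHP's own tokenizer and not part of the original answer. The function name `strip_comments` is hypothetical, and two behaviours are assumptions copied from the sample rather than from PHP itself: block comments are treated as nestable, and `//` directly after `:` is treated as part of a URL.

```python
def strip_comments(src: str) -> str:
    """Remove //, /* */ and <!-- --> comments, leaving quoted strings alone."""
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch in "'\"":
            # Quoted string: copy verbatim up to the matching unescaped quote.
            j = i + 1
            while j < n and src[j] != ch:
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        elif src.startswith("/*", i):
            # Block comment. Depth counting mirrors the repeated removal in
            # the Perl one-liner (assumption: PHP itself does not nest these).
            depth, i = 1, i + 2
            while i < n and depth:
                if src.startswith("/*", i):
                    depth, i = depth + 1, i + 2
                elif src.startswith("*/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
        elif src.startswith("//", i):
            if i > 0 and src[i - 1] == ":":
                # Heuristic from the sample: "://" belongs to a URL.
                out.append("//")
                i += 2
            else:
                # Line comment: skip to end of line, keeping the newline.
                while i < n and src[i] != "\n":
                    i += 1
        elif src.startswith("<!--", i):
            # HTML comment: skip past the closing "-->" (or to end of input).
            end = src.find("-->", i + 4)
            i = n if end == -1 else end + 3
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = 'echo 1;//test comment\n$u = "http://domain.ltd/asd//asd";\n'
print(strip_comments(sample), end="")
```

Because the scanner knows when it is inside a string, it does not need the "no quote later on the line" workaround. It is still only a sketch: an apostrophe in plain HTML text (for example "don't") would be read as the start of a string, and `#` comments and heredocs are not handled. Trailing whitespace left behind by a removed comment is kept as-is.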

answered Nov 21 '18 at 13:04 by tshiono

I solved it by running the PHP CLI from a shell script. I did what I wanted in PHP, using the example at the linked URL.

example function

answered Nov 23 '18 at 2:43 by Mert A.