Evaluating a log file using a sh script

I have a log file with a lot of lines with the following format:

IP - - [Timestamp Zone] 'Command Weblink Format' - size

I want to write a script.sh that gives me the number of times each website has been clicked.
The command

awk '{print $7}' server.log | sort -u

should give me a list with each unique weblink on a separate line. The command

grep 'Weblink1' server.log | wc -l

should give me the number of times Weblink1 has been clicked. I want to read each line produced by the awk command above into a variable and then loop over those values, running the grep command on each extracted weblink. I could use

while IFS='' read -r line || [[ -n "$line" ]]; do
    echo "Text read from file: $line"
done

(source: Read a file line by line assigning the value to a variable) but I don't want to save the output of the awk script in a .txt file.

My guess would be:

while IFS='' read -r line || [[ -n "$line" ]]; do
    grep '$line' server.log | wc -l | ='$variabel' |
    echo " $line was clicked $variable times "
done

But I'm not really familiar with connecting commands in a loop, as this is my first time. Would this loop work and how do I connect my loop and the Awk script?

edited Nov 22 '18 at 6:49

tripleee

92.9k13129184

asked Nov 22 '18 at 0:58

Joe Th

102

bash loops sh


1 Answer

Shell commands in a loop connect the same way they do without a loop, and your attempt isn't very close. But yes, this can be done in a loop, if you want the horribly inefficient way for some reason, such as a learning experience:

awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
  # -F makes grep treat the URL as a fixed string rather than a regex
  n=$(grep -cF -- "$line" server.log)
  echo "$line" clicked $n times
done

# you only need the read || [ -n ] idiom if the input can end with an
# unterminated partial line (is ill-formed); awk print output can't.
# you don't really need the IFS= and -r because the data here is URLs
# which cannot contain whitespace and shouldn't contain backslash,
# but I left them in as good-habit-forming.

# in general variable expansions should be doublequoted
# to prevent wordsplitting and/or globbing, although in this case
# $line is a URL which cannot contain whitespace and practically
# cannot be a glob. $n is a number and definitely safe.

# grep -c does the count so you don't need wc -l
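To see what the comment about the read || [ -n ] idiom means in practice, here is a minimal sketch that feeds a loop some input whose last line has no trailing newline. The function names and the sample input are made up for illustration.

```shell
# read returns non-zero when it reaches end-of-input before a newline,
# even though it has already stored the partial line in the variable.
plain_loop() {
    while IFS= read -r line; do
        printf '<%s>\n' "$line"
    done
}

# The extra test keeps the loop body running for that final partial line.
guarded_loop() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '<%s>\n' "$line"
    done
}

printf 'a\nb' | plain_loop      # the unterminated last line "b" is dropped
printf 'a\nb' | guarded_loop    # "b" is processed as well
```

Since awk's print always ends each record with a newline, the plain form is enough when reading awk output.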

The loop can be shortened by inlining the count in the echo:

awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
  echo "$line" clicked $(grep -cF -- "$line" server.log) times
done
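One caveat with the grep-based loop: grep matches its pattern anywhere in a line, so a URL that is a prefix of another URL is counted too often. The sketch below uses two made-up log lines (the file contents and the helper name count_exact are assumptions for illustration) to compare grep with an exact comparison on field 7.

```shell
# Hypothetical sample data in the format from the question.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
10.0.0.1 - - [22/Nov/2018:00:58:00 +0000] "GET /index HTTP/1.1" 200 512
10.0.0.2 - - [22/Nov/2018:00:59:00 +0000] "GET /index.html HTTP/1.1" 200 256
EOF

# Substring match: /index is also found inside /index.html.
loose=$(grep -cF -- '/index' "$log")

# count_exact is an illustrative helper: it compares field 7 as a whole string.
count_exact() {
    awk -v url="$1" '$7 == url { c++ } END { print c + 0 }' "$2"
}
exact=$(count_exact '/index' "$log")

echo "grep: $loose, exact: $exact"
```

On this sample the grep count includes the /index.html line, while the field comparison counts only the exact URL.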

However, if you just want the correct results, it is much more efficient and somewhat simpler to do it in one pass in awk:

awk '{n[$7]++}
    END{for(i in n){
        print i,"clicked",n[i],"times"}}' server.log |
sort

# or GNU awk 4+ can do the sort itself, see the doc:
awk '{n[$7]++}
    END{PROCINFO["sorted_in"]="@ind_str_asc";
    for(i in n){
        print i,"clicked",n[i],"times"}}' server.log

The associative array n collects the values from the seventh field as keys, and on each line, the value for the extracted key is incremented. Thus, at the end, the keys in n are all the URLs in the file, and the value for each is the number of times it occurred.
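As a quick check of the one-pass approach, here is a minimal sketch run against a made-up three-line log. The sample lines, the temporary file, and the function name count_clicks are assumptions for illustration; only the awk logic comes from the answer.

```shell
# Hypothetical sample data; field 7 (whitespace-separated) is the URL.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
10.0.0.1 - - [22/Nov/2018:00:58:00 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [22/Nov/2018:00:59:00 +0000] "GET /about.html HTTP/1.1" 200 256
10.0.0.3 - - [22/Nov/2018:01:00:00 +0000] "GET /index.html HTTP/1.1" 200 512
EOF

# count_clicks is an illustrative helper name, not part of any tool.
# It reads the log once, tallies field 7, and sorts the report by URL.
count_clicks() {
    awk '{ n[$7]++ }
         END { for (url in n) print url, "clicked", n[url], "times" }' "$1" |
    sort
}

count_clicks "$log"
```

For this sample the report lists /about.html with 1 click and /index.html with 2, and the file is read only once no matter how many distinct URLs it contains.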

edited Nov 22 '18 at 6:50

tripleee

92.9k13129184

answered Nov 22 '18 at 3:21

dave_thompson_085

13.3k11633

Maybe emphasize exactly how horribly slow it will be to run grep on the entire file as many times as there are unique URLs in the log file.

– tripleee
Nov 22 '18 at 6:55
