Is piping, shifting, or parameter expansion more efficient?
I'm trying to find the most efficient way to iterate through certain values that are a consistent number of positions away from each other in a space-separated list of words (I don't want to use an array). For example,
list="1 ant bat 5 cat dingo 6 emu fish 9 gecko hare 15 i j"
So I want to be able to just iterate through list and only access 1, 5, 6, 9 and 15.
EDIT: I should have made it clear that the values I'm trying to get from the list don't have to differ in format from the rest of the list. What makes them special is solely their position in the list (in this case, positions 1, 4, 7, ...). So the list could be
1 2 3 5 9 8 6 90 84 9 3 2 15 75 55
but I'd still want the same numbers. Also, I want to be able to do this without knowing the length of the list.
The methods I've thought of so far are:
Method 1
set $list
found=false
find=9
count=1
while [ $count -lt $# ]; do
if [ "${@:count:1}" -eq $find ]; then
found=true
break
fi
count=`expr $count + 3`
done
Method 2
set $list
found=false
find=9
while [ $# -ne 0 ]; do
if [ $1 -eq $find ]; then
found=true
break
fi
shift 3
done
Method 3
I'm pretty sure piping makes this the worst option, but I was trying to find a method that doesn't use set, out of curiosity.
found=false
find=9
count=1
num=`echo $list | cut -d ' ' -f$count`
while [ -n "$num" ]; do
if [ $num -eq $find ]; then
found=true
break
fi
count=`expr $count + 3`
num=`echo $list | cut -d ' ' -f$count`
done
So what would be most efficient, or am I missing a simpler method?
shell-script pipe performance cut
I wouldn't use a shell script in the first place if efficiency is an important concern. How big is your list that it makes a difference?
– Barmar
Feb 1 at 2:54
premature optimization is the source of all evil
– Barmar
Feb 1 at 2:56
Without doing statistics over actual instances of your problem, you will know nothing. This includes comparing to "programming in awk" etc. If statistics are too expensive, then looking for efficiency is probably not worth it.
– David Tonhofer
Feb 1 at 20:48
Levi, what exactly is the "efficient" way in your definition? You want to find a faster way to iterate?
– Sergiy Kolodyazhnyy
Feb 2 at 0:42
8 Answers
Pretty simple with awk. This will get you the value of every third field (the first, fourth, seventh, and so on) for input of any length:
$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "\n" ) }' <<< $list
1 5 6 9 15
This works by leveraging built-in awk variables such as NF (the number of fields in the record), and doing some simple for looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.
Or, if you do indeed just want those specific fields as specified in your example:
$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15
As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time to show how long it takes; you could also use tools like strace to see how the system calls flow. Usage of time looks like:
$ time ./script.sh
real 0m0.025s
user 0m0.004s
sys 0m0.008s
You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
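For instance, a minimal comparison harness might look like the following (the script names are hypothetical; each one would hold one of the methods from the question, repeated enough times for time to give a meaningful reading):
$ time sh -c 'for i in $(seq 1 1000); do sh method1.sh; done'
$ time sh -c 'for i in $(seq 1 1000); do sh method2.sh; done'
Whichever run shows the lower real time is the faster method on that particular system.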
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
@LeviUzodike Regarding echo vs <<<, "identical" is too strong a word. You could say that stuff <<< "$list" is nearly identical to printf "%s\n" "$list" | stuff. Regarding echo vs printf, I direct you to this answer
– JoL
Jan 31 at 20:59
@DopeGhoti Actually it does. <<< adds a newline at the end. This is similar to how $() removes a newline from the end. This is because lines are terminated by newlines. <<< feeds an expression as a line, so it must be terminated by a newline. "$()" takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.
– JoL
Feb 1 at 2:09
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pay large dividends.
– Joe
Feb 1 at 20:55
@LeviUzodike: Yes, awk is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl; e.g. use strace to check out the system calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins, even for small-ish list sizes. And sometimes you can write a #!/usr/bin/awk script instead of a #!/bin/sh script.
– Peter Cordes
Feb 2 at 4:36
First rule of software optimization: Don't.
Until you know the speed of the program is an issue, there's no need to think
about how fast it is. If your list is about that length or just ~100-1000 items
long, you probably won't even notice how long it takes. There's a chance you're spending more time thinking about the optimization than what the difference would be.
Second rule: Measure.
That's the sure way to find out, and the one that gives answers for your system.
Especially with shells, there are so many, and they aren't all identical. An
answer for one shell might not apply for yours.
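As a hedged sketch of such a measurement (assuming both dash and bash are installed), you can time the same loop under each shell directly:
$ time dash -c 'i=0; while [ $i -lt 100000 ]; do i=$((i+1)); done'
$ time bash -c 'i=0; while [ $i -lt 100000 ]; do i=$((i+1)); done'
The difference between the real times shows how much the choice of shell alone matters for a tight loop.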
In larger programs, profiling goes here too. The slowest part might not be the one you think it is.
Third, the first rule of shell script optimization: Don't use the shell.
Yeah, really. Many shells aren't made to be fast (since launching external
programs doesn't have to be), and they might even parse the lines of the source
code again each time.
Use something like awk or Perl instead. In a trivial micro-benchmark I did, awk was dozens of times faster than any common shell in running a simple loop (without I/O).
However, if you do use the shell, use the shell's builtin functions instead of external commands. Here, you're using expr, which isn't a builtin in any of the shells I found on my system, but it can be replaced with standard arithmetic expansion: e.g. i=$((i+1)) instead of i=$(expr $i + 1) to increment i. Your use of cut in the last example might also be replaceable with standard parameter expansions.
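For instance, here is a minimal sketch of what that replacement could look like (assuming the list is single-space separated, as in the question; this is not the asker's code, just one possible rewrite):
found=false
find=9
rest=$list
while [ -n "$rest" ]; do
    num=${rest%% *}    # first word of what's left
    if [ "$num" = "$find" ]; then found=true; break; fi
    case $rest in
        *\ *\ *\ *) rest=${rest#* * * } ;;    # drop three words
        *) break ;;
    esac
done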
See also: Why is using a shell loop to process text considered bad practice?
Steps #1 and #2 should apply to your question.
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
It's not that awk loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools like awk are fantastic at processing text data, because that's what shells and tools like awk are made for (respectively) in the first place.
– DopeGhoti
Jan 31 at 21:05
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower in dash than with gawk, and dash was the fastest shell I tested...
– ilkkachu
Jan 31 at 22:36
@Joe, it is :) dash and busybox don't support (( .. )) -- I think it's a nonstandard extension. ++ is also explicitly mentioned as not required, so as far as I can tell, i=$((i+1)) or : $(( i += 1)) are the safe ones.
– ilkkachu
Feb 1 at 23:10
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.
As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large number of strings is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut invocation, that's likely to be faster than doing the same thing with string manipulation in the shell.
Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.
Don't call expr to perform arithmetic if you're at all concerned about performance. In fact, don't call expr to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr.
You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.
list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count < ${#list[@]}; count += 3)); do
echo "${list[$count]}"
done
Your script may well be faster if you use sh, if your system has dash or ksh as sh rather than bash. If you use sh, you don't get named arrays, but you still get the one array of positional parameters, which you can set with set. To access an element at a position that is not known until runtime, you need to use eval (take care of quoting things properly!).
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
eval "value=${$count}"
echo "$value"
count=$((count+3))
done
If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift instead of variable indices.
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
echo "$1"
shift && shift && shift
done
Which approach is faster depends on the shell and on the number of elements.
Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.
# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
echo "${list% *}"
case "$list" in * * * *) :;; *) break;; esac
list="${list#* * * }"
done
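As a quick hedged sanity check: for the sample list, each of the loops above should print the stride-3 elements, one per line:
1
5
6
9
15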
"On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool" but what if that tool has loops in it like awk? @ikkachu said awk loops are faster, but would you say that with < 1000 fields to iterate through, the benefit of faster loops wouldn't outweigh the cost of calling awk since it's an external command (assuming I could do the same task in shell loops with the use of only built in commands)?
– Levi Uzodike
Feb 1 at 16:40
@LeviUzodike Please re-read the first paragraph of my answer.
– Gilles
Feb 1 at 17:04
You could also replaceshift && shift && shift
withshift 3
in your third example - unless the shell you're using doesn't support it.
– Joe
Feb 1 at 21:14
2
@Joe Actually, no.shift 3
would fail if there were too few remaining arguments. You'd need something likeif [ $# -gt 3 ]; then shift 3; else set --; fi
– Gilles
Feb 1 at 21:25
awk is a great choice, if you can do all your processing inside of the awk script. Otherwise, you just end up piping the awk output to other utilities, destroying the performance gain of awk.
bash iteration over an array is also great, if you can fit your entire list inside the array (which for modern shells is probably a guarantee) and you don't mind the array syntax gymnastics.
However, a pipeline approach:
xargs -n3 <<< "$list" | while read -ra a; do echo $a; done | grep 9
Where:
xargs groups the whitespace-separated list into batches of three, each newline-separated
while read consumes that list and outputs the first column of each group
grep filters the first column (corresponding to every third position in the original list)
This improves understandability, in my opinion. People already know what these tools do, so it's easy to read from left to right and reason about what's going to happen. This approach also clearly documents the stride length (-n3) and the filter pattern (9), so it's easy to variabilize:
count=3
find=9
xargs -n "$count" <<< "$list" | while read -ra a; do echo $a; done | grep "$find"
When we ask questions of "efficiency", be sure to think about "total lifetime efficiency". That calculation includes the effort of maintainers to keep the code working, and we meat-bags are the least efficient machines in the whole operation.
Perhaps this?
cut -d' ' -f1,4,7,10,13 <<<$list
1 5 6 9 15
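If the length of the list isn't known in advance, a hedged variation of the same idea (assuming GNU seq, and relying on cut silently ignoring field numbers past the end of the line; 1000 is just an arbitrary upper bound) could generate the field list instead of hard-coding it:
$ cut -d' ' -f"$(seq -s, 1 3 1000)" <<< "$list"
1 5 6 9 15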
Sorry I wasn't clear before, but I wanted to be able to get the numbers at those positions without knowing the length of the list. But thanks, I forgot cut could do that.
– Levi Uzodike
Jan 31 at 19:51
Don't use shell loops if you want to be efficient. Limit yourself to pipes, redirections, substitutions etc., and programs. That's why the xargs and parallel utilities exist - because bash while loops are inefficient and very slow. Use bash loops only as a last resort.
list="1 ant bat 5 cat dingo 6 emu fish 9 gecko hare 15 i j"
if
<<<"$list" tr -d -s '[0-9 ]' |
tr -s ' ' | tr ' ' 'n' |
grep -q -x '9'
then
found=true
else
found=false
fi
echo ${found}
But you should probably get somewhat faster results with a good awk.
Sorry I wasn't clear before, but I was looking for a solution that would be able to extract the values based only on their position in the list. I just made the original list like that because I wanted it to be obvious which values I wanted.
– Levi Uzodike
Jan 31 at 20:02
In my opinion the clearest solution (and probably the most performant too) is to use the RS and ORS awk variables:
awk -v RS=' ' -v ORS=' ' 'NR % 3 == 1' <<< "$list"
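A brief note on how this works: RS=' ' makes every space-separated word its own record, so NR % 3 == 1 selects records 1, 4, 7, and so on, and ORS=' ' joins the selected records back together with spaces. For the sample list this should print:
1 5 6 9 15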
Using GNU sed and POSIX shell script:
echo $(printf '%s\n' $list | sed -n '1~3p')
Or with bash's parameter substitution:
echo $(sed -n '1~3p' <<< ${list// /$'\n'})
Non-GNU (i.e. POSIX) sed, and bash:
sed 's/\([^ ]* \)[^ ]* *[^ ]* */\1/g' <<< "$list"
Or more portably, using both POSIX sed and shell script:
echo "$list" | sed 's/\([^ ]* \)[^ ]* *[^ ]* */\1/g'
Output of any of these:
1 5 6 9 15
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f497985%2fis-piping-shifting-or-parameter-expansion-more-efficient%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
8 Answers
8
active
oldest
votes
8 Answers
8
active
oldest
votes
active
oldest
votes
active
oldest
votes
Pretty simple with awk
. This will get you the value of every fourth field for input of any length:
$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "n" ) }' <<< $list
1 5 6 9 15
This works be leveraging built-in awk
variables such as NF
(the number of fields in the record), and doing some simple for
looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.
Or, if you do indeed just want those specific fields as specified in your example:
$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15
As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time
to show how long it takes; you could also use tools like strace
to see how the system calls flow. Usage of time
looks like:
$ time ./script.sh
real 0m0.025s
user 0m0.004s
sys 0m0.008s
You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
1
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
2
@LeviUzodike Regardingecho
vs<<<
, "identical" is too strong a word. You could say thatstuff <<< "$list"
is nearly identical toprintf "%sn" "$list" | stuff
. Regardingecho
vsprintf
, I direct you to this answer
– JoL
Jan 31 at 20:59
5
@DopeGhoti Actually it does.<<<
adds a newline at the end. This is similar to how$()
removes a newline from the end. This is because lines are terminated by newlines.<<<
feeds an expression as a line, so it must be terminated by a newline."$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.
– JoL
Feb 1 at 2:09
3
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
1
@LeviUzodike: Yesawk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. usestrace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a#!/usr/bin/awk
script instead of a#!/bin/sh
script.
– Peter Cordes
Feb 2 at 4:36
|
show 4 more comments
Pretty simple with awk
. This will get you the value of every fourth field for input of any length:
$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "n" ) }' <<< $list
1 5 6 9 15
This works be leveraging built-in awk
variables such as NF
(the number of fields in the record), and doing some simple for
looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.
Or, if you do indeed just want those specific fields as specified in your example:
$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15
As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time
to show how long it takes; you could also use tools like strace
to see how the system calls flow. Usage of time
looks like:
$ time ./script.sh
real 0m0.025s
user 0m0.004s
sys 0m0.008s
You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
1
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
2
@LeviUzodike Regardingecho
vs<<<
, "identical" is too strong a word. You could say thatstuff <<< "$list"
is nearly identical toprintf "%sn" "$list" | stuff
. Regardingecho
vsprintf
, I direct you to this answer
– JoL
Jan 31 at 20:59
5
@DopeGhoti Actually it does.<<<
adds a newline at the end. This is similar to how$()
removes a newline from the end. This is because lines are terminated by newlines.<<<
feeds an expression as a line, so it must be terminated by a newline."$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.
– JoL
Feb 1 at 2:09
3
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
1
@LeviUzodike: Yesawk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. usestrace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a#!/usr/bin/awk
script instead of a#!/bin/sh
script.
– Peter Cordes
Feb 2 at 4:36
|
show 4 more comments
Pretty simple with awk
. This will get you the value of every fourth field for input of any length:
$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "n" ) }' <<< $list
1 5 6 9 15
This works be leveraging built-in awk
variables such as NF
(the number of fields in the record), and doing some simple for
looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.
Or, if you do indeed just want those specific fields as specified in your example:
$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15
As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time
to show how long it takes; you could also use tools like strace
to see how the system calls flow. Usage of time
looks like:
$ time ./script.sh
real 0m0.025s
user 0m0.004s
sys 0m0.008s
You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
Pretty simple with awk
. This will get you the value of every fourth field for input of any length:
$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "n" ) }' <<< $list
1 5 6 9 15
This works be leveraging built-in awk
variables such as NF
(the number of fields in the record), and doing some simple for
looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.
Or, if you do indeed just want those specific fields as specified in your example:
$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15
As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time
to show how long it takes; you could also use tools like strace
to see how the system calls flow. Usage of time
looks like:
$ time ./script.sh
real 0m0.025s
user 0m0.004s
sys 0m0.008s
You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
edited Jan 31 at 21:01
answered Jan 31 at 19:21
DopeGhotiDopeGhoti
46.9k56190
46.9k56190
1
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
2
@LeviUzodike Regardingecho
vs<<<
, "identical" is too strong a word. You could say thatstuff <<< "$list"
is nearly identical toprintf "%sn" "$list" | stuff
. Regardingecho
vsprintf
, I direct you to this answer
– JoL
Jan 31 at 20:59
5
@DopeGhoti Actually it does.<<<
adds a newline at the end. This is similar to how$()
removes a newline from the end. This is because lines are terminated by newlines.<<<
feeds an expression as a line, so it must be terminated by a newline."$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.
– JoL
Feb 1 at 2:09
3
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
1
@LeviUzodike: Yesawk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. usestrace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a#!/usr/bin/awk
script instead of a#!/bin/sh
script.
– Peter Cordes
Feb 2 at 4:36
|
show 4 more comments
1
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
2
@LeviUzodike Regardingecho
vs<<<
, "identical" is too strong a word. You could say thatstuff <<< "$list"
is nearly identical toprintf "%sn" "$list" | stuff
. Regardingecho
vsprintf
, I direct you to this answer
– JoL
Jan 31 at 20:59
5
@DopeGhoti Actually it does.<<<
adds a newline at the end. This is similar to how$()
removes a newline from the end. This is because lines are terminated by newlines.<<<
feeds an expression as a line, so it must be terminated by a newline."$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.
– JoL
Feb 1 at 2:09
3
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
1
@LeviUzodike: Yesawk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. usestrace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a#!/usr/bin/awk
script instead of a#!/bin/sh
script.
– Peter Cordes
Feb 2 at 4:36
1
1
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
Good point, @MichaelHomer; I've added an aside addressing the question of "how can I determine which method is the most efficient".
– DopeGhoti
Jan 31 at 20:58
2
2
@LeviUzodike Regarding
echo
vs <<<
, "identical" is too strong a word. You could say that stuff <<< "$list"
is nearly identical to printf "%sn" "$list" | stuff
. Regarding echo
vs printf
, I direct you to this answer– JoL
Jan 31 at 20:59
@LeviUzodike Regarding
echo
vs <<<
, "identical" is too strong a word. You could say that stuff <<< "$list"
is nearly identical to printf "%sn" "$list" | stuff
. Regarding echo
vs printf
, I direct you to this answer– JoL
Jan 31 at 20:59
5
5
@DopeGhoti Actually it does.
<<<
adds a newline at the end. This is similar to how $()
removes a newline from the end. This is because lines are terminated by newlines. <<<
feeds an expression as a line, so it must be terminated by a newline. "$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.– JoL
Feb 1 at 2:09
@DopeGhoti Actually it does.
<<<
adds a newline at the end. This is similar to how $()
removes a newline from the end. This is because lines are terminated by newlines. <<<
feeds an expression as a line, so it must be terminated by a newline. "$()"
takes lines and provides them as an argument, so it makes sense to convert by removing the terminating newline.– JoL
Feb 1 at 2:09
3
3
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
@LeviUzodike awk is a much under-appreciated tool. It will make all sorts of seemingly complex problems easy to solve. Especially when you are trying to write a complex regex for something like sed, you can often save hours by instead writing it procedurally in awk. Learning it will pa∕y large dividends.
– Joe
Feb 1 at 20:55
1
1
@LeviUzodike: Yes
awk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. use strace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a #!/usr/bin/awk
script instead of a #!/bin/sh
script.– Peter Cordes
Feb 2 at 4:36
@LeviUzodike: Yes
awk
is a stand-alone binary that has to start up. Unlike perl or especially Python, the awk interpreter starts up quickly (still all the usual dynamic linker overhead of making quite a few system calls, but awk only uses libc/libm and libdl. e.g. use strace
to check out system-calls of awk startup). Many shells (like bash) are pretty slow, so firing up one awk process can be faster than looping over tokens in a list with shell built-ins even for small-ish list sizes. And sometimes you can write a #!/usr/bin/awk
script instead of a #!/bin/sh
script.– Peter Cordes
Feb 2 at 4:36
|
show 4 more comments
First rule of software optimization: Don't.
Until you know the speed of the program is an issue, there's no need to think
about how fast it is. If your list is about that length or just ~100-1000 items
long, you probably won't even notice how long it takes. There's a chance you're spending more time thinking about the optimization than what the difference would be.
Second rule: Measure.
That's the sure way to find out, and the one that gives answers for your system.
Especially with shells, there are so many, and they aren't all identical. An
answer for one shell might not apply for yours.
In larger programs, profiling goes here too. The slowest part might not be the one you think it is.
Third, the first rule of shell script optimization: Don't use the shell.
Yeah, really. Many shells aren't made to be fast (since launching external
programs doesn't have to be), and they might even parse the lines of the source
code again each time.
Use something like awk or Perl instead. In a trivial micro-benchmark I did,
awk
was dozens of times faster than any common shell in running a simple loop (without I/O).
However, if you do use the shell, use the shell's builtin functions instead of external commands. Here, you're using
expr
which isn't builtin in any shells I found on my system, but which can be replaced with standard arithmetic expansion. E.g.i=$((i+1))
instead ofi=$(expr $i + 1)
to incrementi
. Your use ofcut
in the last example might also be replaceable with standard parameter expansions.
See also: Why is using a shell loop to process text considered bad practice?
Steps #1 and #2 should apply to your question.
12
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
8
It's not thatawk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools likeawk
are fantastic at processing text data, because that's what shells and tools likeawk
are made for (respectively) in the first place.
– DopeGhoti
Jan 31 at 21:05
2
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower indash
than withgawk
, anddash
was the fastest shell I tested...
– ilkkachu
Jan 31 at 22:36
1
@Joe, it is :)dash
andbusybox
don't support(( .. ))
-- I think it's a nonstandard extension.++
is also explicitly mentioned as not required, so as far as I can tell,i=$((i+1))
or: $(( i += 1))
are the safe ones.
– ilkkachu
Feb 1 at 23:10
1
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
|
show 3 more comments
First rule of software optimization: Don't.
Until you know the speed of the program is an issue, there's no need to think
about how fast it is. If your list is about that length or just ~100-1000 items
long, you probably won't even notice how long it takes. There's a chance you're spending more time thinking about the optimization than what the difference would be.
Second rule: Measure.
That's the sure way to find out, and the one that gives answers for your system.
Especially with shells, there are so many, and they aren't all identical. An
answer for one shell might not apply for yours.
In larger programs, profiling goes here too. The slowest part might not be the one you think it is.
Third, the first rule of shell script optimization: Don't use the shell.
Yeah, really. Many shells aren't made to be fast (since launching external
programs doesn't have to be), and they might even parse the lines of the source
code again each time.
Use something like awk or Perl instead. In a trivial micro-benchmark I did,
awk
was dozens of times faster than any common shell in running a simple loop (without I/O).
However, if you do use the shell, use the shell's builtin functions instead of external commands. Here, you're using
expr
which isn't builtin in any shells I found on my system, but which can be replaced with standard arithmetic expansion. E.g.i=$((i+1))
instead ofi=$(expr $i + 1)
to incrementi
. Your use ofcut
in the last example might also be replaceable with standard parameter expansions.
See also: Why is using a shell loop to process text considered bad practice?
Steps #1 and #2 should apply to your question.
12
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
8
It's not thatawk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools likeawk
are fantastic at processing text data, because that's what shells and tools likeawk
are made for (respectively) in the first place.
– DopeGhoti
Jan 31 at 21:05
2
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower indash
than withgawk
, anddash
was the fastest shell I tested...
– ilkkachu
Jan 31 at 22:36
1
@Joe, it is :)dash
andbusybox
don't support(( .. ))
-- I think it's a nonstandard extension.++
is also explicitly mentioned as not required, so as far as I can tell,i=$((i+1))
or: $(( i += 1))
are the safe ones.
– ilkkachu
Feb 1 at 23:10
1
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
|
show 3 more comments
First rule of software optimization: Don't.
Until you know the speed of the program is an issue, there's no need to think
about how fast it is. If your list is about that length or just ~100-1000 items
long, you probably won't even notice how long it takes. There's a chance you're spending more time thinking about the optimization than what the difference would be.
Second rule: Measure.
That's the sure way to find out, and the one that gives answers for your system.
Especially with shells, there are so many, and they aren't all identical. An
answer for one shell might not apply for yours.
In larger programs, profiling goes here too. The slowest part might not be the one you think it is.
Third, the first rule of shell script optimization: Don't use the shell.
Yeah, really. Many shells aren't made to be fast (since launching external
programs doesn't have to be), and they might even parse the lines of the source
code again each time.
Use something like awk or Perl instead. In a trivial micro-benchmark I did,
awk
was dozens of times faster than any common shell in running a simple loop (without I/O).
However, if you do use the shell, use the shell's builtin functions instead of external commands. Here, you're using
expr
which isn't builtin in any shells I found on my system, but which can be replaced with standard arithmetic expansion. E.g.i=$((i+1))
instead ofi=$(expr $i + 1)
to incrementi
. Your use ofcut
in the last example might also be replaceable with standard parameter expansions.
See also: Why is using a shell loop to process text considered bad practice?
Steps #1 and #2 should apply to your question.
First rule of software optimization: Don't.
Until you know the speed of the program is an issue, there's no need to think
about how fast it is. If your list is about that length or just ~100-1000 items
long, you probably won't even notice how long it takes. There's a chance you're spending more time thinking about the optimization than what the difference would be.
Second rule: Measure.
That's the sure way to find out, and the one that gives answers for your system.
Especially with shells, there are so many, and they aren't all identical. An
answer for one shell might not apply for yours.
In larger programs, profiling goes here too. The slowest part might not be the one you think it is.
Third, the first rule of shell script optimization: Don't use the shell.
Yeah, really. Many shells aren't made to be fast (since launching external
programs doesn't have to be), and they might even parse the lines of the source
code again each time.
Use something like awk or Perl instead. In a trivial micro-benchmark I did,
awk
was dozens of times faster than any common shell in running a simple loop (without I/O).
However, if you do use the shell, use the shell's builtin functions instead of external commands. Here, you're using
expr
which isn't builtin in any shells I found on my system, but which can be replaced with standard arithmetic expansion. E.g.i=$((i+1))
instead ofi=$(expr $i + 1)
to incrementi
. Your use ofcut
in the last example might also be replaceable with standard parameter expansions.
See also: Why is using a shell loop to process text considered bad practice?
Steps #1 and #2 should apply to your question.
edited Feb 1 at 9:40
answered Jan 31 at 19:33
ilkkachuilkkachu
63.3k10104181
63.3k10104181
12
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
8
It's not thatawk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools likeawk
are fantastic at processing text data, because that's what shells and tools likeawk
are made for (respectively) in the first place.
– DopeGhoti
Jan 31 at 21:05
2
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower indash
than withgawk
, anddash
was the fastest shell I tested...
– ilkkachu
Jan 31 at 22:36
1
@Joe, it is :)dash
andbusybox
don't support(( .. ))
-- I think it's a nonstandard extension.++
is also explicitly mentioned as not required, so as far as I can tell,i=$((i+1))
or: $(( i += 1))
are the safe ones.
– ilkkachu
Feb 1 at 23:10
1
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
|
show 3 more comments
12
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
8
It's not thatawk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools likeawk
are fantastic at processing text data, because that's what shells and tools likeawk
are made for (respectively) in the first place.
– DopeGhoti
Jan 31 at 21:05
2
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower indash
than withgawk
, anddash
was the fastest shell I tested...
– ilkkachu
Jan 31 at 22:36
1
@Joe, it is :)dash
andbusybox
don't support(( .. ))
-- I think it's a nonstandard extension.++
is also explicitly mentioned as not required, so as far as I can tell,i=$((i+1))
or: $(( i += 1))
are the safe ones.
– ilkkachu
Feb 1 at 23:10
1
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
12
12
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
#0, quote your expansions :-)
– Kusalananda♦
Jan 31 at 19:59
8
8
It's not that
awk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools like awk
are fantastic at processing text data, because that's what shells and tools like awk
are made for (respectively) in the first place.– DopeGhoti
Jan 31 at 21:05
It's not that
awk
loops are necessarily any better or worse than shell loops. It's that the shell is really good at running commands and at directing input and output to and from processes, and frankly rather clunky at everything else; while tools like awk
are fantastic at processing text data, because that's what shells and tools like awk
are made for (respectively) in the first place.– DopeGhoti
Jan 31 at 21:05
2
2
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower in
dash
than with gawk
, and dash
was the fastest shell I tested...– ilkkachu
Jan 31 at 22:36
@DopeGhoti, shells do seem to be objectively slower, though. Some very simple while loops seem to be >25 times slower in
dash
than with gawk
, and dash
was the fastest shell I tested...– ilkkachu
Jan 31 at 22:36
1
1
@Joe, it is :)
dash
and busybox
don't support (( .. ))
-- I think it's a nonstandard extension. ++
is also explicitly mentioned as not required, so as far as I can tell, i=$((i+1))
or : $(( i += 1))
are the safe ones.– ilkkachu
Feb 1 at 23:10
@Joe, it is :)
dash
and busybox
don't support (( .. ))
-- I think it's a nonstandard extension. ++
is also explicitly mentioned as not required, so as far as I can tell, i=$((i+1))
or : $(( i += 1))
are the safe ones.– ilkkachu
Feb 1 at 23:10
1
1
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
Re "more time thinking": this neglects an important factor. How often does it run, and for how many users? If a program wastes 1 second, which could be fixed by the programmer thinking about it for 30 minutes, it might be a waste of time if there's only one user who's going to run it once. On the other hand if there's a million users, that's a million seconds, or 11 days of user time. If the code wasted a minute of a million users, that's about 2 years of user time.
– agc
Feb 4 at 2:49
|
show 3 more comments
I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.
As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut
could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut
invocation that's likely to be faster than doing the same thing with string manipulation in the shell.
Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.
Don't call expr
to perform arithmetic if you're at all concerned about performance. In fact, don't call expr
to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr
.
You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.
list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count += 3; count < ${#list[@]})); do
echo "${list[$count]}"
done
Your script may well be faster if you use sh, if your system has dash or ksh as sh rather than bash. If you use sh, you don't get named arrays, but you still get the one array of positional parameters, which you can set with set. To access an element at a position that is not known until runtime, you need to use eval (take care of quoting things properly!).
# List elements must not contain whitespace or ?*\[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
  eval "value=\${$count}"
  echo "$value"
  count=$((count+3))
done
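(The backslash matters: $count expands first, so eval sees, say, value=${4} and assigns the fourth positional parameter.)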
If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift instead of variable indices.
# List elements must not contain whitespace or ?*\[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
echo "$1"
shift && shift && shift
done
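As a comment below points out, shift 3 is shorter but errors out in some shells when fewer than three parameters remain; a guarded sketch of that variant:
while [ $# -ge 1 ]; do
  echo "$1"
  if [ $# -gt 3 ]; then shift 3; else set --; fi
done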
Which approach is faster depends on the shell and on the number of elements.
Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.
# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
echo "${list% *}"
case "$list" in * * * *) :;; *) break;; esac
list="${list#* * * }"
done
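(The case guard requires at least three spaces, i.e. at least four remaining words; without it, the final chop could fail to match and the loop would never terminate.)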
"On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool" but what if that tool has loops in it like awk? @ikkachu said awk loops are faster, but would you say that with < 1000 fields to iterate through, the benefit of faster loops wouldn't outweigh the cost of calling awk since it's an external command (assuming I could do the same task in shell loops with the use of only built in commands)?
– Levi Uzodike
Feb 1 at 16:40
@LeviUzodike Please re-read the first paragraph of my answer.
– Gilles
Feb 1 at 17:04
You could also replaceshift && shift && shift
withshift 3
in your third example - unless the shell you're using doesn't support it.
– Joe
Feb 1 at 21:14
2
@Joe Actually, no.shift 3
would fail if there were too few remaining arguments. You'd need something likeif [ $# -gt 3 ]; then shift 3; else set --; fi
– Gilles
Feb 1 at 21:25
add a comment |
I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.
As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut
could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut
invocation that's likely to be faster than doing the same thing with string manipulation in the shell.
Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.
Don't call expr
to perform arithmetic if you're at all concerned about performance. In fact, don't call expr
to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr
.
You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.
list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count += 3; count < ${#list[@]})); do
echo "${list[$count]}"
done
Your script may well be faster if you use sh, if your system has dash or ksh as sh
rather than bash. If you use sh, you don't get named arrays, but you still get the array one of positional parameters, which you can set with set
. To access an element at a position that is not known until runtime, you need to use eval
(take care of quoting things properly!).
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
eval "value=${$count}"
echo "$value"
count=$((count+1))
done
If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift
instead of variable indices.
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
echo "$1"
shift && shift && shift
done
Which approach is faster depends on the shell and on the number of elements.
Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.
# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
echo "${list% *}"
case "$list" in * * * *) :;; *) break;; esac
list="${list#* * * }"
done
"On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool" but what if that tool has loops in it like awk? @ikkachu said awk loops are faster, but would you say that with < 1000 fields to iterate through, the benefit of faster loops wouldn't outweigh the cost of calling awk since it's an external command (assuming I could do the same task in shell loops with the use of only built in commands)?
– Levi Uzodike
Feb 1 at 16:40
@LeviUzodike Please re-read the first paragraph of my answer.
– Gilles
Feb 1 at 17:04
You could also replaceshift && shift && shift
withshift 3
in your third example - unless the shell you're using doesn't support it.
– Joe
Feb 1 at 21:14
2
@Joe Actually, no.shift 3
would fail if there were too few remaining arguments. You'd need something likeif [ $# -gt 3 ]; then shift 3; else set --; fi
– Gilles
Feb 1 at 21:25
add a comment |
I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.
As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut
could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut
invocation that's likely to be faster than doing the same thing with string manipulation in the shell.
Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.
Don't call expr
to perform arithmetic if you're at all concerned about performance. In fact, don't call expr
to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr
.
You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.
list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count += 3; count < ${#list[@]})); do
echo "${list[$count]}"
done
Your script may well be faster if you use sh, if your system has dash or ksh as sh
rather than bash. If you use sh, you don't get named arrays, but you still get the array one of positional parameters, which you can set with set
. To access an element at a position that is not known until runtime, you need to use eval
(take care of quoting things properly!).
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
eval "value=${$count}"
echo "$value"
count=$((count+1))
done
If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift
instead of variable indices.
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
echo "$1"
shift && shift && shift
done
Which approach is faster depends on the shell and on the number of elements.
Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.
# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
echo "${list% *}"
case "$list" in * * * *) :;; *) break;; esac
list="${list#* * * }"
done
I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.
As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut
could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut
invocation that's likely to be faster than doing the same thing with string manipulation in the shell.
Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.
Don't call expr
to perform arithmetic if you're at all concerned about performance. In fact, don't call expr
to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr
.
You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.
list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count += 3; count < ${#list[@]})); do
echo "${list[$count]}"
done
Your script may well be faster if you use sh, if your system has dash or ksh as sh
rather than bash. If you use sh, you don't get named arrays, but you still get the array one of positional parameters, which you can set with set
. To access an element at a position that is not known until runtime, you need to use eval
(take care of quoting things properly!).
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
eval "value=${$count}"
echo "$value"
count=$((count+1))
done
If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift
instead of variable indices.
# List elements must not contain whitespace or ?*[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
echo "$1"
shift && shift && shift
done
Which approach is faster depends on the shell and on the number of elements.
Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.
# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
echo "${list% *}"
case "$list" in * * * *) :;; *) break;; esac
list="${list#* * * }"
done
answered Feb 1 at 7:59
Gilles
"On the other hand, a shell loop that iterates over a large string or a large amount of string is likely to be slower than one invocation of a special-purpose tool" but what if that tool has loops in it like awk? @ikkachu said awk loops are faster, but would you say that with < 1000 fields to iterate through, the benefit of faster loops wouldn't outweigh the cost of calling awk since it's an external command (assuming I could do the same task in shell loops with the use of only built in commands)?
– Levi Uzodike
Feb 1 at 16:40
@LeviUzodike Please re-read the first paragraph of my answer.
– Gilles
Feb 1 at 17:04
You could also replace shift && shift && shift with shift 3 in your third example - unless the shell you're using doesn't support it.
– Joe
Feb 1 at 21:14
2
@Joe Actually, no. shift 3 would fail if there were too few remaining arguments. You'd need something like if [ $# -gt 3 ]; then shift 3; else set --; fi
– Gilles
Feb 1 at 21:25
awk is a great choice, if you can do all your processing inside of the Awk script. Otherwise, you just end up piping the Awk output to other utilities, destroying the performance gain of awk.
bash iteration over an array is also great, if you can fit your entire list inside the array (which for modern shells is probably a guarantee) and you don't mind the array syntax gymnastics.
However, a pipeline approach:
xargs -n3 <<< "$list" | while read -ra a; do echo $a; done | grep 9
Where:
- xargs groups the whitespace-separated list into batches of three, each newline-separated
- while read consumes that list and outputs the first column of each group
- grep filters the first column (corresponding to every third position in the original list)
This improves understandability, in my opinion. People already know what these tools do, so it's easy to read from left to right and reason about what's going to happen. This approach also clearly documents the stride length (-n3) and the filter pattern (9), so it's easy to parameterize:
count=3
find=9
xargs -n "$count" <<< "$list" | while read -ra a; do echo $a; done | grep "$find"
When we ask questions of "efficiency", be sure to think about "total lifetime efficiency". That calculation includes the effort of maintainers to keep the code working, and we meat-bags are the least efficient machines in the whole operation.
answered Feb 1 at 19:08
bishop
Perhaps this?
cut -d' ' -f1,4,7,10,13 <<<$list
1 5 6 9 15
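If the list length isn't known in advance, the field list itself could be generated; a sketch (hypothetical, not part of the original answer) assuming GNU seq and wc:
# seq -s, 1 3 15 prints "1,4,7,10,13"; wc -w counts the words in the list
cut -d' ' -f"$(seq -s, 1 3 "$(echo "$list" | wc -w)")" <<< "$list"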
answered Jan 31 at 19:21
Doug O'Neal
Sorry I wasn't clear before, but I wanted to be able to get the numbers at those positions without knowing the length of the list. But thanks, I forgot cut could do that.
– Levi Uzodike
Jan 31 at 19:51
Don't use shell commands if you want to be efficient. Limit yourself to pipes, redirections, substitutions etc., and programs. That's why the xargs and parallel utilities exist - because bash while loops are inefficient and very slow. Use bash loops only as a last resort.
list="1 ant bat 5 cat dingo 6 emu fish 9 gecko hare 15 i j"
if
<<<"$list" tr -d -s '[0-9 ]' |
tr -s ' ' | tr ' ' 'n' |
grep -q -x '9'
then
found=true
else
found=false
fi
echo ${found}
But you should probably get somewhat faster results with a good awk.
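One possible shape for that, testing by position as the OP later clarified (a sketch only, assuming a POSIX awk and bash's <<<):
find=9
found=$(awk -v RS=' ' -v find="$find" '
    NR % 3 == 1 && $0 == find { print "true"; exit }   # every third field, exact match
' <<< "$list")
echo "${found:-false}"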
edited Jan 31 at 19:27
answered Jan 31 at 19:19
Kamil Cuk
Sorry I wasn't clear before, but I was looking for a solution that would be able to extract the values based only on their position in list. I just made the original list like that because I wanted it to be obvious the values I wanted.
– Levi Uzodike
Jan 31 at 20:02
In my opinion the clearest solution (and probably the most performant too) is to use the RS and ORS awk variables:
awk -v RS=' ' -v ORS=' ' 'NR % 3 == 1' <<< "$list"
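With the example list this should print 1 5 6 9 15 (plus a trailing separator, since ORS is appended after every printed record).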
answered Feb 2 at 16:43
user000001
Using GNU sed and POSIX shell script:
echo $(printf '%s\n' $list | sed -n '1~3p')
Or with bash's parameter substitution:
echo $(sed -n '1~3p' <<< ${list// /$'\n'})
Non-GNU (i.e. POSIX) sed, and bash:
sed 's/\([^ ]* \)[^ ]* *[^ ]* */\1/g' <<< "$list"
Or more portably, using both POSIX sed and shell script:
echo "$list" | sed 's/\([^ ]* \)[^ ]* *[^ ]* */\1/g'
Output of any of these:
1 5 6 9 15
edited Feb 4 at 4:02
answered Feb 4 at 3:25
agc
1
Levi, what exactly is the "efficient" way in your definition? You want to find a faster way to iterate?
– Sergiy Kolodyazhnyy
Feb 2 at 0:42