Redundant mov operations when passing a variable by reference
Is there any reason why modern C++ compilers can't eliminate a redundant mov instruction when you modify a variable passed by reference?
Slow: https://gcc.godbolt.org/z/2Bmidk
Redundant mov:
10: mov QWORD PTR [rdi], rdx
Fast: https://gcc.godbolt.org/z/u3GMLx
Why doesn't the compiler just keep the begin_ variable in a CPU register and write it back to memory at the end of the function?
c++ gcc assembly optimization compiler-optimization
2
That's the calling convention.
– Matthieu Brucher
Jan 1 at 20:19
1
@MatthieuBrucher, but there are no calls involved? (Everything is inlined.)
– RiaD
Jan 1 at 20:24
5
This is not an easy optimization: the compiler would have to prove that the qword at rdi and any of the bytes read from the charstream never overlap
– harold
Jan 1 at 20:40
1
@harold Nice catch, you should make an answer out of it.
– Sebastian Redl
Jan 1 at 20:41
@harold Seems like you're right: gcc.godbolt.org/z/_wm_zO
– yarrr
Jan 1 at 20:49
edited Jan 1 at 20:34
asked Jan 1 at 20:17 by yarrr
1 Answer
It seems that this optimisation may be invalid. What if begin_ points at this, i.e. at the address of the CharStream itself (and it's valid to read the bytes of any object through a char*)? In that case the CharStream changes after the first read, and so may the values read from the range [begin; end).
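The overlap can be made concrete with a small sketch. The code behind the godbolt links is not reproduced here, so the CharStream layout, the Holder wrapper and the helper functions below are all hypothetical stand-ins. The sketch assumes a typical target where the stream member sits at offset 8 inside Holder. The stream iterates over the bytes of the object that contains it, so some of the bytes it reads are the bytes of begin_ itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's CharStream.
struct CharStream {
    const char* begin_;
    const char* end_;

    char next() {
        assert(begin_ != end_);
        char c = *begin_;  // read one byte first...
        ++begin_;          // ...then store the advanced pointer
        return c;
    }
};

// Eight bytes of ordinary data followed by the stream itself, so a char
// pointer walking over a Holder reaches the bytes of stream.begin_.
struct Holder {
    char prefix[8];
    CharStream stream;
};
static_assert(sizeof(Holder) >= 16, "the demo reads 16 bytes of Holder");

// Makes the stream iterate over the bytes of the Holder that contains it.
void point_at_self(Holder& h) {
    h.stream.begin_ = reinterpret_cast<const char*>(&h);
    h.stream.end_ = h.stream.begin_ + sizeof(Holder);
}

// Reads through the reference: begin_ is updated in memory on every read.
std::vector<unsigned char> read_live(CharStream& stream, std::size_t n) {
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(static_cast<unsigned char>(stream.next()));
    return out;
}

// What the requested optimisation would amount to: begin_ lives in a local
// and is written back once, after the loop.
std::vector<unsigned char> read_cached(CharStream& stream, std::size_t n) {
    const char* p = stream.begin_;
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(static_cast<unsigned char>(*p++));
    stream.begin_ = p;
    return out;
}

// True when the two strategies observe different bytes on the same object.
bool results_differ() {
    Holder h{};
    point_at_self(h);
    const std::vector<unsigned char> live = read_live(h.stream, 16);
    point_at_self(h);  // rewind to the start of h
    const std::vector<unsigned char> cached = read_cached(h.stream, 16);
    return live != cached;
}
```

Under those assumptions the two readers disagree as soon as they reach the bytes of begin_: read_live sees the pointer as it has been advanced so far, while read_cached sees the stale value still in memory. Since the compiler cannot rule this situation out for an arbitrary CharStream&, it has to keep the store inside the loop.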
To avoid this you may do one of the following:
- accept CharStream by value (so that its address is unique and doesn't coincide with any char*): https://gcc.godbolt.org/z/QfOUwW (note the change in behaviour: you'll need to return the stream if you need the modifications)
- use another type instead of char so that it can't alias with CharStream: https://gcc.godbolt.org/z/2_gREf (beware, it might be undefined behaviour to read your data using Byte* instead of char* because it's some_other_type* originally)
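A minimal sketch of the first option, assuming a hypothetical CharStream and a placeholder hash (multiply by 31 and add), since the real code is only available behind the links. The function name get_hash_by_value and the pair return type are illustrative choices, not the original signature.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the question's CharStream.
struct CharStream {
    const char* begin_;
    const char* end_;

    char next() {
        assert(begin_ != end_);
        char c = *begin_;
        ++begin_;
        return c;
    }
};

// The stream is a private copy, so no char pointer held by the caller can
// refer to it and begin_ is free to stay in a register during the loop.
// The advanced stream is returned because the caller's object is untouched.
std::pair<std::size_t, CharStream> get_hash_by_value(CharStream stream,
                                                     std::size_t length) {
    std::size_t hash = 0;
    for (std::size_t i = 0; i < length; ++i)
        hash = hash * 31 + static_cast<unsigned char>(stream.next());
    return {hash, stream};
}
```

A caller that needs the advanced position would assign the returned stream back to its own object.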
2
Note that passing by value changes the behavior of the code -- it no longer modifies the original CharStream passed to get_hash
– Chris Dodd
Jan 1 at 21:48
@ChrisDodd, that's a good point, thanks
– RiaD
Jan 1 at 21:50
3
Taking a local copy also works
– harold
Jan 1 at 21:59
2
How about using size_t get_hash(CharStream & __restrict__ stream, size_t length)? Rather than making a copy, it assures the compiler that this pointer isn't being used somewhere else, allowing additional optimization.
– David Wohlferd
Jan 2 at 6:17
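The last two suggestions could look roughly like this. As before, CharStream and the hash body are hypothetical placeholders, and the function names are made up for illustration. __restrict__ is a GCC/Clang extension rather than standard C++, and it is a promise made by the programmer: calling the second function with a stream whose bytes are also read through begin_ would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's CharStream.
struct CharStream {
    const char* begin_;
    const char* end_;

    char next() {
        assert(begin_ != end_);
        char c = *begin_;
        ++begin_;
        return c;
    }
};

// Local copy: the loop works on an object whose address is never exposed,
// and the caller's stream is updated once at the end.
std::size_t get_hash_local_copy(CharStream& stream, std::size_t length) {
    CharStream local = stream;
    std::size_t hash = 0;
    for (std::size_t i = 0; i < length; ++i)
        hash = hash * 31 + static_cast<unsigned char>(local.next());
    stream = local;
    return hash;
}

// __restrict__: tells the compiler that nothing else accessed in this
// function overlaps the stream object.
std::size_t get_hash_restrict(CharStream& __restrict__ stream,
                              std::size_t length) {
    std::size_t hash = 0;
    for (std::size_t i = 0; i < length; ++i)
        hash = hash * 31 + static_cast<unsigned char>(stream.next());
    return hash;
}
```

Both variants keep the by-reference interface, so the caller still sees the advanced stream. Whether the store actually disappears from the loop depends on the compiler and flags, so it is worth checking the generated assembly.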
edited Jan 1 at 21:51
answered Jan 1 at 21:14 by RiaD