Redundant mov operations when passing a variable by reference



























Is there any reason why modern C++ compilers can't optimize away a redundant mov instruction when you modify a variable passed by reference?



Slow: https://gcc.godbolt.org/z/2Bmidk



Redundant mov:



        mov     QWORD PTR [rdi], rdx


Fast: https://gcc.godbolt.org/z/u3GMLx



Why doesn't the compiler just keep the begin_ variable in a CPU register and write it to memory at the end of the function?






























  • That's the calling convention.
    – Matthieu Brucher, Jan 1 at 20:19

  • @MatthieuBrucher, but there are no calls involved? (Everything is inlined.)
    – RiaD, Jan 1 at 20:24

  • This is not an easy optimization: the compiler would have to prove that the qword at rdi and the bytes read from the charstream never overlap.
    – harold, Jan 1 at 20:40

  • @harold Nice catch, you should make an answer out of it.
    – Sebastian Redl, Jan 1 at 20:41

  • @harold Seems like you're right: gcc.godbolt.org/z/_wm_zO
    – yarrr, Jan 1 at 20:49





































c++ gcc assembly optimization compiler-optimization














edited Jan 1 at 20:34 by yarrr
asked Jan 1 at 20:17 by yarrr




















1 Answer
































It seems that this optimisation may be invalid. What if begin_ equals this, i.e. it points at the CharStream object itself? (It is valid to read the bytes of any object through a char*.) In that case the first read modifies the CharStream, and with it the bytes in the range [begin; end) that the following reads return, so the store has to reach memory before the next read.

To avoid this, you can do one of the following:

  • accept CharStream by value (so that its address is unique and doesn't coincide with any char*): https://gcc.godbolt.org/z/QfOUwW (note the change in behaviour: you'll need to return the stream if you need the modifications)

  • use another type instead of char so that it can't alias with CharStream: https://gcc.godbolt.org/z/2_gREf (beware: it might be undefined behaviour to read your data through Byte* instead of char* if it was originally some_other_type*)
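The aliasing scenario can be made concrete with a small sketch. The code behind the links is not reproduced here, so the CharStream layout below (two const char* members and a read() method) is an assumption; only the names CharStream and begin_ come from the question. The helper read_changes_own_range is hypothetical and exists only for illustration. It points begin_ at the stream object itself and checks that a single read changes bytes inside the range being read:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the CharStream in the linked code (layout assumed).
struct CharStream {
    const char* begin_;
    const char* end_;

    // Read one byte and advance. The increment is a store to this->begin_.
    char read() {
        assert(begin_ != end_);
        char c = *begin_;
        ++begin_;
        return c;
    }
};

// Hypothetical helper: returns true if one read() changed bytes inside
// [begin, end) when that range is the stream object itself.
bool read_changes_own_range() {
    CharStream s{};
    // A char* may legally view the bytes of any object, including s.
    s.begin_ = reinterpret_cast<const char*>(&s);
    s.end_ = s.begin_ + sizeof(CharStream);

    // Snapshot the object representation before reading.
    unsigned char before[sizeof(CharStream)];
    std::memcpy(before, &s, sizeof(CharStream));

    s.read();  // stores the incremented begin_ into the range being read

    // The stored pointer value differs, so the bytes of the range differ too.
    return std::memcmp(before, &s, sizeof(CharStream)) != 0;
}
```

Because such a call is legal, a compiler that only sees a CharStream& and a const char* generally cannot delay the store to begin_ past the next byte load unless it can prove the two never overlap.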



























  • Note that passing by value changes the behavior of the code: it no longer modifies the original CharStream passed to get_hash.
    – Chris Dodd, Jan 1 at 21:48

  • @ChrisDodd, that's a good point, thanks.
    – RiaD, Jan 1 at 21:50

  • Taking a local copy also works.
    – harold, Jan 1 at 21:59

  • How about using size_t get_hash(CharStream & __restrict__ stream, size_t length)? Rather than making a copy, it assures the compiler that this pointer isn't being used somewhere else, allowing additional optimization.
    – David Wohlferd, Jan 2 at 6:17
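The two suggestions from the comments (a local copy, and __restrict__) might look roughly like the sketch below. Everything except the names CharStream, begin_, get_hash and the __restrict__ signature is an assumption: the struct layout, the mix() hash step, the function names get_hash_local_copy and get_hash_restrict, and the HYP_RESTRICT macro are placeholders, since the original code is only available through the links. Whether a given compiler actually drops the per-iteration store is best checked in the generated assembly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the CharStream in the linked code (layout assumed).
struct CharStream {
    const char* begin_;
    const char* end_;

    unsigned char read() {
        unsigned char c = static_cast<unsigned char>(*begin_);
        ++begin_;
        return c;
    }
};

// Placeholder hash step; the real one lives in the linked code.
inline std::size_t mix(std::size_t h, unsigned char c) { return h * 31 + c; }

// By reference: the compiler may have to store begin_ after every read,
// because the bytes being read could overlap the stream object.
std::size_t get_hash(CharStream& stream, std::size_t length) {
    assert(length <= static_cast<std::size_t>(stream.end_ - stream.begin_));
    std::size_t h = 0;
    for (std::size_t i = 0; i < length; ++i) h = mix(h, stream.read());
    return h;
}

// Local copy: the loop works on a local object, then writes the state
// back once so the caller still sees the advanced stream.
std::size_t get_hash_local_copy(CharStream& stream, std::size_t length) {
    assert(length <= static_cast<std::size_t>(stream.end_ - stream.begin_));
    CharStream local = stream;
    std::size_t h = 0;
    for (std::size_t i = 0; i < length; ++i) h = mix(h, local.read());
    stream = local;  // single write-back at the end
    return h;
}

// __restrict__ is a GCC/Clang extension; fall back to nothing elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#define HYP_RESTRICT __restrict__
#else
#define HYP_RESTRICT
#endif

// Restrict-qualified reference: the caller promises that nothing else
// accessed in this function aliases the stream object.
std::size_t get_hash_restrict(CharStream& HYP_RESTRICT stream, std::size_t length) {
    assert(length <= static_cast<std::size_t>(stream.end_ - stream.begin_));
    std::size_t h = 0;
    for (std::size_t i = 0; i < length; ++i) h = mix(h, stream.read());
    return h;
}
```

All three variants compute the same hash and leave the stream advanced by length bytes for ordinary, non-overlapping input. The local-copy and restrict versions simply give up on the self-aliasing case, which is the caller's responsibility to avoid.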




















edited Jan 1 at 21:51
answered Jan 1 at 21:14 by RiaD



























