Does “seek” write zero bytes when used on a newly created binary file?












I noticed that when I do something like this:

    with open('testfile', 'wb') as fl:
        fl.seek(2048*512)
        fl.write(b'aaaaa')

Regardless of my Python version, the hexdump of the resulting file shows zero bytes in the first portion of testfile:

    $ hexdump -C testfile
    00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    *
    00100000 61 61 61 61 61 |aaaaa|
    00100005

Is that guaranteed, intended behavior that can be counted on across operating systems? If so, where is that fact documented?










python file-io






edited Nov 20 '18 at 1:43 by martineau

asked Nov 20 '18 at 0:55 by con-f-use













  • I can't find its behavior in this scenario documented anywhere, so I wouldn't count on it.
    – martineau, Nov 20 '18 at 1:57

  • "A write operation increases the size of the file to the file pointer position plus the size of the buffer written, which results in the intervening bytes uninitialized." to quote MSDN for SetFilePointer (which is probably used on Windows to implement seek). docs.microsoft.com/en-us/windows/desktop/api/fileapi/…
    – C. Yduqoli, Nov 20 '18 at 2:03
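If you would rather not depend on what the platform puts in the gap, as the comments above suggest, one option is to write the zero bytes yourself. The following is only a minimal sketch: `write_at_offset` and its `chunk_size` parameter are hypothetical names made up for this illustration, and the file is created in a temporary directory. Note that an explicitly filled file is not sparse, so it occupies its full size on disk.

```python
import os
import tempfile


def write_at_offset(path, offset, payload, chunk_size=1 << 16):
    """Write payload at offset, explicitly zero-filling everything before it.

    Hypothetical helper for illustration; it does not rely on the OS
    filling the gap left by seek().
    """
    zeros = bytes(chunk_size)  # a reusable buffer of zero bytes
    with open(path, 'wb') as fl:
        remaining = offset
        while remaining > 0:
            n = min(remaining, chunk_size)
            fl.write(zeros[:n])
            remaining -= n
        fl.write(payload)


path = os.path.join(tempfile.mkdtemp(), 'testfile')
write_at_offset(path, 2048 * 512, b'aaaaa')

# Read the file back and check that the prefix really is all zeros.
with open(path, 'rb') as fl:
    data = fl.read()

print(len(data), data[:2048 * 512] == bytes(2048 * 512), data[-5:])
```

The trade-off is disk space and write time in exchange for not depending on filesystem behavior.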





















2 Answers
Is [zeroed out areas] intended behavior that can be counted on across operating systems? If so, where is that fact documented?

Across operating systems? Probably not: if Python is ported to some ancient OS, those writes might either produce no gap at all (the seek being ignored) or fail. However, modern systems all support sparse files, or at least emulate them well enough that the code above works and behaves that way. If someone doing such a port cares, they could add an emulation layer.

You're probably safe relying on this. Just don't assume that the holes will stay holes: some systems may fill them in when restoring from backup, migrating files across a cluster, and so on. If you seek a few terabytes in and write one byte, the file may occupy very little disk space until that fill-in happens, if it ever does.







answered Nov 20 '18 at 1:23 by torek
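The sparse-file point in the answer above can be checked with a short script. This is a sketch under assumptions, not a guarantee: `st_blocks` is only available on Unix-like platforms (hence the `getattr` fallback), and whether the gap is stored as a hole depends on the filesystem. For the documentation question, the POSIX description of lseek() states that reads of data in such a gap return bytes with value 0 until data is actually written there; other platforms may specify this differently.

```python
import os
import tempfile

offset = 2048 * 512
path = os.path.join(tempfile.mkdtemp(), 'testfile')

# Same pattern as in the question: seek past the end, then write.
with open(path, 'wb') as fl:
    fl.seek(offset)
    fl.write(b'aaaaa')

st = os.stat(path)
logical_size = st.st_size

# st_blocks counts 512-byte blocks and exists only on Unix-like systems;
# fall back to None where it is missing (for example on Windows).
blocks = getattr(st, 'st_blocks', None)
allocated = blocks * 512 if blocks is not None else None

# Read the gap and the payload back separately.
with open(path, 'rb') as fl:
    gap = fl.read(offset)
    tail = fl.read()

print('logical size:', logical_size)
print('allocated bytes:', allocated)
print('gap is all zeros:', gap == bytes(offset))
print('payload:', tail)
```

On a filesystem with sparse-file support, the allocated byte count is typically much smaller than the logical size; after a backup restore or a copy it may no longer be.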

























A binary file has a pointer that indicates the file position at which the next read or write operation will take place. The position is counted in bytes. Read and write operations always move the pointer to the end of whatever was just read or written.

Seeking to a position past the end of the file does not change the file by itself. When you then write at that position, the file size is increased as needed, and the bytes in between read back as 0. In the code above, you create a new, empty file and use seek to move the file pointer forward before writing (in Python there are no pointers, so this is the stream position). The write that follows is what leaves the leading 0s.

In binary files, the current stream position is the byte offset from the start of the file. If you move the stream position past the end of the file and write there, all positions between the old end of the file and the written data will read as 0.

Does this hold across operating systems? Probably yes for all current or LTS releases. It may not hold on some legacy systems.

Stream position documentation: https://docs.python.org/3/library/io.html#binary-i-o







answered Nov 20 '18 at 1:34 by Pruthvi Kumar
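To make the stream-position mechanics above concrete, here is a small sketch that records the file size after the seek and again after the write. It uses a temporary directory, and the variable names are chosen only for this illustration. In this sketch the seek moves the position without changing the size; the file grows only once data is written.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'testfile')

with open(path, 'wb') as fl:
    # seek() returns the new absolute stream position.
    new_pos = fl.seek(2048 * 512)
    fl.flush()
    # Size on disk right after the seek, before anything is written.
    size_after_seek = os.fstat(fl.fileno()).st_size

    # write() returns the number of bytes written.
    written = fl.write(b'aaaaa')
    fl.flush()
    size_after_write = os.fstat(fl.fileno()).st_size
    pos_after_write = fl.tell()

print('position after seek:', new_pos)
print('size after seek:', size_after_seek)
print('bytes written:', written)
print('size after write:', size_after_write)
print('position after write:', pos_after_write)
```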





























