How to read a binary file quickly in C#? (ReadOnlySpan vs MemoryStream)



























I'm trying to parse a binary file as fast as possible, so this is what I first tried:



using (FileStream filestream = path.OpenRead()) {
    using (var d = new GZipStream(filestream, CompressionMode.Decompress)) {
        using (MemoryStream m = new MemoryStream()) {
            d.CopyTo(m);
            m.Position = 0;

            using (BinaryReaderBigEndian b = new BinaryReaderBigEndian(m)) {
                while (b.BaseStream.Position != b.BaseStream.Length) {
                    UInt32 value = b.ReadUInt32();
                }
            }
        }
    }
}


where the BinaryReaderBigEndian class is implemented as follows:



public class BinaryReaderBigEndian : BinaryReader {
    public BinaryReaderBigEndian(Stream stream) : base(stream) { }

    public override UInt32 ReadUInt32() {
        var x = base.ReadBytes(4);
        Array.Reverse(x);
        return BitConverter.ToUInt32(x, 0);
    }
}


Then I tried to get a performance improvement by using ReadOnlySpan instead of MemoryStream:



using (FileStream filestream = path.OpenRead()) {
    using (var d = new GZipStream(filestream, CompressionMode.Decompress)) {
        using (MemoryStream m = new MemoryStream()) {
            d.CopyTo(m);
            int position = 0;
            ReadOnlySpan<byte> stream = new ReadOnlySpan<byte>(m.ToArray());

            while (position != stream.Length) {
                UInt32 value = stream.ReadUInt32(position);
                position += 4;
            }
        }
    }
}


where the BinaryReaderBigEndian class changed to:



public static class BinaryReaderBigEndian {
    public static UInt32 ReadUInt32(this ReadOnlySpan<byte> stream, int start) {
        var data = stream.Slice(start, 4).ToArray();
        Array.Reverse(data);
        return BitConverter.ToUInt32(data, 0);
    }
}


But, unfortunately, I didn't notice any improvement. So, where am I going wrong?










  • where is your bottleneck? CPU, Memory, Disk access?

    – Richard Hubley
    Nov 21 '18 at 16:06











  • Nowhere! This is why I'm surprised.

    – heliosophist
    Nov 21 '18 at 16:14











  • Why are you copying the entire file into a memory stream first? That is going to eat up a significant amount of time.

    – Scott Chamberlain
    Nov 21 '18 at 18:36











  • The reasons are that it takes less overall time to read from the HDD and that it avoids page faults.

    – heliosophist
    Nov 22 '18 at 13:36


1 Answer


I did some measurements of your code on my computer (Intel Q9400, 8 GiB RAM, SSD disk, Win10 x64 Home, .NET Framework 4.7.2, tested with a 15 MB (when unpacked) file), with these results:



No-Span version: 520 ms

Span version: 720 ms



So the Span version is actually slower! Why? Because new ReadOnlySpan<byte>(m.ToArray()) performs an additional copy of the whole file, and ReadUInt32() performs many slicings of the Span (slicing is cheap, but not free). Since you performed more work, you can't expect better performance just because you used Span.



So can we do better? Yes. It turns out that the slowest part of your code is actually garbage collection, caused by the repeated allocation of 4-byte arrays created by the .ToArray() calls in the ReadUInt32() method. You can avoid it by implementing ReadUInt32() yourself. It's pretty easy and also eliminates the need for Span slicing. You can also replace new ReadOnlySpan<byte>(m.ToArray()) with new ReadOnlySpan<byte>(m.GetBuffer()).Slice(0, (int)m.Length), which performs cheap slicing instead of copying the whole file. So now the code looks like this:



public static void Read(FileInfo path)
{
    using (FileStream filestream = path.OpenRead())
    {
        using (var d = new GZipStream(filestream, CompressionMode.Decompress))
        {
            using (MemoryStream m = new MemoryStream())
            {
                d.CopyTo(m);
                int position = 0;

                ReadOnlySpan<byte> stream = new ReadOnlySpan<byte>(m.GetBuffer()).Slice(0, (int)m.Length);

                while (position != stream.Length)
                {
                    UInt32 value = stream.ReadUInt32(position);
                    position += 4;
                }
            }
        }
    }
}

public static class BinaryReaderBigEndian
{
    public static UInt32 ReadUInt32(this ReadOnlySpan<byte> stream, int start)
    {
        UInt32 res = 0;
        for (int i = 0; i < 4; i++)
        {
            res = (res << 8) | (((UInt32)stream[start + i]) & 0xff);
        }
        return res;
    }
}


With these changes I went from 720 ms down to 165 ms (4x faster). Sounds great, doesn't it? But we can do even better: we can avoid the MemoryStream copy entirely, and inline and further optimize ReadUInt32():



public static void Read(FileInfo path)
{
    using (FileStream filestream = path.OpenRead())
    {
        using (var d = new GZipStream(filestream, CompressionMode.Decompress))
        {
            var buffer = new byte[64 * 1024];

            do
            {
                int bufferDataLength = FillBuffer(d, buffer);

                if (bufferDataLength % 4 != 0)
                    throw new Exception("Stream length not divisible by 4");

                if (bufferDataLength == 0)
                    break;

                for (int i = 0; i < bufferDataLength; i += 4)
                {
                    uint value = unchecked(
                        (((uint)buffer[i]) << 24)
                        | (((uint)buffer[i + 1]) << 16)
                        | (((uint)buffer[i + 2]) << 8)
                        | (((uint)buffer[i + 3]) << 0));
                }

            } while (true);
        }
    }
}

private static int FillBuffer(Stream stream, byte[] buffer)
{
    int read = 0;
    int totalRead = 0;
    do
    {
        read = stream.Read(buffer, totalRead, buffer.Length - totalRead);
        totalRead += read;

    } while (read > 0 && totalRead < buffer.Length);

    return totalRead;
}


And now it takes less than 90 ms (8x faster than the original!). And without Span! Span is great in situations where it allows slicing and avoids array copies, but it won't improve performance just because you blindly use it. After all, Span is designed to have performance characteristics on par with Array, but not better (and only on runtimes that have special support for it, such as .NET Core 2.1).
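For reference, on runtimes that do have that support (e.g. .NET Core 2.1+, or .NET Framework with the System.Memory package), the hand-rolled shift loop can also be written with BinaryPrimitives.ReadUInt32BigEndian, which reads big-endian values directly from a span without allocating. This is only a minimal sketch (the helper name ReadAllUInt32 is made up, and it assumes the decompressed bytes are already in a byte[]); it was not part of the measurements above:

using System;
using System.Buffers.Binary;

public static class BigEndianSketch
{
    // Reads consecutive big-endian UInt32 values from the first 'length' bytes of 'buffer'.
    // Assumes 'length' is a multiple of 4, as in the answer above.
    public static void ReadAllUInt32(byte[] buffer, int length)
    {
        ReadOnlySpan<byte> span = buffer.AsSpan(0, length);
        for (int i = 0; i < span.Length; i += 4)
        {
            // No temporary array, no manual shifting: the framework handles the endianness.
            uint value = BinaryPrimitives.ReadUInt32BigEndian(span.Slice(i, 4));
        }
    }
}

Whether this beats the manual shift loop depends on the runtime; on .NET Framework without fast-span support it may not.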






  • Thank you for your answer. I've tested both your solutions, but I could only get a speedup of about 3x. I'm creating a large buffer that contains the whole uncompressed file. Most of the time is taken by your FillBuffer function. Is there a way to get a better FillBuffer and ReadBytes (I need a function that works like stream.ReadBytes() or a span slice), or maybe a better library to decompress GZip files?

    – heliosophist
    Nov 23 '18 at 18:44











  • I tested on a relatively small file, which was probably cached in RAM by Windows (due to my repeated tests), so reading it was blazingly fast. If your file is big and not cached, the program is likely to spend more time reading data from disk and thus be slower. You can try opening the file with one of the FileStream() constructor overloads that let you specify advanced options like buffer size, sequential file access, etc. (see the sketch after these comments). You can also try experimenting with the size of my buffer, var buffer = new byte[64 * 1024]; - a larger buffer means fewer calls to the OS...

    – Ňuf
    Nov 24 '18 at 0:32











  • ...on the other hand, a smaller size increases the chance that the buffer will completely fit into the L1/L2 cache. I intentionally tried to avoid ReadBytes(), because of the problems with slow garbage collection. I also noticed that (for no obvious reason) my program was ~100 ms slower when targeting .NET Framework 4.7.1 (instead of 4.7.2). I never used any 3rd-party GZip library, so you'll have to ask Google for help :)

    – Ňuf
    Nov 24 '18 at 0:32











  • Ok, still thank you for your help. Do you know if there is a way to avoid the garbage collection in my first version of the code, the one that uses MemoryStream?

    – heliosophist
    Nov 25 '18 at 20:48











  • As long as it calls base.ReadBytes(4), there probably isn't much you can do about it. You could try to postpone GC, use server/background GC mode, ...

    – Ňuf
    Nov 26 '18 at 18:11
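To make the two suggestions in the comments above concrete (the FileStream constructor overload with explicit options, and postponing GC), here is a minimal sketch. The path string, buffer sizes, and the 16 MB no-GC budget are placeholders rather than values from the thread; FileOptions.SequentialScan and GC.TryStartNoGCRegion are standard .NET APIs (the latter requires .NET Framework 4.6+ or .NET Core):

using System;
using System.IO;
using System.IO.Compression;
using System.Runtime;

public static class SequentialReadSketch
{
    public static void Read(string path)
    {
        // A larger FileStream buffer plus SequentialScan hints the OS about the access pattern.
        using (var filestream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            bufferSize: 1 << 16, options: FileOptions.SequentialScan))
        using (var d = new GZipStream(filestream, CompressionMode.Decompress))
        {
            // Best-effort: ask the runtime to postpone GC while parsing.
            // The 16 MB budget is a placeholder; the call may return false.
            bool noGc = GC.TryStartNoGCRegion(16 * 1024 * 1024);
            try
            {
                var buffer = new byte[64 * 1024];
                int read;
                while ((read = d.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // parse 'read' bytes from 'buffer' as in the answer above
                }
            }
            finally
            {
                // Only end the region if we are still in it (it ends on its own if the budget is exceeded).
                if (noGc && GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                    GC.EndNoGCRegion();
            }
        }
    }
}

Whether any of this helps depends on the machine, the file size, and whether the file is already in the OS cache, as noted in the comments.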










