Efficiently Generating SHA256 Checksum For Files Using C#

* I just added another post on a similar topic, with more details on performance and a case where buffered memory usage really did matter: http://peterkellner.net/2010/12/03/using-sha256managed-to-generate-sha256-hash/

I’m building an application that synchronizes files with the cloud, and I want to store a checksum with each file so that I can verify later that the file is what I think it is. I’m not a crypto guy, so I searched around the internet for a solution. I found lots of examples and settled on one that uses SHA256 for the job. I also found some comments saying that it would be more efficient to wrap the file stream in a BufferedStream rather than processing the entire file at once.
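To make the "verify later" step concrete, here is a minimal sketch of storing a checksum and checking a file against it. The class and method names (ChecksumVerifier, ComputeSha256Hex, Matches) and the temp file are made up for illustration, and it uses SHA256.Create() instead of SHA256Managed, so treat it as one possible shape rather than the code from this test.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, not part of the original test program.
internal static class ChecksumVerifier
{
    // Computes the SHA-256 hash of a file and returns it as uppercase hex.
    public static string ComputeSha256Hex(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }

    // Returns true when the file's current hash matches the stored hex string.
    public static bool Matches(string path, string storedHex)
    {
        string actual = ComputeSha256Hex(path);
        return string.Equals(actual, storedHex, StringComparison.OrdinalIgnoreCase);
    }

    private static void Main()
    {
        // A throwaway temp file stands in for a synchronized file.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello");

            // In a real application this value would be stored with the file.
            string stored = ComputeSha256Hex(path);
            Console.WriteLine(Matches(path, stored)); // file unchanged

            File.AppendAllText(path, "!");
            Console.WriteLine(Matches(path, stored)); // file modified
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The first check should report a match and the second should not, since a single appended character changes the hash.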

Example Links: http://efreedom.com/Question/1-1345851/MD5-File-Processing ; http://stackoverflow.com/questions/1177607/what-is-the-fastest-way-to-create-a-checksum-for-large-files-in-c/1177744#1177744

My intention for this post was to show how much more efficient it is to use BufferedStream; however, my results don’t show that. I’m guessing that the efficiency is somehow happening under the covers, in a place I don’t see.
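One way to see what "under the covers" might look like is to do the chunked reading by hand. The sketch below feeds a stream into the hash one buffer at a time with TransformBlock and TransformFinalBlock, so only bufferSize bytes are held by the loop at once. As far as I know, ComputeHash(Stream) does something similar internally with a small fixed buffer, which would explain why an extra BufferedStream changes little. ChunkedHasher and its parameters are hypothetical names, and SHA256.Create() is used in place of SHA256Managed.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper that makes the read buffer explicit.
internal static class ChunkedHasher
{
    // Hashes the stream in chunks of bufferSize bytes and returns uppercase hex.
    public static string ComputeSha256Hex(Stream input, int bufferSize)
    {
        if (bufferSize <= 0)
        {
            throw new ArgumentOutOfRangeException("bufferSize");
        }

        byte[] buffer = new byte[bufferSize];
        using (SHA256 sha = SHA256.Create())
        {
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Feed this chunk into the running hash state.
                sha.TransformBlock(buffer, 0, bytesRead, null, 0);
            }

            // An empty final block completes the hash.
            sha.TransformFinalBlock(buffer, 0, 0);
            return BitConverter.ToString(sha.Hash).Replace("-", string.Empty);
        }
    }

    private static void Main()
    {
        // Sample data in memory; a FileStream would work the same way.
        byte[] data = new byte[100000];
        new Random(42).NextBytes(data);

        string chunked;
        using (var stream = new MemoryStream(data))
        {
            chunked = ComputeSha256Hex(stream, 32 * 1024);
        }

        string oneShot;
        using (SHA256 sha = SHA256.Create())
        {
            oneShot = BitConverter.ToString(sha.ComputeHash(data)).Replace("-", string.Empty);
        }

        // The chunked and one-shot hashes should be identical.
        Console.WriteLine(chunked == oneShot);
    }
}
```

The result does not depend on the buffer size; only the number of Read calls changes.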

If anyone knows this space well, please feel free to comment and suggest a better method. I’m publishing my results and the source for the test below; the results were surprisingly similar whether I used buffering or not.

Looking forward to the responses.

 

File Size                  | Non-Buffered  | Buffered      | Max Memory
0.8 MB                     | 0.088 seconds | 0.082 seconds |
851 MB (buffer not set)    | 30.3 seconds  | 30.4 seconds  | 1700 MB
851 MB (buffer 1 MB)       | 29.2 seconds  | 29.6 seconds  | 1450 MB

 

using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

namespace ConsoleApplicationMD5test
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            //const string fileName = @"g:\tempjunk\BlobSyncClient.zip";
            const string fileName =
                @"g:\msdn\en_visual_studio_2008_service_pack_1_x86_dvd_x15-12962.iso";

            // Non-buffered run. The call is commented out here, so only the
            // buffered variant runs; swap the comments to time this one.
            var stopwatch1 = new Stopwatch();
            stopwatch1.Start();
            string str1 = ""; // GetChecksum(fileName);
            stopwatch1.Stop();

            // Buffered run.
            var stopwatch2 = new Stopwatch();
            stopwatch2.Start();
            string str2;
            // FileMode.Open rather than OpenOrCreate: the file is only read,
            // so it must already exist.
            using (var fileStream = new FileStream(fileName, FileMode.Open,
                FileAccess.Read))
            {
                str2 = GetChecksumBuffered(fileStream);
            }
            stopwatch2.Stop();

            Console.WriteLine(str1 + " " + stopwatch1.ElapsedMilliseconds);
            Console.WriteLine(str2 + " " + stopwatch2.ElapsedMilliseconds);

            Console.ReadLine();
        }

        // Hashes the file straight from a FileStream.
        private static string GetChecksum(string file)
        {
            using (FileStream stream = File.OpenRead(file))
            using (var sha = new SHA256Managed())
            {
                byte[] checksum = sha.ComputeHash(stream);
                return BitConverter.ToString(checksum).Replace("-", String.Empty);
            }
        }

        // Hashes the stream through a BufferedStream with a 32 KB buffer.
        // Disposing the BufferedStream also closes the underlying stream.
        private static string GetChecksumBuffered(Stream stream)
        {
            using (var bufferedStream = new BufferedStream(stream, 1024 * 32))
            using (var sha = new SHA256Managed())
            {
                byte[] checksum = sha.ComputeHash(bufferedStream);
                return BitConverter.ToString(checksum).Replace("-", String.Empty);
            }
        }
    }
}
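FileStream keeps its own internal buffer (4 KB by default, as far as I know), and one of its constructors takes a bufferSize argument. That makes it possible to try different buffer sizes without BufferedStream at all. The following is a minimal sketch of such a comparison; BufferSizeTimer, HashWithBufferSize, and the three sizes are arbitrary choices for illustration, and SHA256.Create() stands in for SHA256Managed.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

// Hypothetical timing harness, not the program used for the results above.
internal static class BufferSizeTimer
{
    // Hashes a file using a FileStream opened with the given internal buffer size.
    public static string HashWithBufferSize(string path, int bufferSize)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, bufferSize))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }

    private static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Pass the path of the file to hash as the first argument.");
            return;
        }

        string path = args[0];
        int[] bufferSizes = { 4 * 1024, 32 * 1024, 1024 * 1024 };

        foreach (int size in bufferSizes)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            string hash = HashWithBufferSize(path, size);
            stopwatch.Stop();

            Console.WriteLine("{0,8} bytes: {1} ms  {2}",
                size, stopwatch.ElapsedMilliseconds, hash);
        }
    }
}
```

One caveat when reading the timings: the first pass over a file may warm the operating system's file cache, so later passes can look faster for reasons unrelated to the buffer size.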

About Peter Kellner

Peter is a software professional specializing in mobile and web technologies. He has also been a Microsoft MVP for the past 7 years.


Comments

  1. Peter,
    Brad Abrams explained in a post back in 2004 (http://blogs.msdn.com/b/brada/archive/2004/04/15/114329.aspx) why there’s no benefit in wrapping a FileStream in a BufferedStream: to give better performance by default, almost all the built-in .NET streams already have buffering logic incorporated directly.

