Windows Vista Forums

GZip Compression :(
  1. #1


    Carlo Razzeto Guest

    GZip Compression :(

    Hello there,

    I'm having an odd issue with GZip compression (having followed example code
    found on MSDN). After running through the compression routine I end up with
    a byte array several times larger than the source text file, full of zero
    bytes. Below is the code used to do the compression. It's part of a web
    service that retrieves a file, with an option to compress the data before
    base64-encoding it. In the following code, all undeclared variables are
    properties: Compress represents a compress attribute specified in the XML
    request, and FileName is a relative path to the file on the server, inside
    the web root.



    Response.ContentType = "text/xml"
    If Not File.Exists(Server.MapPath(FileName)) Then
        Throw New GetBinaryFileException(FileName, _
            GetBinaryFileException.GetBinaryFileError.FileNotFound)
    End If

    Dim FileData() As Byte = Nothing
    Dim FStream As New FileStream(Server.MapPath(FileName), _
        FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
    If Compress Then
        Dim TempData(FStream.Length - 1) As Byte
        FStream.Read(TempData, 0, FStream.Length)
        Dim MStream As New MemoryStream
        Dim Compressor As New GZipStream(MStream, _
            CompressionMode.Compress, True)
        Compressor.Write(TempData, 0, TempData.Length)

        ReDim FileData(MStream.Length - 1)
        Dim BytesRead As Integer = MStream.Read(FileData, 0, _
            MStream.Length)
        MStream.Close()
        MStream.Dispose()
        Compressor.Close()
        Compressor.Dispose()
    Else
        ReDim FileData(FStream.Length - 1)
        FStream.Read(FileData, 0, FStream.Length)
    End If
    FStream.Close()
    FStream.Dispose()

    Dim Base64 As String = Convert.ToBase64String(FileData)

    Dim FileDataNode As XmlNode = _
        XmlExchangeLib.GetOrSetXmlNode("FileData", Root)
    XmlExchangeLib.AddAttributeWithValue(FileDataNode, "Compressed", _
        Compress.ToString().ToLower())
    FileDataNode.InnerText = Base64
    XmlResponse.Save(Response.OutputStream)



  2. #2


    Marc Gravell Guest

    Re: GZip Compression :(

    First, you need to make sure that you close the zip-stream (compressor)
    before looking at the memory-stream - it won't have finished writing yet;
    second, you then either need to rewind the memory stream, or just use
    ToArray() to get the full contents.
    Third - Read (on the file stream) is not strictly guaranteed to get
    everything - and even if it did it isn't very efficient. But
    File.ReadAllBytes would be a more reliable way of reading the entire file at
    once.

    You might also be allocating the FileData array one too short - I'm not sure
    (VB...)

    Marc
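
    A minimal C# sketch of the three points above, not a drop-in replacement
    for the original routine. The class and method names (GZipSketch,
    CompressFile, Compress) are made up for illustration; only the
    System.IO and System.IO.Compression calls are real. It reads the file
    with File.ReadAllBytes, disposes the GZipStream before touching the
    MemoryStream, and uses ToArray() so the stream position does not matter.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipSketch
{
    // Hypothetical helper: returns the gzip-compressed contents of a file.
    public static byte[] CompressFile(string path)
    {
        // ReadAllBytes keeps reading until the whole file is in memory,
        // unlike a single Stream.Read call, which may return fewer bytes.
        byte[] raw = File.ReadAllBytes(path);
        return Compress(raw);
    }

    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream dest = new MemoryStream())
        {
            // leaveOpen: true keeps dest open after the GZipStream is disposed.
            using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
            {
                zip.Write(raw, 0, raw.Length);
            } // disposing here flushes the buffered data and gzip trailer into dest

            // ToArray copies everything written so far, regardless of Position,
            // so no rewind (Position = 0) is needed.
            return dest.ToArray();
        }
    }
}
```

    The inner using block is what matters: if ToArray() were called before
    the GZipStream is closed, the result could be incomplete.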




  3. #3


    Carlo Razzeto Guest

    Re: GZip Compression :(

    The last point is the one I know for a fact is fine: in VB an array
    declaration takes the upper bound, so you declare it with length - 1. But
    I'll take a look at the rest of the points. Thanks very much for your
    thoughts, most helpful.

  4. #4


    Carlo Razzeto Guest

    Re: GZip Compression :(

    Thanks for the advice. I switched to auto-closing the zip stream and using
    ToArray on the memory stream, and it now seems to be pulling bytes. My only
    remaining concern is that I'm getting back a byte array much larger than my
    original 26-byte text file.


  5. #5


    Family Tree Mike Guest

    Re: GZip Compression :(

    Just to make sure...

    You are talking about "before" the step of going to base64, correct? The
    base64 step will bloat the data by a factor of about 1.33 (4 characters
    for every 3 bytes), if I recall correctly.
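
    As a rough check on the base64 size, assuming the service keeps calling
    Convert.ToBase64String with no line breaks: the encoder emits 4
    characters for every group of 3 input bytes and pads the last group, so
    the output length can be computed up front. Base64Size and
    EncodedLength are illustrative names, not part of any library.

```csharp
using System;

public static class Base64Size
{
    // Number of characters Convert.ToBase64String produces for n input bytes
    // (no line breaks): 4 characters per 3-byte group, last group padded.
    public static int EncodedLength(int n)
    {
        return 4 * ((n + 2) / 3);
    }
}
```

    For example, EncodedLength(26) is 36 and EncodedLength(132) is 176, so
    base64 alone does not account for a 26-byte file growing to 132 bytes.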




  6. #6


    Carlo Razzeto Guest

    Re: GZip Compression :(

    Raw byte array size (prior to conversion to base64 string). I read in 26
    bytes and typically get back 132 bytes worth of "compressed" data.


  7. #7


    Marc Gravell Guest

    Re: GZip Compression :(

    I wouldn't bother compressing 26 bytes... gzip itself has header overhead
    etc. This also isn't enough space to actually get many useful compression
    opportunities. Finally, it depends on what the data is: if it is fairly
    random (a complex image, a security token, etc) then it simply won't
    compress.

    Marc




  8. #8


    Marc Gravell Guest

    Re: GZip Compression :(

    Demo; outputs "125"; compression just isn't going to help you with very
    short inputs:

    using(MemoryStream dest = new MemoryStream()) {
        using(GZipStream zip = new GZipStream(dest,
                CompressionMode.Compress, true))
        using(StreamWriter writer = new StreamWriter(zip)) {
            writer.Write("Hi hi hi");
            writer.Close();
            zip.Close();
        }
        Console.WriteLine(dest.Length);
    }

    Marc




  9. #9


    Carlo Razzeto Guest

    Re: GZip Compression :(

    Ah, yeah, I hadn't been considering the compression headers. Thanks for
    reminding me of that; it makes sense now. In real use this code isn't going
    to be compressing files this small, more like PDF files from several KB up
    to a MB or two, so it should be fine. Thanks,

    Carlo


  10. #10


    Marc Gravell Guest

    Re: GZip Compression :(

    One approach would be to use the first byte to indicate whether compression
    is on (and which kind), i.e. 0x00 = none, 0x01 = gzip, etc. I use this trick
    quite happily: pick a cutoff under which you won't even bother trying to
    compress; otherwise try compressing and see if it got shorter (even some
    non-trivial data gets longer when "compressed"). Worth considering,
    perhaps. In reverse, check the first byte: if 0, return the rest of the
    stream as-is; if 1, gunzip it; etc.

    Marc
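
    One possible shape for the flag-byte scheme described above, as a
    sketch only. FlaggedPayload, Pack, Unpack and the 256-byte cutoff are
    all assumptions made for illustration; the thread does not say what
    cutoff or layout is actually used.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class FlaggedPayload
{
    const byte FlagNone = 0x00;
    const byte FlagGZip = 0x01;

    // Hypothetical cutoff: below this size, do not even try to compress.
    const int MinSizeToTry = 256;

    public static byte[] Pack(byte[] raw)
    {
        if (raw.Length >= MinSizeToTry)
        {
            byte[] zipped = GZip(raw);
            // Keep the compressed form only if it is actually shorter.
            if (zipped.Length < raw.Length)
                return Prefix(FlagGZip, zipped);
        }
        return Prefix(FlagNone, raw);
    }

    public static byte[] Unpack(byte[] packed)
    {
        if (packed.Length == 0)
            throw new ArgumentException("Missing flag byte.");
        byte[] body = new byte[packed.Length - 1];
        Buffer.BlockCopy(packed, 1, body, 0, body.Length);
        switch (packed[0])
        {
            case FlagNone: return body;
            case FlagGZip: return GUnzip(body);
            default: throw new InvalidDataException("Unknown compression flag.");
        }
    }

    static byte[] Prefix(byte flag, byte[] body)
    {
        byte[] result = new byte[body.Length + 1];
        result[0] = flag;
        Buffer.BlockCopy(body, 0, result, 1, body.Length);
        return result;
    }

    static byte[] GZip(byte[] raw)
    {
        using (MemoryStream dest = new MemoryStream())
        {
            using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
                zip.Write(raw, 0, raw.Length);
            return dest.ToArray(); // safe: the GZipStream is already closed
        }
    }

    static byte[] GUnzip(byte[] data)
    {
        using (MemoryStream src = new MemoryStream(data))
        using (GZipStream zip = new GZipStream(src, CompressionMode.Decompress))
        using (MemoryStream dest = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int n;
            // Read may return fewer bytes than requested, so loop until 0.
            while ((n = zip.Read(buffer, 0, buffer.Length)) > 0)
                dest.Write(buffer, 0, n);
            return dest.ToArray();
        }
    }
}
```

    With this layout the receiver never needs a separate "Compressed"
    attribute: Unpack(Pack(data)) returns the original bytes whether or not
    compression was applied, at a cost of one extra byte per payload.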



