From: Mike Williams on
"TC" <getmyemails2(a)yahoo.com> wrote in message
news:uvqAlmm3IHA.1808(a)TK2MSFTNGP04.phx.gbl...

> I've got code that loops through a binary file byte-by-byte
> and creates a new byte array and then uses that to create
> a clone via the Put function.

Why are you doing that? Why don't you just create a copy of the file using
FileCopy? Having said that, if you do for some reason actually want a copy
of the file data in a Byte array then you don't need to loop through the
file byte by byte. You can get the entire file into the Byte array in one
operation, subject to available memory restrictions, which you won't need to
worry about unless the file is more than a few tens of megabytes. Try, for
example:

Dim fn As Long, b() As Byte
fn = FreeFile
Open "c:\temp\mike1.pdf" For Binary As fn
ReDim b(1 To LOF(fn))
Get fn, 1, b()
Close fn

> I'm having an issue regarding leaving certain records out.
> For example, in Office files that are BIFF format, is it fair
> to say one would not only have to understand the BIFF
> structure and how a record relates to that structure as
> well as what the associated application expects?

Well, yes. You would certainly need an in depth knowledge of the data format
of whatever file you are dealing with. A file is just a bunch of data bytes,
and unless you know what each of those bytes actually represent then there
isn't very much you can do with it as far as "leaving records out" or
"adding records" is concerned. Whatever file you are dealing with, you need
an extremely detailed knowledge of its data format to do anything really
useful with it.

Loading data from a file and altering that data and saving the modified data
to a new file is extremely easy, but it is knowing exactly how to alter the
data in order to achieve a useful effect that it the hard part! For example,
here is the first hit I got on Google regarding the Excel BIFF format. I
don't know whether it is a comprehensive explanation of the file format, or
whether it is just a rough outline, but it runs to about 250 A4 pages!
You're going to have serious trouble digesting that lot in such a way that
it will enable you to load the raw file data into a Byte array and do
anything useful with it!

http://sc.openoffice.org/excelfileformat.pdf

Mike



From: Peter T on
"Mike Williams" wrote in message
> "TC" wrote in message
>
> > I've got code that loops through a binary file byte-by-byte
> > and creates a new byte array and then uses that to create
> > a clone via the Put function.
>
> Why are you doing that?

Indeed

> > I'm having an issue regarding leaving certain records out.

Maybe do that with ADO etc (if the file is closed or Excel is n/a)

> > For example, in Office files that are BIFF format, is it fair
> > to say one would not only have to understand the BIFF
>
> Well, yes. You would certainly need an in depth knowledge of the data
format
> of whatever file you are dealing with. A file is just a bunch of data
bytes,
> and unless you know what each of those bytes actually represent then there
> isn't very much you can do with it as far as "leaving records out" or
> "adding records" is concerned. Whatever file you are dealing with, you
need
> an extremely detailed knowledge of its data format to do anything really
> useful with it.

There are plenty of BIFF8 VB projects out there, some do a pretty good job
not only of changing data but handling cell formulas and formats etc. BIFF8
is typically only of interest to handle Excel files when Excel is n/a.

> Here is the first hit I got on Google regarding the Excel BIFF format. I
> don't know whether it is a comprehensive explanation of the file format,
or
> whether it is just a rough outline, but it runs to about 250 A4 pages!
> You're going to have serious trouble digesting that lot in such a way that
> it will enable you to load the raw file data into a Byte array and do
> anything useful with it!
>
> http://sc.openoffice.org/excelfileformat.pdf

AFAIK that's the most complete documentation publically avaialble, though
some sections were never completed.

In the OP's 'other post' he was advised -

"The plot thickens in the new [Excel-12] version because besides the XML
in ZIP files that are XLSX and XLSM files there is the new binary in ZIP
format XLSB and, of course, the old binary format is still supported.

Reading Excel files directly is something very few people have done or need
to do but your task may be easier because the new formats are 'open' and so
publically documented. I would think your first step would be to unzip the
new files..."

Then examine the file in a Hex editor or even dump into Excel

Sub test() ' Excel - VBA
Dim fn As Long, b() As Byte
fn = FreeFile
Open "C:\My Documents\Excel\Book2.xls" For Binary As fn
ReDim b(1 To LOF(fn))
Get fn, 1, b()
Close fn

If UBound(b) > ActiveSheet.Rows.Count Then
' too big
Exit Sub
End If

ReDim arr(1 To UBound(b), 1 To 2)
For i = 1 To UBound(b)
arr(i, 1) = b(i)
If Len(b(i)) Then
If b(i) > 31 Then arr(i, 2) = Chr(b(i))
End If
Next

' In VBA the unqualified Range.etc implicitly refers
' to a cell range on the active sheet
With Range("A1").Resize(UBound(b), 2)
.Value = arr
End With

End Sub


Regards,
Peter T