From: Joseph N. Stackhouse on
I'm still desperately looking for a solution to this question. I need an
intelligent way to find an array of bytes within a file, and preferably one
that also supports regex-style patterns.
For example, my database has around 50,000 regex entries such as
"FFFF.*BB0403.*FF" that I want to turn into byte arrays and then search for.
I understand how to go about writing my own code for this, but it is a
little over my head when it comes to searching so many patterns against
many files while keeping the operation fast. Does anyone have good example
code that I can start with, or know of a third-party .NET component that
does this type of searching?

Thanks in advance, happy holidays!

-Joe

From: Patrice on
I don't remember right now what it is called, but there is a state-based
solution for searching for multiple strings at once. A search turned up
http://tomasp.net/articles/ahocorasick.aspx, which is the algorithm I was
thinking of (Aho-Corasick).

http://www.informit.com/guides/content.aspx?g=dotnet&seqNum=769&ns=16291
could be a starting point...
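
To make the idea concrete, here is a minimal sketch of an Aho-Corasick matcher over raw bytes. It is written in Python rather than .NET purely for brevity, and the names build_automaton and search are made up for this illustration; they do not come from either linked article. It only handles literal byte strings, so a pattern like "FFFF.*BB0403.*FF" would presumably have to be split on ".*" into literal fragments first, with the ordering of the fragments checked afterwards.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto/fail/output tables for a list of non-empty bytes patterns."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> indices of the patterns that end in this state
    for idx, pat in enumerate(patterns):
        state = 0
        for b in pat:
            nxt = goto[state].get(b)
            if nxt is None:
                nxt = len(goto)
                goto[state][b] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(idx)
    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for b, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and b not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(b, 0)
            # Inherit matches that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def search(data, automaton):
    """Return (offset_of_last_matched_byte, pattern_index) for every hit."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for pos, b in enumerate(data):
        while state and b not in goto[state]:
            state = fail[state]
        state = goto[state].get(b, 0)
        for idx in out[state]:
            hits.append((pos, idx))
    return hits

# Literal fragments of the example pattern "FFFF.*BB0403.*FF".
fragments = [bytes.fromhex("FFFF"), bytes.fromhex("BB0403"), bytes.fromhex("FF")]
hits = search(bytes.fromhex("00FFFF11BB0403FF"), build_automaton(fragments))
```

The data is scanned once no matter how many patterns are loaded, which is the property that should matter with 50,000 entries.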

--
Patrice


From: Gregory A. Beamer on
"Joseph N. Stackhouse" <junkmauler(a)hotmail.com> wrote in
news:9F777703-D56E-42A5-8AC8-21796699332C(a)microsoft.com:

> I'm still desperately looking for a solution to this question. I need
> an intelligent way to find an array of bytes within a file, and
> preferably one that also supports regex-style patterns.

How big are the files?

There are plenty of command-line tools that can search for bytes in a
file. Searching for 50,000 different byte patterns takes time, however. But
with a fast enough tool, you can do it. It is also an option if you are
cataloging the files and not doing this all the time. Or, after the initial
run, you can process the new files one at a time.

If the files are small enough, "caching" the regex "strings" as bytes
and then looping over them is not too time-consuming. With larger files,
though, it will bog down the system, especially if there is a risk of a
match being split across two consecutive buffers. This might still be an
option for a once-in-a-while pass, but not for a tool that runs on a
regular basis.
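
As a rough illustration of the buffer problem, here is a small sketch (in Python for brevity, not .NET) that converts a hex string to bytes once and then scans a stream chunk by chunk, carrying the last len(needle) - 1 bytes of each buffer into the next one so that a match split across two reads is still found. The function name find_all_chunked and the chunk size are my own choices for the example. It assumes a fixed-length literal needle; a pattern containing ".*" has no bounded length, so this overlap trick would not be enough for it on its own.

```python
import io

def find_all_chunked(stream, needle, chunk_size=64 * 1024):
    """Return absolute offsets of needle in a binary stream read in chunks."""
    if not needle:
        raise ValueError("needle must be non-empty")
    overlap = len(needle) - 1   # bytes to carry over between buffers
    tail = b""                  # carried-over end of the previous buffer
    base = 0                    # absolute stream offset of tail[0]
    offsets = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        start = 0
        while True:
            i = buf.find(needle, start)
            if i < 0:
                break
            offsets.append(base + i)
            start = i + 1
        # The tail is shorter than the needle, so it can never hold a
        # complete match and no offset is reported twice.
        keep = min(overlap, len(buf))
        tail = buf[len(buf) - keep:]
        base += len(buf) - keep
    return offsets

# "Cache" the hex text as bytes once, then reuse it for every file.
needle = bytes.fromhex("BB0403")
sample = io.BytesIO(bytes.fromhex("00BB040300BB0403"))
# A tiny chunk size forces both matches to straddle a buffer boundary.
found = find_all_chunked(sample, needle, chunk_size=3)
```

Looping this over 50,000 needles still means 50,000 passes per buffer, which is why it only seems reasonable for small files or an occasional run.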

Peace and Grace,

--
Gregory A. Beamer (MVP)

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************