From: Rich P on
string[] files = Directory.GetFiles(@"C:\1A\1AA\", "*.txt",
SearchOption.AllDirectories);

char[] rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
};

var grouped = from filename in files
group filename by Path
.GetFileName(filename)
.Substring(0, filename.IndexOfAny(rgchDigits));

foreach (var group in grouped)
    Console.WriteLine(group.Key + " " + group.Count());


This iterates the loop body only once and prints

tes 7

in the console window. If I use myList with FileInfo, the loop
iterates through the body twice and prints

test 4
testA 3

How do I use Path.GetFileName() in the code above to print the same
results using the files array instead of myList?

Rich

*** Sent via Developersdex http://www.developersdex.com ***
From: Peter Duniho on
Rich P wrote:
> string[] files = Directory.GetFiles(@"C:\1A\1AA\", "*.txt",
> SearchOption.AllDirectories);
>
> char[] rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
> };
>
> var grouped = from filename in files
> group filename by Path
> .GetFileName(filename)
> .Substring(0, filename.IndexOfAny(rgchDigits));
> [...]
> How do I use Path.GetFileName()
>
> in the code above to print the same results using the file array instead
> of myList?

Sorry. My code example was flawed. Here's a correct version of the LINQ:

var grouped = from file in files
              let name = Path.GetFileName(file)
              group file by name.Substring(0, name.IndexOfAny(rgchDigits));

If you compare the above with the code I wrote earlier, you'll see that
in the code I wrote earlier, it was incorrectly calculating the index
based on the full file path, rather than just the file name, even though
I'd gotten just the file name for the purpose of calling the Substring()
method.

In fact, it was only by virtue of your path containing digits early in
the string that the code didn't just crash. A more normal path, without
digits in the directory names, would have resulted in an exception as
the second argument passed to Substring() would have been out of range!
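
To make the difference concrete, here is a small sketch of both behaviors. The paths and the BuggyKey/FixedKey helper names are made up for illustration; they are not from the code earlier in the thread. "dir1" stands in for a directory with an early digit, the way C:\1A\1AA\ has one at index 3.

```csharp
using System;
using System.IO;

static class PrefixBugDemo
{
    static readonly char[] Digits = "0123456789".ToCharArray();

    // Flawed version: the index is found in the full path but applied
    // to the bare file name.
    public static string BuggyKey(string path)
    {
        return Path.GetFileName(path).Substring(0, path.IndexOfAny(Digits));
    }

    // Corrected version: the index is found in the same string that is cut.
    public static string FixedKey(string path)
    {
        string name = Path.GetFileName(path);
        return name.Substring(0, name.IndexOfAny(Digits));
    }

    static void Main()
    {
        // Hypothetical paths, built with Path.Combine so the separator
        // matches whatever platform this runs on.
        string digitInDir = Path.Combine("dir1", "test1.txt");
        string noDigitInDir = Path.Combine("data", "files", "test1.txt");

        Console.WriteLine(BuggyKey(digitInDir));    // tes
        Console.WriteLine(FixedKey(digitInDir));    // test
        Console.WriteLine(FixedKey(noDigitInDir));  // test

        try
        {
            // The first digit in the full path is at index 15, but the
            // file name is only 9 characters long.
            BuggyKey(noDigitInDir);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Substring length ran past the end of the file name");
        }
    }
}
```

The first line of output is the same truncated "tes" key that showed up in the console, since every file shares the same directory prefix and so the same wrong index.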

Anyway, sorry for the confusion. Your use of FileInfo avoided the
problem because, of course, that class inherently is resistant to the
bug I put in the code. :) But if you want a more efficient solution
that still works (as opposed to the more efficient solution I provided
before that doesn't work :) ), then the above should do it.
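
For completeness, a self-contained sketch of the corrected query together with the printing loop. The file names here are invented to stand in for whatever Directory.GetFiles() returns on your machine, and the Summarize helper is just a hypothetical wrapper so the result can be inspected. Each group exposes its key as Key, and Count() (from System.Linq) gives the number of items in it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class GroupByPrefixDemo
{
    // Returns one "key count" line per group, in order of first appearance.
    public static List<string> Summarize(string[] files)
    {
        char[] rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

        var grouped = from file in files
                      let name = Path.GetFileName(file)
                      group file by name.Substring(0, name.IndexOfAny(rgchDigits));

        var lines = new List<string>();
        foreach (var grouping in grouped)
        {
            lines.Add(grouping.Key + " " + grouping.Count());
        }
        return lines;
    }

    static void Main()
    {
        // Hypothetical file list; assumes every name contains a digit.
        string dir = Path.Combine("1A", "1AA");
        string[] files =
        {
            Path.Combine(dir, "test1.txt"),
            Path.Combine(dir, "test2.txt"),
            Path.Combine(dir, "test3.txt"),
            Path.Combine(dir, "test4.txt"),
            Path.Combine(dir, "testA1.txt"),
            Path.Combine(dir, "testA2.txt"),
            Path.Combine(dir, "testA3.txt"),
        };

        foreach (string line in Summarize(files))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // test 4
        // testA 3
    }
}
```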

Pete
From: Rich P on
Again, thank you so much for helping me out with this! The exercise
really points out my lack of intuition (almost perturbs me - it perturbs
me :).


I tried

var grouped = from file in files
let name = Path.GetFileName()
group file by name.Substring(0, name.IndexOfAny(rgchDigits));

and VS complained (by which I mean Path.GetFileName() had a red
underline). Even though I sort of read the error message, I did not
understand it. It did not occur to me to pass the file variable to
Path.GetFileName(). I think -- for me -- the real lesson in this
exercise has been on learning to read and interpret the error messages
(along with learning some LINQ).

Thank you very much for your help and patience. This sample alone will
serve me well because I have a few projects that will be doing this, and
I just did not have an efficient way to deal with it. I was going to
interface with a sql server DB and use temp tables to perform the same
operation that the LinQ does (with the grouping). But that added way
too many dependencies on an external DB. This is sooo much better !

Rich

From: Peter Duniho on
Rich P wrote:
> [...]
> Thank you very much for your help and patience. This sample alone will
> serve me well because I have a few projects that will be doing this, and
> I just did not have an efficient way to deal with it. I was going to
> interface with a sql server DB and use temp tables to perform the same
> operation that the LinQ does (with the grouping). But that added way
> too many dependencies on an external DB. This is sooo much better !

Glad you got it to work.

For what it's worth, grouping manually is not really that difficult.
You would probably be well-served to study how it might be done
explicitly, without the help of LINQ or a database.

The basic idea is to maintain some kind of group of "bins" into which
items can be placed as they are being grouped. In .NET, one possible
approach (and I believe this is in fact how the LINQ
Enumerable.GroupBy() method is implemented) is to use a dictionary.

For example (disclaimer: as with every other code example I've posted to
this thread, I haven't bothered to compile or run this code; if it's not
correct, it should give you the basic idea even so :) ):

string strPath = /* initialized as appropriate */;
char[] rgchDigits =
    { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
string[] files = Directory.GetFiles(strPath);
Dictionary<string, List<string>> dictGroups =
    new Dictionary<string, List<string>>();

foreach (string file in files)
{
    string name = Path.GetFileName(file);
    string key = name.Substring(0, name.IndexOfAny(rgchDigits));
    List<string> group;

    // Create the bin for this key the first time it is seen
    if (!dictGroups.TryGetValue(key, out group))
    {
        group = new List<string>();
        dictGroups.Add(key, group);
    }

    group.Add(file);
}

foreach (KeyValuePair<string, List<string>> kvp in dictGroups)
{
    Console.WriteLine(kvp.Key + " " + kvp.Value.Count);
}

The basic idea in the code above is to maintain a dictionary that can
map the group key to a list of file paths. For each file path, the key
is extracted from the filename, then the dictionary is checked to see if
the key's already been added. If it hasn't, a new List<string> is
created and added to the dictionary for that key.

In either case, the List<string> that goes with the key, either the one
newly created or the one that was already in the dictionary, has the
file path added to it.

The result is a dictionary for which each entry is in fact a single
group, with the key being the base of the filename for the group, and
the value associated with the key being the actual list of file paths
that go with that filename base for the group.
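
One edge case the key extraction above does not handle: IndexOfAny() returns -1 when the name contains no digit at all, and Substring(0, -1) then throws an ArgumentOutOfRangeException. A minimal sketch of a guard follows. The GroupKey helper and the choice to fall back to the name without its extension are assumptions of mine; pick whatever fallback suits your files.

```csharp
using System;
using System.IO;

static class GroupKey
{
    static readonly char[] Digits = "0123456789".ToCharArray();

    // Returns the part of the file name before its first digit.
    // Assumption: a name with no digit is keyed by the name minus
    // its extension.
    public static string FromPath(string path)
    {
        string name = Path.GetFileName(path);
        int digitIndex = name.IndexOfAny(Digits);

        return digitIndex >= 0
            ? name.Substring(0, digitIndex)
            : Path.GetFileNameWithoutExtension(name);
    }

    static void Main()
    {
        Console.WriteLine(FromPath("test12.txt"));  // test
        Console.WriteLine(FromPath("readme.txt"));  // readme
    }
}
```

In the loop, the key line would then become something like string key = GroupKey.FromPath(file); with the rest unchanged.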

And yes, learning to properly interpret the errors reported to you by
the compiler and run-time will go a long way to making your programming
activities easier and more efficient. :) It may take some practice,
but it's definitely worth it.

Pete
From: Rich P on
>>
string strPath = @"C:\1A\1AA\";
char[] rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
string[] files =
    Directory.GetFiles(strPath, "*.txt", SearchOption.AllDirectories);
Dictionary<string, List<string>> dictGroups =
    new Dictionary<string, List<string>>();

foreach (string file in files)
{
    string name = Path.GetFileName(file);
    string key = name.Substring(0, name.IndexOfAny(rgchDigits));
    List<string> group;
    if (!dictGroups.TryGetValue(key, out group))
    {
        group = new List<string>();
        dictGroups.Add(key, group);
    }

    group.Add(file);
}

foreach (KeyValuePair<string, List<string>> kvp in dictGroups)
{
    Console.WriteLine(kvp.Key + " " + kvp.Value.Count);
}

<<

Thanks for this code sample - works great. It is a little bit easier
to follow what is going on here than with the LINQ, although the LINQ
was several lines shorter. Dictionary is still great though, for when I
can't figure out a LINQ method.


Rich
