From: MM on
Imagine the following list:

Apples green
Apples blue
Apples yellow
Apples red
Apples orange
Apples purple
Apples white
Apples black
Pears green
Pears blue
Pears yellow
Pears red
Pears orange
Pears purple
Pears white
Pears black
Cars green
Cars blue
Cars yellow
Cars red
Cars orange
Cars purple
Cars white
Cars black

How can I group them so that I just get the headings:
Apples
Pears
Cars

?

In the actual case there are tens of thousands of such keys, some of
which vary by only a sequential number. They don't necessarily appear
in any order. What I want to do is pick out the salient part, e.g.
Apples, Pears etc and list only them. I've tried a couple of
approaches, using the method of scanning the sorted list and comparing
current key with previous key, but I'm not sure I'm on the right
track.

MM
From: Dee Earley on
On 02/06/2010 09:23, MM wrote:
> Imagine the following list:
>
<SNIP>
>
> How can I group them so that I just get the headings:
> Apples
> Pears
> Cars
>
> ?
>
> There are in the actual case tens of thousands of such keys, some of
> which vary by only a sequential number. They don't necessarily appear
> in any order. What I want to do is pick out the salient part, e.g.
> Apples, Pears etc and list only them. I've tried a couple of
> approaches, using the method of scanning the sorted list and comparing
> current key with previous key, but I'm not sure I'm on the right
> track.

Just run through the list, parse out the prefix, then add it to a
collection as a key (handling any errors for duplicates).
You then have a collection containing all unique prefixes.

You can use an array if you wish, but you need to do the unique checking
yourself.

Checking against the previous entry will only work if it is sorted.
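
A minimal sketch of that previous-entry comparison, assuming the keys are
already sorted in a non-empty string array and that the heading is simply the
first word. HeadingsFromSorted is a hypothetical name used only for
illustration:

```vb
' Hypothetical helper: returns the distinct first words of keys().
' Assumes keys() is sorted, so items sharing a heading sit next to each other.
Public Function HeadingsFromSorted(keys() As String) As Collection
    Dim result As New Collection
    Dim i As Long
    Dim pos As Long
    Dim head As String
    Dim prev As String
    Dim first As Boolean

    first = True
    For i = LBound(keys) To UBound(keys)
        ' The heading is everything before the first space,
        ' or the whole key if it contains no space
        pos = InStr(keys(i), " ")
        If pos > 0 Then
            head = Left$(keys(i), pos - 1)
        Else
            head = keys(i)
        End If
        ' Only keep the heading when it differs from the previous one
        If first Or head <> prev Then
            result.Add head
            prev = head
            first = False
        End If
    Next
    Set HeadingsFromSorted = result
End Function
```

On an unsorted list the same heading can turn up again later and would be
added twice, which is why the keyed collection is the safer choice there.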

--
Dee Earley (dee.earley(a)icode.co.uk)
i-Catcher Development Team

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)
From: MM on
On Wed, 02 Jun 2010 10:03:05 +0100, Dee Earley
<dee.earley(a)icode.co.uk> wrote:

>On 02/06/2010 09:23, MM wrote:
>> Imagine the following list:
>>
><SNIP>
>>
>> How can I group them so that I just get the headings:
>> Apples
>> Pears
>> Cars
>>
>> ?
>>
>> There are in the actual case tens of thousands of such keys, some of
>> which vary by only a sequential number. They don't necessarily appear
>> in any order. What I want to do is pick out the salient part, e.g.
>> Apples, Pears etc and list only them. I've tried a couple of
>> approaches, using the method of scanning the sorted list and comparing
>> current key with previous key, but I'm not sure I'm on the right
>> track.
>
>Just run through the list, parse out the prefix, then add it to a
>collection as a key (handling any errors for duplicates)
>You then have a collection containing all unique prefixes.

But how do I tell where the "prefix" ends and "the remainder"
begins? The key "Apples yellow", for example, is just a series of
ASCII characters (plus any allowed punctuation). When we (humans) look
at such a list we can immediately spot patterns. Supposing the next
item in the list of keys is "Pears white", followed by "Pears red",
"Cars black", and then "Apples green". I think sorting is an absolute
must to begin with.

Also note that the keys may not be this simple. I might, for example,
see the following "Apples green round", "Apples green round small",
"Apples red small with stalk" and so on. The one thing common to the
condensed/reduced key, e.g. "Apples", "Cars", is that it is always at
the start of the whole key.

MM
From: Dee Earley on
On 02/06/2010 11:47, MM wrote:
> On Wed, 02 Jun 2010 10:03:05 +0100, Dee Earley
> <dee.earley(a)icode.co.uk> wrote:
>
>> On 02/06/2010 09:23, MM wrote:
>>> Imagine the following list:
>>>
>> <SNIP>
>>>
>>> How can I group them so that I just get the headings:
>>> Apples
>>> Pears
>>> Cars
>>>
>>> ?
>>>
>>> There are in the actual case tens of thousands of such keys, some of
>>> which vary by only a sequential number. They don't necessarily appear
>>> in any order. What I want to do is pick out the salient part, e.g.
>>> Apples, Pears etc and list only them. I've tried a couple of
>>> approaches, using the method of scanning the sorted list and comparing
>>> current key with previous key, but I'm not sure I'm on the right
>>> track.
>>
>> Just run through the list, parse out the prefix, then add it to a
>> collection as a key (handling any errors for duplicates)
>> You then have a collection containing all unique prefixes.
>
> But how do I tell where the "prefix" starts and "the remainder"
> begins? The key "Apples yellow", for example, is just a series of
> ASCII characters (plus any allowed punctuation). When we (humans) look
> at such a list we can immediately spot patterns. Supposing the next
> item in the list of keys is "Pears white", followed by "Pears red",
> "Cars black", and then "Apples green". I think sorting is an absolute
> must to begin with.
>
> Also note that the keys may not be this simple. I might, for example,
> see the following "Apples green round", "Apples green round small",
> "Apples red small with stalk" and so on. The one thing common to the
> condensed/reduced key, e.g. "Apples", "Cars", is that it is always at
> the start of the whole key.

That is entirely down to you and your naming scheme.
If it is simply the first word, look for the space character.
If it is the first half of all the words, split and take the halfway point.
If it is some arbitrary number of words out of an arbitrary number of words,
you need to rethink it and figure out exactly what you think makes up a
"Heading".

We can't magically make up rules for your business logic, but if you
give real life examples and a clear description, we can help further.
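
As a rough sketch of the first two rules, assuming words are separated by
single spaces: HeadingOf is a hypothetical helper (not part of any library)
that returns the first wordCount words of a key.

```vb
' Hypothetical helper: returns the first wordCount words of key.
' Assumes words are separated by single spaces.
Public Function HeadingOf(ByVal key As String, ByVal wordCount As Long) As String
    Dim words() As String
    Dim i As Long

    words = Split(Trim$(key), " ")
    ' Never ask for more words than the key actually has
    If wordCount > UBound(words) + 1 Then wordCount = UBound(words) + 1
    For i = 0 To wordCount - 1
        If i > 0 Then HeadingOf = HeadingOf & " "
        HeadingOf = HeadingOf & words(i)
    Next
End Function
```

With this, HeadingOf("Apples green round", 1) gives "Apples". For the
"first half of the words" rule, the count could be worked out first as
(UBound(Split(key, " ")) + 1) \ 2 and passed in as wordCount.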

From: Larry Serflaten on

"MM" <kylix_is(a)yahoo.co.uk> wrote
> Imagine the following list:
<...>
> How can I group them so that I just get the headings:
> Apples
> Pears
> Cars

You create a list of the 'salient' parts, that does not allow duplicates.

For a quick example, add 2 listboxes and a command button to a new
form and paste in the code below. Clicking the button does the work
and fills the second listbox.

As Dee said, you have to decide on what constitutes a heading. Is it
just the first word? Then find the first space character and extract
the first word. (As shown below) If a heading is something more, then
you need to define the rules that determine what a heading is, so you can
generate code to test the items against those rules, etc...

LFS

Option Explicit

Private Sub Command1_Click()
    Dim dup As Collection
    Dim idx As Long
    Dim pos As Long
    Dim txt As String

    Set dup = New Collection
    ' Adding a key that already exists raises an error; ignore it
    On Error Resume Next
    ' run through the entire list
    For idx = 0 To List1.ListCount - 1
        ' extract the heading: the first word, without the trailing space
        ' (an item with no space is used whole)
        txt = List1.List(idx)
        pos = InStr(txt, " ")
        If pos > 0 Then txt = Left$(txt, pos - 1)
        ' add it to a de-duped list (duplicate keys are not allowed)
        dup.Add txt, txt
    Next
    On Error GoTo 0

    ' Show headings
    For idx = 1 To dup.Count
        List2.AddItem dup(idx)
    Next
End Sub

Private Sub Form_Load()
    With List1
        .AddItem "Apples Red"
        .AddItem "Pears Light Red"
        .AddItem "Cars Candy Apple Red"
        .AddItem "Apples Blue"
        .AddItem "Pears Dark Blue"
        .AddItem "Cars Navy Blue"
        .AddItem "Cars Forest Green"
        .AddItem "Pears Yellow Green"
        .AddItem "Apples Green"
    End With
End Sub