From: Schmidt on

"Jeff Johnson" <i.get(a)enough.spam> schrieb im Newsbeitrag
news:eqAW1Ma0KHA.260(a)TK2MSFTNGP05.phx.gbl...

> Enjoy:
>
> SELECT GCount, Genre, main.Artist
> FROM (SELECT Artist, Genre, COUNT(*) AS GCount FROM songs GROUP BY Artist,
> Genre) AS main INNER JOIN (SELECT Artist, Max(GCount2) AS MaxGenre FROM
> (SELECT Artist, Genre, COUNT(*) AS GCount2 FROM songs GROUP BY Artist,
> Genre) AS sub2 GROUP BY Artist) AS sub1 ON sub1.Artist = main.Artist AND
> main.GCount = sub1.MaxGenre
>
> This sort of thing is one of the most complex things you can do
> in SQL since it goes against the set theory that is the very heart
> of SQL. I recommend you break it apart into its component
> pieces to see what's actually happening. I actually recommend
> doing this sort of row-based filtering thing in code instead of
> convoluted SQL.

Actually this can be solved with less complexity by using one
of the two not that wellknown Aggregate-Functions, built into
the JET-SQL-language - meaning First() and Last()...

This example is using Last(), but First() would also do, if
one changes the Order By Clause of the following
SubSelect-Definition (which is basically the same SQL as
David has mentioned in its Opener-Posting).

strSubSelect = "Select Count(*) As GCount, Genre, Artist From Songs " & _
"Group By Genre, Artist Order By GCount, Genre Desc"

Ok what now remains is the construction of the final
(SubSelect-wrapping) Statement, which does look pretty
simple due to the usage of the Last() aggregate-functions:

strSQL = "Select Last(GCount), Last(Genre), Artist From (" & _
SubSelect & _
") Group By Artist"

Only tested here with SQLite (which has a little bit different
syntax with the First() and Last() aggregates), but the above
is already "JET-adapted from my memory" and should
give the correct results against an *.mdb too - would be
nice if you (or David) could check it on your side.

Olaf


From: Schmidt on

"Schmidt" <sss(a)online.de> schrieb im Newsbeitrag
news:%23m4Y5xd0KHA.3500(a)TK2MSFTNGP02.phx.gbl...

> This example is using Last(), but First() would also do, if
> one changes the Order By Clause of the following
> SubSelect-Definition (which is basically the same SQL as
> David has mentioned in its Opener-Posting).
>
> strSubSelect = "Select Count(*) As GCount, Genre, Artist From Songs " & _
> "Group By Genre, Artist Order By GCount, Genre Desc"
>
> Ok what now remains is the construction of the final
> (SubSelect-wrapping) Statement, which does look pretty
> simple due to the usage of the Last() aggregate-functions:
>
> strSQL = "Select Last(GCount), Last(Genre), Artist From (" & _
> SubSelect & _
> ") Group By Artist"

Ok, before you guys "go nuts" (just tested with an *.mdb here) ...
the approach *is* possible with JET, but only if the above
SubSelect is transferred beforehand (per Select Into ...)
into a temporary (but nonetheless physical) Table.
Only then does JET use the correctly ordered Records
from within the Last() or First() aggregates.

A dynamic interpretation of the SubSelect, given as
"mere string" (as in the above example) does not work
correctly in conjunction with these aggregates, no matter
how the SubSelect was ordered beforehand - so in this
"dynamic case" Last() and First() do *not* respect the
incoming sort-order of the records within the dynamic
SubSelect; instead the physical representation -
or better: the physical Insertion-Order within
the underlying table (in this case the table [Songs])
for Feeding these aggregates within the context of
an "artist-grouping" is used.

And that (working with a temp-table, to make the above
approach behave correctly) is probably not what David
had in mind.

Sorry for the confusion - seems this restriction to the
Insertion-Order of records in physical tables, brings the
usability of the First() and Last() aggregates in JET-SQL
to near zero.

But maybe this here can work in Davids special case
with a somewhat better performance (I have to come up
with something as short as the above now, do I? <g>):

'now all in one SQL-String (correct sorting is indirectly performed too)

SQL = "Select 99999-Left(Min(GInfo),5) As GenreCount," & _
"Mid(Min(GInfo),6) As Genre," & _
"Artist " & _
"From (" & _
"Select 99999-Count(Genre) & Genre As GInfo," & _
"Artist From Songs Group By Artist,Genre" & _
") Group By Artist"

Ok, looks a bit "weird" - place it in the "fun-category" ... ;-)

But as said, should work though - @David... would
be interested in the performance-results (should run
pretty well, even with all the String- and Math-Ops
involved).

Olaf


From: David Kaye on
"Schmidt" <sss(a)online.de> wrote:

>Ok, before you guys "go nuts" (just tested with an *.mdb here) ...
>the approach *is* possible with JET, but only if the above
>SubSelect is transferred beforehand (per Select Into ...)
>into a temporary (but nonetheless physical) Table.
>Only then does JET use the correctly ordered Records
>from within the Last() or First() aggregates.

Yeah, I tried it out and it didn't work for me. Again, though, the SQL
statement I raved about works very well. Just for ducks I compacted the
database, dropped the artist index and recreated it and the query now takes
only about 0.8 seconds to run. The actual record count at this time is
37,425, so I'd say that's fairly fast given the complexity of the query.

From: Schmidt on

"David Kaye" <sfdavidkaye2(a)yahoo.com> schrieb im Newsbeitrag
news:hp3gu0$eme$1(a)news.eternal-september.org...

[First(), Last() approach...]
> Yeah, I tried it out and it didn't work for me. Again,
> though, the SQL statement I raved about works very well.
> Just for ducks I compacted the database, dropped the artist
> index and recreated it and the query now takes only about 0.8
> seconds to run. The actual record count at this time is 37,425,
> so I'd say that's fairly fast given the complexity of the query.

Did you also tried my second attempt, at the end of my
previous post, which avoids the First() and Last() stuff?

BTW, there's also a new reply in the database-group now
(from Jamal Samedov), which maybe also worth a try.

Ok, your current 0.8 seconds based on Jeffs approach
is already pretty good, something one can live with,
but we all probably try to develop a better "performance-
picture" over time (on what works good - and what works
not that well with advanced SQL-queries) - and since my
second query-approach is trying to trade "count of needed
Sub-Queries" versus "more excessive Column-expressions"),
it would be nice, if you could time that too - you are the man
with the real-world-database, not that easy to create a large
test-set which mimicks what you already have... :-)

Olaf


From: David Kaye on
"Schmidt" <sss(a)online.de> wrote:

>Did you also tried my second attempt, at the end of my
>previous post, which avoids the First() and Last() stuff?

My version of Jet (3.6) doesn't seem to like it. I get a syntax error.
Unfortunately, I don't have much time to pick it apart and see what's failing.

>BTW, there's also a new reply in the database-group now
>(from Jamal Samedov), which maybe also worth a try.

No, his doesn't work. Again, I suggest people actually try the SQLs before
posting them. His SQL returns the same artist multiple times, grouped by
genre and not sorted by genre count.

What I need is what Jeff (Jeff?) provided. I need a list of distinct
(non-duplicated) list of artists. Next to each artist I need the genre
they're "most famous" for. Returning the count of genres in the recordset
isn't really necessary, but it serves as a check of whether the SQL is
returning the proper records. In Jeff's case it is.

In the example I cited, the following are a few records that come back:

7 Rap Andre Nickatina
62 Folk Ani DiFranco
89 Pop Annie Lennox

Now, people may not agree with the genres (I certainly don't), but the data is
gathered from many people, so it's sort of a consensus of what genres the
people feel these artists represent.

First  |  Prev  |  Next  |  Last
Pages: 1 2 3 4
Prev: Project Explrer
Next: VB Editor in Excel