From: tshad on
I was looking at an article by Plamen Ratchev (which I had printed a while
ago but can't find it online anymore) where he states that the first thing a
query does if there are 2 tables involved is create a Cartesian Product
(cross Join) between each table. Then the ON filter is applied.

If Table A has 20,000,000 rows and Table B has 30,000,000 rows - the system
would read all the rows and put all the combinations together to get
600,000,000,000,000 rows? Then apply the ON filter???

Where would it put the data it just read?

Does it read all this into a temporary table in memory?

Then when it applies the ON predicate, does it them move the rows that match
to another temporary table?

And if an outer join, move the other rows to the 2nd temporary table?

Thanks,

Tom


From: Erland Sommarskog on
tshad (tfs(a)dslextreme.com) writes:
> I was looking at an article by Plamen Ratchev (which I had printed a
> while ago but can't find it online anymore) where he states that the
> first thing a query does if there are 2 tables involved is create a
> Cartesian Product (cross Join) between each table. Then the ON filter is
> applied.
>
> If Table A has 20,000,000 rows and Table B has 30,000,000 rows - the
> system would read all the rows and put all the combinations together to
> get 600,000,000,000,000 rows? Then apply the ON filter???

That is what would happen if there wasn't an optimizer.

Keep in mind that SQL is a *declarative* language. You tell what want. The
DBMS figures out how to perform the operation in the most efficient manner.

But in order to get you what you want, you need to understand the rules
the computer works according to, and those are the rules than Plamen
describes.

The optimizer is free to recast the computation order as long as the result
is guaranteed to be the same.


--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
From: tshad on

"Erland Sommarskog" <esquel(a)sommarskog.se> wrote in message
news:Xns9DD1637D0EA91Yazorman(a)127.0.0.1...
> tshad (tfs(a)dslextreme.com) writes:
>> I was looking at an article by Plamen Ratchev (which I had printed a
>> while ago but can't find it online anymore) where he states that the
>> first thing a query does if there are 2 tables involved is create a
>> Cartesian Product (cross Join) between each table. Then the ON filter is
>> applied.
>>
>> If Table A has 20,000,000 rows and Table B has 30,000,000 rows - the
>> system would read all the rows and put all the combinations together to
>> get 600,000,000,000,000 rows? Then apply the ON filter???
>
> That is what would happen if there wasn't an optimizer.
>
Why?

Why wouldn't it just read both tables and take the rows that match the ON
filter?

> Keep in mind that SQL is a *declarative* language. You tell what want. The
> DBMS figures out how to perform the operation in the most efficient
> manner.
>
> But in order to get you what you want, you need to understand the rules
> the computer works according to, and those are the rules than Plamen
> describes.
>
> The optimizer is free to recast the computation order as long as the
> result
> is guaranteed to be the same.
>

I agree that that is what happens which is why it is confusing when thinking
about how joins work. I have always been told that you do the joins left to
right and that each join becomes the left condition of the next join.

For example,

Select *
From tableA a
join tableB b on a.id = b.id
join tablec c on b.id = c.id
join tabled d on c.id = d.id
where something.

In this case, a is joined with b, b is joined with the result of a and b, d
is joined with the results of a and b and c.

If the optimizer changed the order this is done then couldn't you possibly
have a different result, especially if dealing with left joins or there is
multiple criteria in one of the joins.

Thanks,

Tom

>
> --
> Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx


From: --CELKO-- on
When you have more of a background, you might want to read some ANSI/
ISO Standards. They are dull, boring and complicated and in a strange
language we called "Standard-speak" when i was on the committee. One
of the technical terms is "effectively" which means that we are about
to describe a way to do something (like that CROSS JOIN thing) which
will have some effect. But any process that produces that same result
is just as good.

I work out a full SELECT statement in SQL FOR SMARTIES as per the
formal process just to show how this could be done.
From: Erland Sommarskog on
tshad (tfs(a)dslextreme.com) writes:
> Why?
>
> Why wouldn't it just read both tables and take the rows that match the ON
> filter?

But how it read them? How would it know which fits together? Logically the
only way to do this is to create all combinations and then filter.

Oh, in practice the optimizer does it another way, but here is the matter
of finding a neutral description.

But as Celko points out, the SQL standards are written in a very formalistic
language, and they are defintely not trivial to understand.

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

 |  Next  |  Last
Pages: 1 2 3
Prev: Top 2 from count
Next: CREATE BLANK ROWS BETWEEN DATA