From: m on
If the transaction volume is low (i.e. less than ten per employee per year
for a small company), then this solution will be okay, but once the
transaction volume increases, you will have no choice except to benchmark
account transactions into a new stored datum. It may not be much to compute
the balance for a single employee, but in an organization with half a
million employees, some of whom may have 30+ years of transactions, it
becomes much more onerous to re-compute the totals on demand. And in
applications where there may be millions of transactions per day, it
absolutely makes sense to materialize and store the computed information -
which, much as a Jacobian can transform relations in n-space into functions
in m-space, means the benchmarked computed data can form the basis of a 1NF
schema - usually in a different system, however.
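
A minimal sketch of what I mean by benchmarking, reusing the
EducationAccounts table from the view quoted below, with a hypothetical
EducationBenchmarks table (the names and dates are only illustrative):

-- Hypothetical benchmark table: one row per employee per period,
-- holding the materialized total as of the benchmark date.
CREATE TABLE EducationBenchmarks
(emp_id         INTEGER       NOT NULL,
 benchmark_date DATE          NOT NULL,
 claim_tot      DECIMAL(10,2) NOT NULL,
 PRIMARY KEY (emp_id, benchmark_date));

-- Periodic batch job: roll all claims before the cut-off into a benchmark row.
INSERT INTO EducationBenchmarks (emp_id, benchmark_date, claim_tot)
SELECT E.emp_id, '2010-01-01', SUM(E.claim_amt)
  FROM EducationAccounts AS E
 WHERE E.claim_date < '2010-01-01'
 GROUP BY E.emp_id;

-- Queries then read the benchmark plus only the current period's
-- transactions instead of scanning decades of history.
SELECT B.emp_id,
       B.claim_tot + COALESCE(SUM(E.claim_amt), 0.00) AS claim_tot
  FROM EducationBenchmarks AS B
  LEFT OUTER JOIN EducationAccounts AS E
    ON E.emp_id = B.emp_id
   AND E.claim_date >= B.benchmark_date
 WHERE B.benchmark_date = '2010-01-01'
 GROUP BY B.emp_id, B.claim_tot;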


"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:64b96480-5514-4af8-8e1c-d14a3ff3ee9e(a)t32g2000prg.googlegroups.com...
> CREATE VIEW EducationAccountSummary (emp_id, claim_tot,
> claim_balance, most_recent_claim_date)
> AS
> SELECT P.emp_id, SUM(E.claim_amt) AS claim_tot,
> 600.00 - SUM(E.claim_amt) AS claim_balance,
> MAX(E.claim_date) AS most_recent_claim_date
> FROM Personnel AS P
> LEFT OUTER JOIN
> EducationAccounts AS E
> ON P.emp_id = E.emp_id
> GROUP BY P.emp_id;
>
From: Tony Rogerson on
Absolutely right, m. I do a few days a week for a debt management company;
over the last 3 years I've been steadily migrating them from flat files into
SQL Server.

We have accounts, and accounts may have hundreds of transactions; we need
rolling balances. Doing it Celko's way just does not scale; I use a trigger
to populate rolling totals as the transactions come in (there is additional
logic to calculate other things, like retention, which is a complex
calculation).

Some would argue for putting that complex calculation in the middle tier;
I'd argue back: have you actually tried that in the real world?

Use a trigger and have summary tables or additional columns on the row.
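
A rough sketch of that trigger approach in T-SQL, with made-up
AccountTransactions and AccountBalances tables standing in for the real
schema (the real thing carries the extra retention logic):

-- Hypothetical tables; the production schema obviously has more columns.
CREATE TABLE AccountBalances
(account_id      INTEGER       NOT NULL PRIMARY KEY,
 rolling_balance DECIMAL(12,2) NOT NULL DEFAULT 0.00);

CREATE TABLE AccountTransactions
(txn_id     INTEGER IDENTITY(1,1) PRIMARY KEY,
 account_id INTEGER       NOT NULL REFERENCES AccountBalances (account_id),
 txn_amt    DECIMAL(12,2) NOT NULL,
 txn_date   DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP);
GO

-- Maintain the rolling balance as transactions come in.
-- Aggregating the inserted pseudo-table keeps multi-row inserts correct.
CREATE TRIGGER trg_AccountTransactions_Insert
ON AccountTransactions
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE B
       SET rolling_balance = B.rolling_balance + I.txn_sum
      FROM AccountBalances AS B
      JOIN (SELECT account_id, SUM(txn_amt) AS txn_sum
              FROM inserted
             GROUP BY account_id) AS I
        ON I.account_id = B.account_id;
END;
GO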

Tony.

"m" <m(a)b.c> wrote in message news:ulJgt1hyKHA.3264(a)TK2MSFTNGP06.phx.gbl...
> If the transaction volume is low (i.e. less then ten per employee per year
> for a small company), then this solution will be okay, but once the
> transaction volume increases, then you will have no choice except to
> benchmark account transactions to form a new datum. It may not be much to
> compute the balance for a single employee, but in an organization with
> half a million employees, some of whom may have 30+ years of transactions,
> it becomes much more onerous to re-compute the totals on demand. And in
> applications where there may be millions of transactions per day, then it
> absolutely makes sense to realize and store the computed information -
> which, paralleling the way as a Jacobin can transform n-space complex
> relations to m-space functions, said benchmarked computed data can form
> the basis of an 1nf schema - usually in different systems however.
>
>
> "--CELKO--" <jcelko212(a)earthlink.net> wrote in message
> news:64b96480-5514-4af8-8e1c-d14a3ff3ee9e(a)t32g2000prg.googlegroups.com...
>> CREATE VIEW EducationAccountSummary (P.emp_id, claim_tot,
>> claim_balance, most_recent_claim_date)
>> AS
>> SELECT P.emp_id, SUM(E.claim_amt) AS claim_tot,
>> 600.00 � SUM(E.claim_amt) AS claim_balance,
>> MAX(E.claim_date) AS most_recent_claim_date
>> FROM Personnel AS P
>> LEFT OUTER JOIN
>> EducationAccounts AS E
>> ON P.emp_id = E.emp_id
>> GROUP BY P.emp_id;
>>
From: --CELKO-- on
>> It may not be much to compute the balance for a single employee, but in an organization with half a million employees, some of whom may have 30+ years of transactions, it becomes much more onerous to re-compute the totals on demand.  <<

Commercial applications do not have decades of data on line; they have
to close out their books at the end of a fiscal reporting period
(year, quarter or whatever) and archive it. The real work is done on
data in the current fiscal reporting period. Even data warehouses are
not that deep -- Wal-Mart keeps 18 months of data, and theirs is one of
the largest DWs on Earth.

The only scenario I can imagine is a scientific DB with a long history
of measurements. But the math you do on them is not a simple running
total; it would feed a statistical package. It would be a very
different DB than what we run on SQL Server.

During the current reporting period, there will be corrections that
appear out of temporal sequence. The most common one that comes to
mind is that deposits are applied before debits on the same day, no
matter when they were timestamped. When I worked at a bank, this was a
huge percentage of the retail banking volume (payroll deposits were
actually made after withdrawals when we loaded company payroll tapes
after the close of business). Keeping a running total in the accounts
in real time would have been a disaster on paydays.
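
To make that concrete, here is a sketch (the table and column names are
invented, and it assumes a SQL Server version with windowed SUM ... OVER)
of how a close-of-business recomputation can apply all of a day's deposits
before its debits, no matter what the timestamps say - something a stored
real-time running total cannot easily do:

-- Hypothetical ledger: txn_amt > 0 for deposits, < 0 for withdrawals.
-- The CASE in the ORDER BY sorts each day's deposits ahead of its debits,
-- so the running balance reflects the posting rule, not clock order.
SELECT account_id, txn_ts, txn_amt,
       SUM(txn_amt) OVER (PARTITION BY account_id
                          ORDER BY CAST(txn_ts AS DATE),
                                   CASE WHEN txn_amt >= 0 THEN 0 ELSE 1 END,
                                   txn_ts
                          ROWS UNBOUNDED PRECEDING) AS running_balance
  FROM BankTransactions;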

I don't know what the percentage is today, but a computer can do the
math faster than it can read a disk.

If you want a more theoretical reason, you don't store computed
columns in an RDBMS. Different levels of aggregation, mixing data and
metadata, blah, blah, blah.
From: m on
Not wanting to provoke much further debate, but feeling compelled to
respond:

Retail sales (e.g. Wal-Mart) is certainly a high transaction volume
business, but the transactions are nearly all independent. In retail wealth
management, however, you would be very upset if your broker didn't have
records of your transactions just because they happened last year; and even
more upset if your banker didn't keep complete records of your mortgage
payments, spanning perhaps the last couple of decades - unless of course
your performance on that loan wasn't quite what it ought to have been ;)
Similarly, escrow receipt and inventory reconciliation business functions
often span decades, and are fundamentally concerned with what is rather than
how it has changed.

The point is not that 1NF etc. is bad - absolutely it is not - but rather
that 1NF for a particular application can depend on one's point of view and
on what the system is designed to achieve. Transaction processing systems
must work with transactions; inventory systems must work with holdings; and
appointment calendars must work with days and meetings - and in my case,
must work out how I am to attend the three overlapping ones where I am
listed as a required attendee!

For the OP's case, he has low transaction volume per employee, but the
time over which an employee's records must be kept is long (years). And
since it is unlikely that every employee will use the complete allowance
each year, and carry-forward is allowed, the simplest design for a large
number of employees is a monthly or annual benchmark. This design requires
two systems in one (transaction processing + inventory) but results in a
storage-efficient and easily queried schema; the drawback is the need for
batch processing outside of the RDBMS and some latency between transaction
posting and inventory reconciliation.
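
A minimal sketch of that annual benchmark (the names, the 600.00 figure and
the carry-forward rule are assumptions, not the OP's actual rules):

-- Hypothetical annual benchmark: one row per employee per fiscal year,
-- with the unused allowance carried forward into the next year.
CREATE TABLE EducationBenchmarksAnnual
(emp_id        INTEGER       NOT NULL,
 fiscal_year   INTEGER       NOT NULL,
 allowance_amt DECIMAL(10,2) NOT NULL,  -- 600.00 plus any carry-forward
 claim_tot     DECIMAL(10,2) NOT NULL,
 PRIMARY KEY (emp_id, fiscal_year));

-- Year-end batch: close 2009 and open 2010 with the carried-forward balance.
INSERT INTO EducationBenchmarksAnnual (emp_id, fiscal_year, allowance_amt, claim_tot)
SELECT B.emp_id, 2010,
       600.00 + (B.allowance_amt - B.claim_tot),  -- new allowance + unused balance
       0.00
  FROM EducationBenchmarksAnnual AS B
 WHERE B.fiscal_year = 2009;

-- Periodic reconciliation from the transaction side into the inventory side.
UPDATE B
   SET claim_tot = (SELECT COALESCE(SUM(E.claim_amt), 0.00)
                      FROM EducationAccounts AS E
                     WHERE E.emp_id = B.emp_id
                       AND YEAR(E.claim_date) = B.fiscal_year)
  FROM EducationBenchmarksAnnual AS B
 WHERE B.fiscal_year = 2010;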

BTW: the problems you are describing with payroll are likely the effect of
check clearing and the cash-forward service offered by many retail banks.
The problem relates mostly to the inter-institution delivery of physical
currency and to the way that many banking systems book transactions, but it
has nothing to do with the client liability or accounting, except in the
minds of those who forget that the bank is _paid_ to take that risk.

"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:a9441de8-4961-47d6-8135-50089f39dca3(a)a16g2000pre.googlegroups.com...
>>> It may not be much to compute the balance for a single employee, but in
>>> an organization with half a million employees, some of whom may have 30+
>>> years of transactions, it becomes much more onerous to re-compute the
>>> totals on demand. <<
>
> Commercial applications do not have decades of data on line; they have
> to close out their books at the end of a fiscal reporting period
> (year, quarter or whatever) and archive it. The real work is done on
> data in the current fiscal reporting period. Even data warehouses are
> not that deep -- Wal-Mart keeps 18 months of data and they are one of
> the largest DW on Earth.
>
> The only scenario I can imagine is a scientific DB with a long history
> of measurements. But the math you do on them is not a simple running
> total; it would feed a statistical package. It woudl be a very
> different DB than what we run on SQL Server.
>
> During the current reporting period, there will be corrections that
> appear out of temporal sequence. The most common one that comes to
> mind is that deposits are applied before debits on the same day, no
> matter when they were timestamped. When I worked at a bank, this was a
> huge percentage of the retail banking volume (payroll deposits were
> actually made after withdrawals when we loaded company payroll tapes
> after the close of business). Keeping a running total in the accounts
> in real time would have been a disaster on paydays.
>
> I don't know what the percentage is today, but a computer can do the
> math faster than it can read a disk.
>
> If you want a more theoretical reason, you don't store computed
> columns in an RDBMS. Different levels of aggregation, mixing data and
> metadata, blah, blah, blah.

From: Tony Rogerson on
> Commercial applications do not have decades of data on line; they have
> to close out their books at the end of a fiscal reporting period

Obviously never worked in the financial sector.

Obviously not much experience in business "period".

Do you actually try any of this theory on large data sets with multiple
concurrent users? Or, as I suspect, do you tell your clients something
entirely different when you consult?

--ROGGIE--

"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:a9441de8-4961-47d6-8135-50089f39dca3(a)a16g2000pre.googlegroups.com...
>>> It may not be much to compute the balance for a single employee, but in
>>> an organization with half a million employees, some of whom may have 30+
>>> years of transactions, it becomes much more onerous to re-compute the
>>> totals on demand. <<
>
> Commercial applications do not have decades of data on line; they have
> to close out their books at the end of a fiscal reporting period
> (year, quarter or whatever) and archive it. The real work is done on
> data in the current fiscal reporting period. Even data warehouses are
> not that deep -- Wal-Mart keeps 18 months of data and they are one of
> the largest DW on Earth.
>
> The only scenario I can imagine is a scientific DB with a long history
> of measurements. But the math you do on them is not a simple running
> total; it would feed a statistical package. It woudl be a very
> different DB than what we run on SQL Server.
>
> During the current reporting period, there will be corrections that
> appear out of temporal sequence. The most common one that comes to
> mind is that deposits are applied before debits on the same day, no
> matter when they were timestamped. When I worked at a bank, this was a
> huge percentage of the retail banking volume (payroll deposits were
> actually made after withdrawals when we loaded company payroll tapes
> after the close of business). Keeping a running total in the accounts
> in real time would have been a disaster on paydays.
>
> I don't know what the percentage is today, but a computer can do the
> math faster than it can read a disk.
>
> If you want a more theoretical reason, you don't store computed
> columns in an RDBMS. Different levels of aggregation, mixing data and
> metadata, blah, blah, blah.