From: david on
Hi,

Can anyone point me to a reference or discuss the best way to store a
vector of 120 to 480 numbers in the database? Rows seem to be out
since we would quickly top the billion row mark. A table with 480
columns is too unnormalized. A single varchar(max) column? This
seems the best answer for now unless there is a more efficiant way of
storing it.

Thanks for any help or opinions,

David
From: --CELKO-- on
I think of a vector as a particular kind of mathematical structure and
you seem to be talking about a list of some kind. Vectors have a fixed
number of dimensions, etc. Here is a guess:

CREATE TABLE Vectors
(vector_id CHAR(3) NOT NULL, --whatever
dim_nbr INTEGER NOT NULL,
CHECK (dim_nbr BETWEEN 1 AND 480),
PRIMARY KEY (vector_id, dim_nbr),
dim_val INTEGER NOT NULL);

Do you need to enforce the lower limit of 120 or not?
From: Erland Sommarskog on
david (daveg.1(a)comcast.net) writes:
> Can anyone point me to a reference or discuss the best way to store a
> vector of 120 to 480 numbers in the database? Rows seem to be out
> since we would quickly top the billion row mark. A table with 480
> columns is too unnormalized. A single varchar(max) column? This
> seems the best answer for now unless there is a more efficiant way of
> storing it.

It depends on what you want to do with the data.

There is no issue per se if you get a billion rows in a table. The table
wouldn't be that much bigger than if you instead have a 480-colunm table,
and the total size is the biggest concern. (The table would be bigger
with many narrow rows, though, since there is a certain overhead per row.)

The key is: do you have any need to access the individual points in the
vector from SQL? If you do, you should store them as rows, or else it
will be very painful.

But if all manipulation of the details of the vector will happen outside
SQL Server, there can indeed be reason for a more compact format. In
that case, I would use varbinary(MAX). An alternative that would permit
some access to indvidual points in the vector is to implement a user-
defined data type in the CLR. Assume that the points are floats, this
would only be an option if you are on SQL 2008, since UDTs are limited
to 8000 bytes on SQL 2005.

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: david on
On Jul 8, 6:50 pm, --CELKO-- <jcelko...(a)earthlink.net> wrote:
> I think of a vector as a particular kind of mathematical structure and
> you seem to be talking about a list of some kind. Vectors have a fixed
> number of dimensions, etc. Here is a guess:
>
> CREATE TABLE Vectors
> (vector_id CHAR(3) NOT NULL, --whatever
>  dim_nbr INTEGER NOT NULL,
>    CHECK (dim_nbr BETWEEN 1 AND 480),
>  PRIMARY KEY (vector_id, dim_nbr),
>  dim_val INTEGER NOT NULL);
>
> Do you need to enforce the lower limit of 120 or not?

Joe,

Thank you for your response.

The vectors I was referring to had to do with the length in months of
loans, from 10 to 40 years hence 120 to 480 numbers to be used as
adjustments.

I spoke with the senior developer today and he had already solved the
problem and tested the speed of the solution (as rows). He was going
to submit it today for review and upgrade to QA. My fault for not
checking the need after the junior guy asked for help.

Thanks again.

David
From: david on
Erland,

Thank you for your response.

The vectors I was referring to had to do with the length in months of
loans, from 10 to 40 years hence 120 to 480 numbers to be used as
adjustments.

I spoke with the senior developer today and he had already solved the
problem and tested the speed of the solution (as rows). He was going
to submit it today for review and upgrade to QA. My fault for not
checking with him after the junior guy asked for help.

Thanks again.

David