From: Craig on
I'm using SQL 2008.

I have enormous CSV files that I'm importing into SQL. The format of all
the files is like this: data, data, , , data, data, , data, , , , data

There are no text qualifiers.

I create a new table by copying one that I've manually imported through
"import data", then I bulk insert with this code:

BULK INSERT filename
FROM '\\Servername\filename.csv'
WITH
(
FIRSTROW = 2, -- (first row is blank, fyi)
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n'
)

However, I get these errors:

Msg 4866, Level 16, State 8, Line 1
The bulk load failed. The column is too long in the data file for row 1,
column 16. Verify that the field terminator and row terminator are specified
correctly.

Msg 7301, Level 16, State 2, Line 1
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB
provider "BULK" for linked server "(null)".

I don't know what this means.

PS: The reason I'm using BULK INSERT is that all the CSV files have a
blank row at the top. They're pretty big, too (over 10 GB), so I can't
just open them up in a text editor and delete the first row.

Any ideas? Thanks.
--
Craig
From: David Hay on
Craig,

One, I would specify the DATAFILETYPE. I'd also set MAXERRORS. It
should keep processing and report on which rows are in error; that
should give you an idea of how to find the problem rows. There are
some text editors out there, like largeedit, that will let you pull
the file up. I hate saying this one, but one way we found bad records
was to pull the file into Access with the import wizard. It created a
table with all the import errors.
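
For instance, a sketch with those options plus ERRORFILE (the table
name and the file paths below are placeholders; ERRORFILE makes the
engine write the rejected rows out to a file you can inspect):

BULK INSERT dbo.YourStagingTable
FROM '\\Servername\filename.csv'
WITH
(
FIRSTROW = 2,
DATAFILETYPE = 'char', -- plain character data
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n',
MAXERRORS = 1000, -- keep loading past bad rows instead of aborting
ERRORFILE = 'C:\temp\filename_errors.log' -- rejected rows land here
)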

Good luck!

From BOL:

Copies a data file into a database table or view in a user-specified
format.

Syntax
BULK INSERT [ [ 'database_name'.] [ 'owner' ].] { 'table_name' FROM
'data_file' }
[ WITH
(
[ BATCHSIZE [ = batch_size ] ]
[ [ , ] CHECK_CONSTRAINTS ]
[ [ , ] CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ] ]
[ [ , ] DATAFILETYPE [ = { 'char' | 'native' | 'widechar' | 'widenative' } ] ]
[ [ , ] FIELDTERMINATOR [ = 'field_terminator' ] ]
[ [ , ] FIRSTROW [ = first_row ] ]
[ [ , ] FIRE_TRIGGERS ]
[ [ , ] FORMATFILE = 'format_file_path' ]
[ [ , ] KEEPIDENTITY ]
[ [ , ] KEEPNULLS ]
[ [ , ] KILOBYTES_PER_BATCH [ = kilobytes_per_batch ] ]
[ [ , ] LASTROW [ = last_row ] ]
[ [ , ] MAXERRORS [ = max_errors ] ]
[ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
[ [ , ] ROWS_PER_BATCH [ = rows_per_batch ] ]
[ [ , ] ROWTERMINATOR [ = 'row_terminator' ] ]
[ [ , ] TABLOCK ]
)
]




From: Erland Sommarskog on
Craig (Craig(a)discussions.microsoft.com) writes:
> I have enormous CSV files that I'm importing into SQL. The format of all
> the files is like this: data, data, , , data, data, , data, , , , data
>
> There are no text qualifiers.
>
> I create a new table by copying one that I've manually imported through
> "import data", then I bulk insert with this code:
>
> BULK INSERT filename
> FROM '\\Servername\filename.csv'
> WITH
> (
> FIRSTROW = 2, -- (first row is blank, fyi)
> FIELDTERMINATOR = ',',
> ROWTERMINATOR = '\r\n'
> )
>
> However, I get these errors:
>
> Msg 4866, Level 16, State 8, Line 1
> The bulk load failed. The column is too long in the data file for row 1,
> column 16. Verify that the field terminator and row terminator are
> specified correctly.

Is that first row just an empty line, or is there a ton of spaces on it?

When you say FIRSTROW=2, this does not mean that BULK INSERT will skip the
first line in the file. It means that it will skip the first record
it identifies. So that first blank line will be prepended to the
first field on the second line in the file. BULK INSERT reads a binary
stream, and does not think in terms of lines.
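
To make that concrete, a minimal sketch: say the file begins with the
byte stream

\r\naaa,bbb\r\nccc,ddd\r\n

With FIELDTERMINATOR = ',', the first field of the first record is
everything up to the first comma, that is '\r\naaa'. So record one is
the blank line glued to your first data line, and FIRSTROW = 2 skips
both at once.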

I would first try using the command-line tool BCP. BCP is the same concept
as BULK INSERT, and the two work largely the same. But they are different
implementations, and maybe BCP is able to cope with the file.
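
For instance, something like this (server, database and table names
are placeholders):

bcp YourDb.dbo.YourTable in \\Servername\filename.csv -c -t, -r\r\n -F2 -S YourServer -T -e C:\temp\bcp_errors.txt

Here -c is character mode, -t and -r set the field and row
terminators, -F2 skips the first record, -T uses Windows
authentication, and -e names a file where bcp writes the rows it
fails to load.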

If that does not work out, I would try to use a format file that
goes:

9.0
5
1 SQLCHAR 0 0 "\n" 0 "" ""
2 SQLCHAR 0 0 "," 1 col1 ""
3 SQLCHAR 0 0 "," 2 col2 ""
4 SQLCHAR 0 0 "," 3 col3 ""
5 SQLCHAR 0 0 "\r" 4 col4 ""

This example assumes that the table has five columns. I don't think
BULK INSERT will like this, but BCP may.
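
If you save that as, say, skipblank.fmt (the name is just a
placeholder), the corresponding BCP call would look something like:

bcp YourDb.dbo.YourTable in \\Servername\filename.csv -f skipblank.fmt -S YourServer -T

The trick is the first field in the format file: it reads everything
up to the next \n and maps it to server column 0, which means
"discard". That swallows the leading blank line before the first
record, and the dangling \n left over from each \r\n on the records
after that.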



--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: Craig on
Ok, thanks for the tip.
--
Craig


"David Hay" wrote:

> [quoted text snipped]