From: ksr on
Hello,

I am using the WinInet API FtpFindFirstFile to enumerate files and folders
on an FTP server. It works fine for filenames that contain English
characters, with file paths up to 260 characters. But for filenames that
contain Japanese characters it fails.
For Japanese filenames it works fine up to 128 characters, but fails on
longer filenames. It is a Unicode-compiled project. My question is:
why does it fail to read Japanese filenames of up to 260 characters?
I tried calling FtpFindFirstFileW explicitly, but that does not work either.
Please help.

Thank you,
ksr
From: Joseph M. Newcomer on
There are some problems with Unicode file names. For example, Unicode file names can
start with \\?\ and can therefore be more than MAX_PATH (260) characters long.
Unfortunately, a huge number of APIs assume that buffers will be MAX_PATH in length, and
a number of utility functions also assume the MAX_PATH limit.

Unfortunately, one such API is FindFirstFile; if you look at the WIN32_FIND_DATA
structure, you will see that the cFileName member is limited to MAX_PATH. So it sounds
like a bug that causes this to fail with long file names. Now when you say "long" file
names, I presume you mean just the filename part, e.g., you might have
d:\longnamehere\longernamehere\verylongname\*.*
as your pattern, and expect to get
AVeryLongFileNameThatContainsMoreThan...128Characters
back.

A couple of tests you might run to isolate this:
Try a filename component in a non-Japanese version of Windows that uses
only the characters A-Z0-9 and is more than 128 characters long.

Try a filename component in your Japanese version of Windows that uses
only the characters A-Z0-9.

This might help Microsoft determine where the failure is. Also, check whether or not your
Japanese characters require Unicode surrogates for UTF-16 encoding.

It sounds like a bug, but the more you can report, the more likely it is you will get a
fix. In particular, the failure *might* depend on specific characters in the file name (the
surrogate encoding).
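
One way to run the surrogate check is a small scan over the UTF-16 code units of the name. This is only a sketch: Utf16Scan and scan_utf16 are names made up for illustration, and it uses std::u16string so it compiles anywhere, whereas in a Windows build the name would arrive as a wchar_t string. Most everyday Japanese text (kana and common kanji) does not need surrogates, so a positive result points at less common characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Result of scanning a UTF-16 file name (hypothetical helper type).
struct Utf16Scan {
    std::size_t code_units;   // length in 16-bit units, as wcslen() would count
    std::size_t code_points;  // a well-formed surrogate pair counts once
    bool has_surrogates;      // true if any code unit lies in D800..DFFF
};

Utf16Scan scan_utf16(const std::u16string& name) {
    Utf16Scan r{name.size(), 0, false};
    for (std::size_t i = 0; i < name.size(); ++i) {
        const char16_t u = name[i];
        if (u >= 0xD800 && u <= 0xDFFF) {
            r.has_surrogates = true;
            const bool lead = u <= 0xDBFF;
            const bool trail_follows = i + 1 < name.size() &&
                name[i + 1] >= 0xDC00 && name[i + 1] <= 0xDFFF;
            if (lead && trail_follows) ++i;  // consume the pair as one code point
        }
        ++r.code_points;
    }
    assert(r.code_points <= r.code_units);
    return r;
}
```

If has_surrogates comes back true for the failing names and false for the working ones, that is worth including in the report.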

I've not had experience here, but it sounds like a genuine bug. The fact that it happens
at *almost* half of MAX_PATH is very suspect (one guess: some code is sent a character count,
thinks it is a byte count, and truncates by dividing it by 2 again).
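
To make that guess concrete, here is the arithmetic as a sketch. The function names are invented for illustration and nothing here is taken from WinInet itself; it only shows how halving a character count a second time would land near the reported threshold. It does not explain why plain English names are unaffected, so treat it as one hypothesis among several.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxPath = 260;  // value of MAX_PATH in the Windows headers

// Correct when the input really is a byte count of UTF-16 data.
constexpr std::size_t bytes_to_chars(std::size_t byte_count) {
    return byte_count / 2;
}

// Hypothetical bug: the caller passes a character count, but it is treated
// as a byte count and halved a second time.
constexpr std::size_t misread_capacity(std::size_t char_count) {
    return bytes_to_chars(char_count);
}

// True if a name of name_length characters plus its terminating NUL fits.
bool fits(std::size_t name_length, std::size_t capacity) {
    assert(capacity > 0);
    return name_length + 1 <= capacity;
}
```

With these definitions misread_capacity(kMaxPath) is 130, so a name of 129 characters still fits and one of 130 does not, which is why the exact failing length matters.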

Is it exactly 128, or is it 129 or 130? This may be important. If you report this, you may
have to give the actual byte sequences that encode the file name, in case it is
data-sensitive.
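
A minimal sketch for producing that dump, assuming the name is held as UTF-16: dump_utf16 is a made-up helper that prints each 16-bit code unit in hex. It shows code units rather than raw bytes, so the byte order in memory is not visible in the output.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats each UTF-16 code unit as four uppercase hex digits, space separated.
std::string dump_utf16(const std::u16string& name) {
    std::string out;
    for (char16_t u : name) {
        char buf[5];  // four hex digits plus the terminating NUL
        const int n = std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(u));
        assert(n == 4);
        (void)n;  // silence the unused warning when asserts are compiled out
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For example, dump_utf16(u"A\u3042") gives "0041 3042"; pasting such a dump for a failing and a working name makes the report reproducible.
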
joe
On Mon, 28 Sep 2009 16:20:35 -0700 (PDT), ksr <sujatha.kokkirala(a)gmail.com> wrote:

>Hello,
>
>I am using WinInet API FtpFindFirstFile to enumerate files and folders
>on FTP server. It works fine for filenames that have english
>characters and filepath upto 260 characters. But for filenames that
>have Japanese characters it fails.
>For Japanese filenames it works fine upto 128 characters, but fails on
>longer filenames. It is a unicode compiled project, my question is,
>why is it failing to read upto 260 characters for japanese filenames.
>I tried by explicitly using FtpFindFirstFileW, but it does not work.
>Please help.
>
>Thank you,
>ksr
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Scot T Brennecke on

Note that the OP inquired about the FTP variant. Does your knowledge/advice on this subject also apply to
FtpFindFirstFile Function (Windows)?:
http://msdn.microsoft.com/en-us/library/aa384146(VS.85).aspx
From: Joseph M. Newcomer on
See below
On Tue, 29 Sep 2009 01:50:34 -0500, Scot T Brennecke <ScotB(a)Spamhater.MVPs.org> wrote:

>Note that the OP inquired about the FTP variant. Does your knowledge/advice on this subject also apply to
>FtpFindFirstFile Function (Windows)?:
>http://msdn.microsoft.com/en-us/library/aa384146(VS.85).aspx

Note that FtpFindFirstFile uses the same WIN32_FIND_DATA structure as FindFirstFile. I was
surprised when I looked this up, but it does mean there is the same limitation.

The reason I was asking about surrogates is a question about what the FTP layer might be
doing in filling in this structure.

Sadly, someone believed the completely silly nonsense about having only one hyperlink per
page for a hyperlinked term, and therefore it is hard to find the hyperlink that actually
leads you to the WIN32_FIND_DATA structure (it is at the top, and if you scroll it off,
the next several instances are not hyperlinked). I believe this idiocy is due to
unemployable English majors who feel they have to impose nonsensical standards unsuitable
for hypertext documents because someone once told them this in a class.
joe

From: Mihai N. on
> I am using WinInet API FtpFindFirstFile to enumerate files and folders
> on FTP server. It works fine for filenames that have english
> characters and filepath upto 260 characters. But for filenames that
> have Japanese characters it fails.
> For Japanese filenames it works fine upto 128 characters, but fails on
> longer filenames. It is a unicode compiled project, my question is,
> why is it failing to read upto 260 characters for japanese filenames.
> I tried by explicitly using FtpFindFirstFileW, but it does not work.
> Please help.

I would try connecting to the FTP server with telnet to see whether
it supports RFC 2640 ("Internationalization of the File Transfer Protocol").
Most servers don't.

If it is supported, then I would do some digging to see if FtpFindFirstFile
understands it. It is possible that it does not.
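
As a sketch of the server-side check: RFC 2640 has the server advertise a "UTF8" feature line in its reply to the FEAT command, so after typing FEAT in the telnet session you can look for that line. The helper below (feat_lists_utf8 is a made-up name) does the same test on a captured reply; it assumes the usual multi-line 211 reply layout and compares leniently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Returns true if a captured FEAT reply contains a feature line "UTF8".
bool feat_lists_utf8(const std::string& feat_reply) {
    std::istringstream in(feat_reply);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // blank line
        const std::size_t last = line.find_last_not_of(" \t\r");
        assert(last != std::string::npos && last >= first);
        std::string feature = line.substr(first, last - first + 1);
        // Upper-case the trimmed line so "utf8" is accepted as well.
        for (char& c : feature) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
        if (feature == "UTF8") return true;
    }
    return false;
}
```

A false result only says the server does not advertise the feature; it says nothing about what FtpFindFirstFile does with the listing.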

If it works for short Japanese file names, but not for longer ones,
I would suspect some buffer length parameter is wrong.
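
To see how a length mismatch could arise, here is a sketch that computes how many bytes a UTF-16 name needs once encoded as UTF-8. It assumes the name travels as UTF-8 (as RFC 2640 would have it) and that some intermediate buffer is sized at 260 bytes; both are assumptions, and utf8_length is a made-up helper, not anything from WinInet.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes needed to encode a UTF-16 string as UTF-8.
// An unpaired surrogate is counted as 3 bytes, the size of U+FFFD.
std::size_t utf8_length(const std::u16string& s) {
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;
        } else if (u < 0x800) {
            bytes += 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < s.size() &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;  // surrogate pair: one code point outside the BMP
            ++i;
        } else {
            bytes += 3;  // rest of the BMP, which covers kana and common kanji
        }
    }
    assert(bytes >= s.size());
    return bytes;
}
```

Under these assumptions 128 kana or kanji take 384 bytes, and a 260-byte buffer would already overflow at 87 such characters. The reported threshold of 128 would fit a two-byte-per-character encoding (about 256 bytes) better, but that is only an inference from the arithmetic, not a confirmed diagnosis.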


--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email