From: Trenton D. Adams on
If I mount my USB key with the "sync" option, I get transfer speeds of
500 KB/s or less. If I use the GNOME defaults, I get 60 MB/s+ for a while,
and then it drops continually over time, back down to 500 KB/s. The GNOME
defaults are:

/dev/sdc1 on /media/FLASH type vfat
(rw,nosuid,nodev,noatime,uhelper=hal,shortname=lower,flush,uid=500)

I have done similar tests on both a Rally2 64G USB stick and SanDisk
Ultra (15M/s) SDHC 8G cards. I get lousy performance on both unless
I set dirty_bytes. Both are FAT32. As you can see below, 14 minutes to
transfer less than a couple of gigs is a little nutty; the 3 minutes is
a lot nicer. I am using 2.6.33 with a patch from
https://bugzilla.kernel.org/show_bug.cgi?id=15374

As an example, check out this rsync run:
time rsync -v --progress /home/share/*.avi /media/disk/
1.avi
709911016 100% 8.88MB/s 0:01:16 (xfer#1, to-check=1/2)
2.avi
621254748 100% 8.07MB/s 0:01:13 (xfer#2, to-check=0/2)

sent 1331328404 bytes received 50 bytes 1510298.87 bytes/sec
total size is 1331165764 speedup is 1.00

real 14m40.863s
user 0m8.473s
sys 0m9.525s

It really looks like there's a scheduling issue. It seems as if the
system is I/O thrashing on the flash drive, and performance bounces all
over the place. Sometimes it's really low, like 2.73M/s, and other
times it's really fast, like 28.86M/s. Although you can't see it
there, there were times when rsync was registering 200 KB/s. None of
these figures is really accurate, since everything is queued for
writing, but the final result of 1.5M/s (calculated from the "real"
time) is terrible.
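
The 1.5M/s figure can be reproduced from the rsync output above. A minimal sketch, assuming the byte count is the "sent" value and the elapsed time is the "real" value reported by `time`; the helper names are made up for illustration:

```python
import re

def parse_time_output(text):
    """Convert a shell `time` value such as '14m40.863s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def effective_rate(bytes_sent, real_time):
    """Bytes per second of wall-clock time, including the final flush."""
    return bytes_sent / parse_time_output(real_time)

# Values copied from the rsync run above.
rate = effective_rate(1331328404, "14m40.863s")
print(f"{rate / 1e6:.2f} MB/s")
```

Because the wall-clock time includes the wait for the page cache to drain, this number is a better guide to real device throughput than the per-file rates rsync prints while data is still being queued.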

I have not seen performance this bad on a normal USB drive, only on
my USB flash drive, which is FAT32. In addition, Windows and Mac
systems easily reach 9M/s write speeds on my Rally2.

If I do the following...
echo 16000000 > /proc/sys/vm/dirty_bytes
the performance is 9-12M/s all the way through the transfer. It is
also interesting to note that it hung for 2-5 seconds in the middle,
and then continued, for whatever reason. Perhaps it was waiting for a
response from the flash drive. Anyhow, I don't like this solution,
because it affects the entire system. And, it's really nice to have
100M+/s on the local disks, at times when I'm writing 50-500M and I'm
done.

Now check out this rsync, with the dirty_bytes set. I see a VERY
consistent transfer speed, which may be "marginally" slower than Mac
or Windows, but I'd have to test to be sure.

tdanotebook sync # time rsync -v --progress /home/share/*.avi /media/disk/
1.avi
709911016 100% 7.16MB/s 0:01:34 (xfer#1, to-check=1/2)
2.avi
621254748 100% 7.58MB/s 0:01:18 (xfer#2, to-check=0/2)

sent 1331328404 bytes received 50 bytes 7673362.85 bytes/sec
total size is 1331165764 speedup is 1.00

real 2m53.456s
user 0m8.325s
sys 0m9.453s

I'm wondering if there's a way of having flash drives default to a
dirty_bytes of 16M or less, while all other I/O devices use the normal
default values? Perhaps it could become a mount option, or be
configurable through /proc or sysctl? Or is there a way of improving
the "sync" option for these devices, seeing as that's what they really
should be using? Windows/Mac have them in "sync" mode, or at least a
semi-sync mode, by default. Does sync mode just default to 512-byte
blocks, and is that why it's so slow?

Or, it may be nice for the system to automatically match the
dirty_bytes to the actual speed of the physical device, adjusting it
dynamically upward until it matches the transfer MB/s.
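
One way to picture this suggestion is to size the dirty limit so that flushing it takes a fixed amount of time at the device's measured write speed. This is only a sketch of the idea, not anything the kernel does: the function name, the 2-second window and the 1 MB floor are all assumptions chosen for illustration.

```python
def suggest_dirty_bytes(device_bytes_per_sec, window_seconds=2.0,
                        floor_bytes=1_000_000):
    """Dirty limit sized so a full flush takes about window_seconds.

    window_seconds and floor_bytes are hypothetical tuning values.
    """
    if device_bytes_per_sec <= 0:
        raise ValueError("device throughput must be positive")
    return max(floor_bytes, int(device_bytes_per_sec * window_seconds))

# A flash device that sustains roughly 8 MB/s of writes.
print(suggest_dirty_bytes(8_000_000))
```

With an 8 MB/s device and a 2-second window the result is 16000000, the same order as the value that worked well in the tests above, while a fast local disk would get a proportionally larger limit.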
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Theodore Tso on

On May 3, 2010, at 11:52 PM, Trenton D. Adams wrote:
> If I mount my usb key with "sync" option, I get 500kb or less transfer
> speeds. If I use the gnome defaults, I get 60M+ for awhile, and then
> it continually drops over time, down to the 500kb/s again. Gnome
> defaults are...
>
> /dev/sdc1 on /media/FLASH type vfat
> (rw,nosuid,nodev,noatime,uhelper=hal,shortname=lower,flush,uid=500)
>
> I have done similar tests on both Rally2 64G usb stick and sandisk
> ultra (15M/s) SDHC 8G cards. I get lousy performance on both, unless
> I set dirty bytes. These are both FAT 32. But, as you can see below,
> 14 minutes to transfer less than a couple gigs is a little nutty. The
> 3 minutes is a lot nicer. I am using 2.6.33 with a patch from
> https://bugzilla.kernel.org/show_bug.cgi?id=15374
>
> It really looks like there's a scheduling issue. It seems as if the
> system is IO thrashing on the flash drive, and bounces all over the
> place in terms of performance. Sometimes it's really low, like the
> 2.73M/s, and other times it's really fast, like the 28.86M/s.
> Although you can't see it there, there were times when rsync was
> registering 200kb/s. None of them are "really" accurate, as
> everything is queued for writing, but the final results of 1.5M/s
> (calculated from the "real" time) is terrible.
>
> I have not seen this bad of performance on a normal USB drive, but
> only on my USB flash drive, which is FAT32. In addition, Windows and
> Mac systems transfer easily 9M/s write speeds on my rally 2.
>
> If I do the following...
> echo 16000000 > /proc/sys/vm/dirty_bytes
> the performance is 9-12M/s all the way through the transfer....

Very interesting. How much memory do you have? (The core tuning parameter is dirty_ratio, which defaults to 20):

dirty_bytes: Contains the amount of dirty memory at which a process generating disk writes will itself start writeback. If dirty_bytes is written, dirty_ratio becomes a function of its value (dirty_bytes / the amount of dirtyable system memory).

dirty_ratio: Contains, as a percentage of total system memory, the number of pages at which a process which is generating disk writes will itself start writing out dirty data.

Or put another way, what is dirty_ratio after you set dirty_bytes to 16000000?

Something else that would be very interesting is blktrace runs with and without dirty_bytes set to 16000000.

Also, what was the average size of the files which you were writing out with that rsync command? Were they all avi files that were tens of megabytes? Hundreds of megabytes?

It does sound like the writeback code is doing something disastrously wrong on USB drives, perhaps interacting with the fat fs code.

-- Ted


From: Paul Hartman on
On Mon, May 3, 2010 at 10:52 PM, Trenton D. Adams
<trenton.d.adams(a)gmail.com> wrote:
> It really looks like there's a scheduling issue. It seems as if the
> system is IO thrashing on the flash drive, and bounces all over the
> place in terms of performance. Sometimes it's really low, like the
> 2.73M/s, and other times it's really fast, like the 28.86M/s.
> Although you can't see it there, there were times when rsync was
> registering 200kb/s. None of them are "really" accurate, as
> everything is queued for writing, but the final results of 1.5M/s
> (calculated from the "real" time) is terrible.

I have a similar experience (posted to this list a few months ago)
with mounting a flash device (mobile phone) in USB mass storage mode.
When the I/O scheduler for that device is CFQ, write performance is
really terrible. When I change the scheduler to deadline, performance
is several times better. In 2.6.32 pdflush was replaced and CFQ
performance saw a 4x increase, but it is still far too slow.

CFQ in <=2.6.31: 450KB/sec
CFQ in >=2.6.32: 2MB/sec
Deadline in all: 17MB/sec
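
For anyone repeating this comparison: the scheduler for each block device is exposed in sysfs at /sys/block/<dev>/queue/scheduler, with the active one shown in square brackets, and writing a name to that file as root switches it. A small sketch for checking which scheduler is in use; the function names and the sample device name "sdc" are only illustrative:

```python
def scheduler_path(device):
    """sysfs file that lists the I/O schedulers for a block device."""
    return f"/sys/block/{device}/queue/scheduler"

def active_scheduler(sysfs_text):
    """Return the scheduler name shown in square brackets, or None."""
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None

# Example content of the sysfs file on a kernel using CFQ.
print(scheduler_path("sdc"))
print(active_scheduler("noop deadline [cfq]"))
```

Checking this before each test run makes it easier to be sure the timings really compare CFQ with deadline, since the setting is per device and does not persist when the device is re-plugged.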

I didn't try anything with dirty_bytes.

FWIW :)
From: Trenton D. Adams on
On Tue, May 4, 2010 at 5:20 AM, Theodore Tso <tytso(a)mit.edu> wrote:
>
> On May 3, 2010, at 11:52 PM, Trenton D. Adams wrote:
>> If I mount my usb key with "sync" option, I get 500kb or less transfer
>> speeds. If I use the gnome defaults, I get 60M+ for awhile, and then
>> it continually drops over time, down to the 500kb/s again. Gnome
>> defaults are...
>>
>> /dev/sdc1 on /media/FLASH type vfat
>> (rw,nosuid,nodev,noatime,uhelper=hal,shortname=lower,flush,uid=500)
>>
>> I have done similar tests on both Rally2 64G usb stick and sandisk
>> ultra (15M/s) SDHC 8G cards. I get lousy performance on both, unless
>> I set dirty bytes. These are both FAT 32. But, as you can see below,
>> 14 minutes to transfer less than a couple gigs is a little nutty. The
>> 3 minutes is a lot nicer. I am using 2.6.33 with a patch from
>> https://bugzilla.kernel.org/show_bug.cgi?id=15374
>>
>> It really looks like there's a scheduling issue. It seems as if the
>> system is IO thrashing on the flash drive, and bounces all over the
>> place in terms of performance. Sometimes it's really low, like the
>> 2.73M/s, and other times it's really fast, like the 28.86M/s.
>> Although you can't see it there, there were times when rsync was
>> registering 200kb/s. None of them are "really" accurate, as
>> everything is queued for writing, but the final results of 1.5M/s
>> (calculated from the "real" time) is terrible.
>>
>> I have not seen this bad of performance on a normal USB drive, but
>> only on my USB flash drive, which is FAT32. In addition, Windows and
>> Mac systems transfer easily 9M/s write speeds on my rally 2.
>>
>> If I do the following...
>>     echo 16000000 > /proc/sys/vm/dirty_bytes
>> the performance is 9-12M/s all the way through the transfer....
>

Ahh, this one didn't get to LKML either, oops. I don't use gmail often. :P

Yeah, I should have mentioned that. I have 4G.

>
> dirty_bytes: Contains the amount of dirty memory at which a process generating disk writes will itself start writeback. If dirty_bytes is written, dirty_ratio becomes a function of its value (dirty_bytes / the amount of dirtyable system memory).
>
> dirty_ratio: Contains, as a percentage of total system memory, the number of pages at which a process which is generating disk writes will itself start writing out dirty data.
>
> Or put another way, what is dirty_ratio after you set dirty_bytes to 16000000?

dirty_ratio goes to 0 whenever I set dirty_bytes to anything, and
dirty_bytes goes to 0 when I set dirty_ratio. I even tried setting
dirty_bytes to ~1G.
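
The behaviour described here, where writing one tunable zeroes the other, means that only the non-zero one is in effect at any time. A minimal sketch for reporting which limit is active; the function names are made up, and the read helper simply assumes the usual /proc/sys/vm location:

```python
from pathlib import Path

def read_vm_knob(name, root="/proc/sys/vm"):
    """Read one integer tunable, e.g. 'dirty_bytes' or 'dirty_ratio'."""
    return int(Path(root, name).read_text().strip())

def active_dirty_limit(dirty_bytes, dirty_ratio):
    """Return (name, value) for whichever limit is currently in effect.

    A non-zero dirty_bytes takes over; otherwise dirty_ratio applies.
    """
    if dirty_bytes > 0:
        return ("dirty_bytes", dirty_bytes)
    return ("dirty_ratio", dirty_ratio)

# Values as reported after "echo 16000000 > /proc/sys/vm/dirty_bytes".
print(active_dirty_limit(16000000, 0))
```

On a live system the two arguments would come from read_vm_knob("dirty_bytes") and read_vm_knob("dirty_ratio"). A dirty_ratio of 0 in this state does not mean a zero threshold, only that the byte-based limit is the one being used.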

>
> Something else that would be very interesting is blktrace runs with and without dirty_bytes set to 16000000.

tdanotebook linux # zcat /proc/config.gz | grep DEBUG_FS
CONFIG_DEBUG_FS=y

tdanotebook linux # mount | grep debug
none on /sys/kernel/debug type debugfs (rw)

tdanotebook linux # blktrace -d /dev/mmcblk0p1
BLKTRACESETUP: Inappropriate ioctl for device
Failed to start trace on /dev/mmcblk0p1

>
> Also, what was the average size of the files which you were writing out with that rsync command? Were they all avi files that were tens of megabytes? Hundreds of megabytes?

Usually hundreds of megabytes. I can do a few hundred without a
problem. Looks like 20% is 800M.
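
The 800M estimate can be checked with a little arithmetic. This is a rough sketch that assumes the full 4 GiB counts as dirtyable memory, which overstates the real figure since the kernel excludes some memory from that total; the function names are illustrative only:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def ratio_threshold_bytes(mem_bytes, ratio_percent):
    """Dirty threshold in bytes for a given dirty_ratio percentage."""
    return mem_bytes * ratio_percent // 100

def bytes_as_percent(limit_bytes, mem_bytes):
    """Express a dirty_bytes value as a percentage of memory."""
    return limit_bytes / mem_bytes * 100

threshold = ratio_threshold_bytes(4 * GIB, 20)
print(f"dirty_ratio=20 on 4 GiB: about {threshold / MIB:.0f} MiB")
print(f"dirty_bytes=16000000: {bytes_as_percent(16000000, 4 * GIB):.2f}% of 4 GiB")
```

So with the default ratio, the two AVI files (about 1.3 GB in total) can be mostly buffered in memory before the writer is throttled, whereas a 16000000-byte limit is well under 1% of memory and forces writeback almost immediately.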
From: Paul Hartman on
On Tue, May 4, 2010 at 1:00 PM, Trenton D. Adams
<trenton.d.adams(a)gmail.com> wrote:
> On Tue, May 4, 2010 at 11:34 AM, Paul Hartman
> <paul.hartman+linux(a)gmail.com> wrote:
>> On Mon, May 3, 2010 at 10:52 PM, Trenton D. Adams
>> <trenton.d.adams(a)gmail.com> wrote:
>>> It really looks like there's a scheduling issue. It seems as if the
>>> system is IO thrashing on the flash drive, and bounces all over the
>>> place in terms of performance. Sometimes it's really low, like the
>>> 2.73M/s, and other times it's really fast, like the 28.86M/s.
>>> Although you can't see it there, there were times when rsync was
>>> registering 200kb/s. None of them are "really" accurate, as
>>> everything is queued for writing, but the final results of 1.5M/s
>>> (calculated from the "real" time) is terrible.
>>
>> I have a similar experience (posted to this list a few months ago)
>> with mounting a flash device (mobile phone) in USB mass storage mode.
>> When I/O scheduler for that device is CFQ, write performance is really
>> terrible. When I change the scheduler to deadline, performance is
>> several times better. In 2.6.32 pdflush was replaced and CFQ
>> performance saw a 4x increase but still far too slow.
>>
>> CFQ in <=2.6.31: 450KB/sec
>> CFQ in >=2.6.32: 2MB/sec
>> Deadline in all: 17MB/sec
>>
>> I didn't try anything with dirty_bytes.
>>
>> FWIW :)
>
> Oops, my message didn't reach the LKML, sorry for the spam Paul.
>
> I switched to deadline and dirty_ratio 20 for my flash device, and I
> am seeing VERY slow performance as well. I get a lot of freezing up
> of rsync, where the progress just stops (visually anyhow), which is
> the same as what I see with cfq. However, it's not 14 minutes as it
> was in my original email...
>
> [11:44 trenta(a)tdanotebook web] $ time rsync -v --progress
> /home/share/DVD/*.avi /media/disk/
> facing-the-giants.avi
> 709911016 100% 5.49MB/s 0:02:03 (xfer#1, to-check=1/2)
> jonah.avi
> 621254748 100% 15.97MB/s 0:00:37 (xfer#2, to-check=0/2)
>
> sent 1331328404 bytes received 50 bytes 4430377.55 bytes/sec
> total size is 1331165764 speedup is 1.00
>
> real 4m59.657s
> user 0m8.553s
> sys 0m9.501s
>
>
> with dirty_bytes 16000000, I still get twice the speed out of deadline.
>
> [11:53 trenta(a)tdanotebook web] $ time rsync -v --progress
> /home/share/DVD/*.avi /media/disk/
> facing-the-giants.avi
> 709911016 100% 7.62MB/s 0:01:28 (xfer#1, to-check=1/2)
> jonah.avi
> 621254748 100% 7.64MB/s 0:01:17 (xfer#2, to-check=0/2)
>
> sent 1331328404 bytes received 50 bytes 7948229.58 bytes/sec
> total size is 1331165764 speedup is 1.00
>
> real 2m47.244s
> user 0m8.429s
> sys 0m9.377s
>
>
> So, perhaps it's a combination of the schedulers and something else in
> the kernel? And perhaps, CFQ just amplifies something else in the
> kernel, more than deadline does?

In my case I also noticed that if I'm using CFQ and leave everything
as normal, the problem only shows up when I copy more than 1 file
before syncing. For example, with 2 test files 700M each in size:

# one file at a time with sync in-between, fast speeds:
$ sync; time sh -c "cp file1 /mnt/usb; sync; cp file2 /mnt/usb; sync"

real 1m25.697s
user 0m0.005s
sys 0m2.509s

# copy two files in a row, then sync, speed is bad:
$ sync; time sh -c "cp file1 file2 /mnt/usb; sync"

real 6m51.439s
user 0m0.007s
sys 0m2.615s

(and, like you, if I mount with "sync" option the speed is basically terrible)

I've tested on 2 machines and had the same results on both, almost
identical timings in fact. Both 64-bit (Core 2 E6600, Core i7 920).
Others who have the same device have tested and some experience the
problem, some do not. I'm not sure of their system specs.

In my case the first machine had 8GB of RAM and the second had 12GB of
RAM, and in both cases actual RAM use by the system was around 1G,
leaving the rest for disk caching etc., in case this is related to
having a large amount of RAM.