From: Nikanth Karthikesan on
On Thursday 11 March 2010 19:58:11 Theodore Tso wrote:
> On Mar 11, 2010, at 8:57 AM, Nikanth Karthikesan wrote:
> > I guess, what he meant was, to keep filesystem blocks aligned, even if
> > the partition is not. Say if the partition is mis-aligned by 512-bytes,
> > let the filesystem waste 4k-512bytes and keep its blocks aligned. But it
> > might be a case of over-engineering, possibly requiring disk format
> > change.
>
> Ah, yes, I agree with you; that's probably what he meant.
>
> Sure, that's theoretically possible, but it would mean changing every
> single filesystem, and it would require a file system format change --- or
> at least a file system format extension.
>
> It would seem to be way easier to simply fix the partitioning tools to do
> the right thing, though.
>

Yes. Maybe just a simple, transparent, device-mapper-like mapping on top of
the misaligned partition could do the alignment. Then the filesystem code
would not need to change much.
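
To make the arithmetic concrete, here is a minimal Python sketch of how much
space such a mapping (or the filesystem itself) would have to skip. The
function name padding_to_align and the 4096-byte physical block size are
assumptions for illustration only, not taken from any existing tool.

```python
def padding_to_align(partition_start_bytes, physical_block=4096):
    """Return the bytes to skip at the start of a partition so that the
    first usable byte falls on a physical block boundary.

    Hypothetical helper: physical_block=4096 is an assumed default.
    """
    misalignment = partition_start_bytes % physical_block
    # The outer modulo keeps the result at 0 for an already aligned start.
    return (physical_block - misalignment) % physical_block


# Partition misaligned by 512 bytes: waste 4k - 512 bytes.
print(padding_to_align(512))        # 3584
# Classic DOS start at sector 63 with 512-byte sectors.
print(padding_to_align(63 * 512))   # 512
```

A start that is already aligned needs no padding, so the mapping would be a
no-op in that case.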

But Linux already has device-mapper, and Linux is not affected by misaligned
partitions when we use LVM.

But the actual problem here is that partitioning tools might create partitions
that won't allow other operating systems to boot. So it might be enough if the
partitioning tools just create partitions that meet the (mis-)alignment
requirement of Windows.

Thanks
Nikanth
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nikanth Karthikesan on
On Thursday 11 March 2010 20:09:34 James Bottomley wrote:
> On Thu, 2010-03-11 at 09:28 -0500, Theodore Tso wrote:
> > On Mar 11, 2010, at 8:57 AM, Nikanth Karthikesan wrote:
> > > I guess, what he meant was, to keep filesystem blocks aligned, even if
> > > the partition is not. Say if the partition is mis-aligned by 512-bytes,
> > > let the filesystem waste 4k-512bytes and keep its blocks aligned. But
> > > it might be a case of over-engineering, possibly requiring disk format
> > > change.
> >
> > Ah, yes, I agree with you; that's probably what he meant.
> >
> > Sure, that's theoretically possible, but it would mean changing every
> > single filesystem, and it would require a file system format change
> > --- or at least a file system format extension.
> >
> > It would seem to be way easier to simply fix the partitioning tools to
> > do the right thing, though.
>
> Actually, it's a layering violation. The filesystem shouldn't need to
> probe the device layout ... particularly when there are complexities
> like is it logical 512 or physical, and if logical 512 on 4k does it
> have an offset exponent or not.
>
> We can transmit certain abstractions of information up the stack (like
> stripe width for RAID arrays which should be the fs optimal write size),
> but for this type of alignment, which can be completely solved at the
> partition layer, the information should really stay there and the
> filesystem should "just work".
>

Right. It would be a layering violation, and we already have LVM to solve it.

The real problem here is just that partitioning tools should create
partitions that work with both XP and Windows 7. Maybe distro installers
should ask the user which compatibility they need.

Thanks
Nikanth
From: Tejun Heo on
Hello,

On 03/12/2010 12:00 AM, Nikanth Karthikesan wrote:
> But the actual problem here is that partitioning tools might create
> partitions that won't allow other operating systems to boot. So it
> might be enough if the partitioning tools just create partitions
> that meet the (mis-)alignment requirement of Windows.

Turns out XP is generally OK. The reported problem was only on
specific configurations (some BIOS stuff). Windows 2000 reportedly
would be hurt but I really think we don't have to care about that too
much. So, it seems like we wouldn't have to worry too much about it
and just go ahead with new alignment schemes. I'll update the doc
this weekend with new information from this now rather large thread.

Thanks.

--
tejun
From: tytso on
On Thu, Mar 11, 2010 at 08:35:26PM +0530, Nikanth Karthikesan wrote:
> The real problem here is just that partitioning tools should create
> partitions that work with both XP and Windows 7. Maybe distro
> installers should ask the user which compatibility they need.

4k-aligned sectors will *work* with Windows XP, will they not? It's
simply that Windows XP, being really ancient, doesn't create properly
aligned partitions by default.

And how often are we going to see Windows XP systems with these new 4k
physical sector drives anyway, where the first OS to touch the
partition is Windows XP? And in the case where this does happen, the
resulting partition will result in terrible performance for Windows
XP as well as Linux.

What's the specific scenario which you are trying to solve, and how
likely is it to occur in real life?

- Ted
From: Mike Snitzer on
On Thu, Mar 11, 2010 at 10:00 AM, Nikanth Karthikesan <knikanth(a)suse.de> wrote:
> On Thursday 11 March 2010 19:58:11 Theodore Tso wrote:
>> On Mar 11, 2010, at 8:57 AM, Nikanth Karthikesan wrote:
>> > I guess, what he meant was, to keep filesystem blocks aligned, even if
>> > the partition is not. Say if the partition is mis-aligned by 512-bytes,
>> > let the filesystem waste 4k-512bytes and keep its blocks aligned. But it
>> > might be a case of over-engineering, possibly requiring disk format
>> > change.
>>
>> Ah, yes, I agree with you; that's probably what he meant.
>>
>> Sure, that's theoretically possible, but it would mean changing every
>> single filesystem, and it would require a file system format change --- or
>> at least a file system format extension.
>>
>> It would seem to be way easier to simply fix the partitioning tools to do
>> the right thing, though.
>>
>
> Yes. Maybe just a simple, transparent, device-mapper-like mapping on top
> of the misaligned partition could do the alignment. Then the filesystem
> code would not need to change much.
>
> But Linux already has device-mapper, and Linux is not affected by
> misaligned partitions when we use LVM.

Well, device-mapper and LVM needed to be updated to make them "just
work" but yes that work has been done.

> But the actual problem here is that partitioning tools might create partitions
> that won't allow other operating systems to boot. So it might be enough if the
> partitioning tools just create partitions that meet the (mis-)alignment
> requirement of Windows.

I'm not following...

Anyway, 4K drives that are 512b logical and 4K physical may or may not
also have "DOS partition compensation" that uses LBA -1 as the first
naturally (4K) aligned start. This means that the partition tools
need to shift the start of the first primary partition to be offset by
3584 bytes (7 512b sectors) for use with Linux. But for Windows,
AFAIK Windows XP and Windows 7 create all partitions aligned on 1MB
boundaries. Linux's parted and fdisk create 1MB aligned partitions
now too.
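
As a rough illustration of the arithmetic above, the following Python sketch
checks whether a partition start lands on a physical block boundary. The
helper names (is_aligned, first_aligned_lba) are hypothetical, and
alignment_offset is assumed here to mean the byte address of the first
naturally aligned logical block: 3584 for a drive with the compensation
described above, 0 otherwise.

```python
SECTOR = 512  # logical sector size in bytes (assumed)


def is_aligned(start_lba, physical_block=4096, alignment_offset=0,
               logical=SECTOR):
    """True if a partition starting at start_lba begins on a physical
    block boundary, given the drive's alignment offset in bytes."""
    return (start_lba * logical - alignment_offset) % physical_block == 0


def first_aligned_lba(min_lba=1, physical_block=4096, alignment_offset=0,
                      logical=SECTOR):
    """Smallest LBA >= min_lba that is physically aligned.

    Assumes physical_block and alignment_offset are multiples of the
    logical sector size; otherwise no LBA would ever match.
    """
    lba = min_lba
    while not is_aligned(lba, physical_block, alignment_offset, logical):
        lba += 1
    return lba


# Drive with DOS partition compensation (LBA -1 is aligned):
print(first_aligned_lba(alignment_offset=3584))   # 7
print(is_aligned(63, alignment_offset=3584))      # True
# Drive without compensation:
print(is_aligned(63))                             # False
print(is_aligned(2048))                           # True (1MB boundary)
```

Under these assumptions a plain 1MB boundary (LBA 2048) is not aligned on a
compensated drive, which is why the tools have to take the offset into
account rather than use one fixed start.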

So the only outliers are older versions of Windows (< XP) and Linux (old
fdisk and parted, etc. also use DOS partitioning) that don't use
naturally aligned (e.g. 1MB) partition boundaries. In those versions
of Windows and Linux there are ways to change the default start of
sector 63. That said, there is an opportunity to improve
documentation for how to work around DOS partitioning on these
operating systems.

One other piece worth mentioning on this "IO Topology" support in the
entire Linux I/O stack is the virt layers. hch has already extended
the virt-io protocol and qemu is in the finishing stages of being
updated to properly consume the "IO Topology" information. So we
really don't have any gaps in the Linux I/O stack.
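
For anyone who wants to see what a given device reports, here is a minimal
Python sketch that reads the topology values the block layer exports through
sysfs. The function name read_io_topology and the sysfs_root parameter are
made up for this example; the attribute paths are the ones I would expect on
a reasonably recent kernel, and any that are missing simply come back as
None.

```python
import os

# sysfs attributes, relative to /sys/block/<device>/
ATTRS = {
    "logical_block_size": "queue/logical_block_size",
    "physical_block_size": "queue/physical_block_size",
    "minimum_io_size": "queue/minimum_io_size",
    "optimal_io_size": "queue/optimal_io_size",
    "alignment_offset": "alignment_offset",
}


def read_io_topology(device, sysfs_root="/sys/block"):
    """Return a dict of I/O topology values (in bytes) for a block device.

    Hypothetical helper: attributes that are absent or unreadable are
    reported as None instead of raising.
    """
    topology = {}
    for name, rel_path in ATTRS.items():
        path = os.path.join(sysfs_root, device, rel_path)
        try:
            with open(path) as f:
                topology[name] = int(f.read().strip())
        except (OSError, ValueError):
            topology[name] = None
    return topology


# Example (device name is an assumption):
# print(read_io_topology("sda"))
```

A drive that is 512b logical and 4K physical would be expected to show
logical_block_size 512 and physical_block_size 4096 here.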

mkp in particular, Jens, James, myself, and others implemented and
refined the SCSI and block changes. kzak, jim meyering, hans de
goede, hch, eric sandeen, bob peterson, myself and others updated all
other I/O stack layers ranging from DM to LVM, libblkid, fdisk, parted
to anaconda to mkfs.ext[234], mkfs.xfs, mkfs.gfs2 to virt-io and qemu.
FYI, all of these advances will be in Fedora 13 (quite a few are
already in Fedora 12).

There are obviously other Linux systems and userland tools (likely
Xen, other mkfs.* and more) that should be updated. Hopefully
maintainers and/or contributors of these projects will follow-up to
address those that need updating.

Again please see:
http://oss.oracle.com/~mkp/docs/linux-advanced-storage.pdf
http://people.redhat.com/msnitzer/docs/io-limits.txt
Some omissions include: Linux MD, which has been updated as mkp
pointed out, and I neglected to talk about virt-io and qemu (but like
I said they have been updated too).

Hopefully we're all closer to being on the same page now.

Mike