From: Martin K. Petersen on
>>>>> "Tejun" == Tejun Heo <tj(a)kernel.org> writes:

Tejun> The [Windows Vista/7] partitioner seems to be using 1M as the
Tejun> basic alignment unit and offsetting from there if explicitly
Tejun> requested by the drive

Yep.
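
For illustration only, the "1M unit plus a drive-requested offset" rule
can be sketched as below. This is not taken from any partitioner's
source. The function name and parameters are hypothetical, and the
alignment offset is assumed to be given in bytes, as Linux reports it.

```python
MIB = 1024 * 1024

def aligned_start_lba(min_lba, logical=512, grain=MIB, alignment_offset=0):
    """Return the first LBA >= min_lba that sits on the alignment grid.

    The grid is every `grain` bytes, shifted by `alignment_offset` bytes
    (assumed to be what the drive reports, 0 for ordinary drives).
    """
    if grain % logical or alignment_offset % logical:
        raise ValueError("grain and offset must be multiples of the logical size")
    min_byte = min_lba * logical
    # Ceiling division: number of whole grains needed to reach min_byte.
    k = max(-(-(min_byte - alignment_offset) // grain), 0)
    return (k * grain + alignment_offset) // logical

# Ordinary drive: first partition lands on LBA 2048 (1 MiB).
print(aligned_start_lba(63))
# Offset-by-one drive (assumed offset of 3584 bytes): 1 MiB plus 7 sectors.
print(aligned_start_lba(2048, alignment_offset=3584))
```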


Tejun> Please note that hdparm is misreporting the alignment offset. It
Tejun> should be reporting 512 instead of 256 for offset-by-one drives.

Already fixed. Your hdparm must be old.
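
As a cross-check that does not depend on the hdparm version, the same
topology values can be read from sysfs. A minimal sketch, assuming a
kernel that exports the alignment_offset and queue/*_block_size
attributes; the helper names are made up for this example, and the
sysfs root is a parameter only so the code can be exercised against a
fake directory tree.

```python
from pathlib import Path

def read_topology(dev, sysfs_root="/sys/block"):
    """Read block sizes and alignment offset (all in bytes) for a device."""
    base = Path(sysfs_root) / dev

    def read_int(rel):
        return int((base / rel).read_text().strip())

    return {
        "logical_block_size": read_int("queue/logical_block_size"),
        "physical_block_size": read_int("queue/physical_block_size"),
        "alignment_offset": read_int("alignment_offset"),
    }

def is_aligned(start_lba, topo):
    """True if start_lba falls on a physical block boundary."""
    start_byte = start_lba * topo["logical_block_size"]
    return (start_byte - topo["alignment_offset"]) % topo["physical_block_size"] == 0
```

On a real system this would be called as read_topology("sda"), where
the device name is just an example.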



Tejun> Partitioners maybe should only align partitions which will be
Tejun> used by Linux and default to the traditional layout for others
Tejun> while allowing explicit override.

I don't think we take the partition type into account. Karel?


Tejun> Reportedly, commonly used partitioners aren't ready to handle
Tejun> drives larger than 2 TiB in any configuration and alignment isn't
Tejun> done properly for drives with 4 KiB physical sectors. 4 KiB
Tejun> logical sector support is broken in both the kernel

Huh, what? My homedir is on a 4KiB LBS/PBS drive and has been for ~2
years.


Tejun> (need more details and probably a whole section on partitioner
Tejun> behaviors)

I'm Cc:'ing Karel Zak and Jim Meyering who have been doing all the
alignment work for fdisk and parted respectively. Karel, Jim: The full
writeup is here:

http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues

It'd be great if you guys could share what you have been doing to the
tooling.


Tejun> Unfortunately, the transition to 4 KiB sector size, physical only
Tejun> or logical too, is looking fairly ugly. Hopefully, a reasonable
Tejun> solution can be reached in not too distant future but even with
Tejun> all the software side updated, it looks like it's gonna cause
Tejun> significant amount of confusion and frustration.

With regards to XP compatibility I don't think we should go too much out
of our way to accommodate it. XP has been disowned by its master and I
think virtualization will take care of the rest.

FWIW, recent fdisk has a command line flag that will enable/disable DOS
compatible layout.

--
Martin K. Petersen Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Martin K. Petersen on
>>>>> "James" == James Bottomley <James.Bottomley(a)suse.de> writes:

James> However, for 4k sectors, the main issues which have shown up in
James> testing by others (mostly Martin) are

James> 1. In native 4k mode, we work perfectly fine. *however*,
James> most BIOSs can't boot native 4k drives.

Correct. I have engaged with pretty much all the big OEMs in the
industry and so far the interest has been near zero.


James> 4. The alignment problem is made more complex by drives that
James> make use of the offset exponent feature (what you refer
James> to as offset by one) ... fortunately very few of these
James> have been seen in the wild and we're hopeful they can be
James> shot before they breed.

This topic is constantly up for debate in IDEMA. However, it looks like
we might win because of the impending demise of XP.


James> so the bottom line seems to be that if you want the device as a
James> non boot disk, use native 4k sectors and a non-msdos partition
James> label. If you want to boot from the drive and your bios won't
James> boot 4k natively, partition everything using the 512 emulation
James> and try to align the partitions correctly. If your bios/uefi
James> will boot 4k natively, just use it and whatever partition label
James> the bios/uefi supports.

James> Martin can fill in the pieces I've left out.
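
James's bottom line reads as a small decision table. A sketch of it
follows. The function name and the returned strings are hypothetical,
and GPT is named only as one example of a non-msdos label.

```python
def recommend_layout(boot_disk, firmware_boots_4k):
    """Map James's three cases to (sector mode, partitioning advice)."""
    if not boot_disk:
        # Data-only disk: no firmware constraints apply.
        return ("native 4k", "non-msdos partition label (e.g. GPT)")
    if firmware_boots_4k:
        return ("native 4k", "whatever label the BIOS/UEFI supports")
    # Firmware cannot boot 4k: fall back to emulation and align by hand.
    return ("512 emulation", "align partitions to the physical sector size")

for boot, fw in ((False, False), (True, True), (True, False)):
    print(boot, fw, recommend_layout(boot, fw))
```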

Here's my latest take given what I hear on the grapevine:

1. 512-byte logical block size drives will be around forever for legacy
deployments because nobody is willing to do the required BIOS int13
work. It's not just a BIOS thing, this requires heavy changes to HBA
boot ROMs as well.

2. Some vendors are working on EFI firmware and will support booting off
of 4KB LBS drives there. This is mostly aimed at the server space.

3. 4 KB logical block size drives will mainly be targeted for use inside
arrays. Off the shelf enterprise drive models will most likely
continue to ship with a 512-byte LBS.

4. Part of the hesitation to work on booting off of 4 KB LBS drives is
motivated by a general trend in the industry to move boot
functionality to SSD. There are 4 KB LBS SSDs out there but in
general the industry is sticking to ATA for local boot.

--
Martin K. Petersen Oracle Linux Engineering
From: Martin K. Petersen on
>>>>> "hpa" == H Peter Anvin <hpa(a)zytor.com> writes:

hpa> I would very much like a reference for a platform which has
hpa> firmware which can successfully boot from 4K-logical media. It
hpa> would be very useful for bootloader testing.

I have yet to find one.


hpa> Aligning partitions is something we should have done long ago. It
hpa> affects RAID and many flash drives just as much or more than
hpa> 4K-sectored disks.

Yup.


hpa> As far as partitioning... I believe we should be using GPT
hpa> partition tables where possible. Even on non-EFI systems, it's
hpa> simply a much better partition table format.

Agreed.

--
Martin K. Petersen Oracle Linux Engineering
From: Martin K. Petersen on
>>>>> "Martin" == Martin K Petersen <martin.petersen(a)oracle.com> writes:

Martin> There are 4 KB LBS SSDs out there but in general the industry is
Martin> sticking to ATA for local boot.

Not implying that ATA doesn't support 4 KB LBS, just that people stick
to the tried-and-true 512.

--
Martin K. Petersen Oracle Linux Engineering
From: H. Peter Anvin on
On 03/08/2010 07:18 AM, Martin K. Petersen wrote:
>
> Tejun> Partitioners maybe should only align partitions which will be
> Tejun> used by Linux and default to the traditional layout for others
> Tejun> while allowing explicit override.
>
> I don't think we take the partition type into account. Karel?
>

We should not take the partition type into account. The other aspect is
that FAT partitions need to be formatted differently to maintain the
alignment once set; I have recently contributed patches (which were
accepted) into mkdosfs to do the right thing there.
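
To illustrate why the formatter matters: the FAT data area begins after
the reserved sectors, the FAT copies and (on FAT12/16) the root
directory, so an aligned partition start does not by itself give
aligned clusters. One way to fix that is to pad the reserved area. The
sketch below shows that idea only. It is not the mkdosfs code, and the
function name and example numbers are made up.

```python
def padded_reserved_sectors(reserved, nfats, sectors_per_fat,
                            root_dir_sectors, align_sectors):
    """Grow the reserved area so the data area starts on an aligned sector.

    All arguments are in logical sectors, counted from the partition start.
    """
    data_start = reserved + nfats * sectors_per_fat + root_dir_sectors
    padding = -data_start % align_sectors  # sectors to the next boundary
    return reserved + padding

# Example: 4 KiB alignment on 512-byte sectors means 8-sector alignment.
new_reserved = padded_reserved_sectors(1, 2, 250, 32, 8)
print(new_reserved, (new_reserved + 2 * 250 + 32) % 8)
```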

Looking at the Windows XP article, it looks like it is limited to
certain BIOSes; unfortunately it doesn't say what the particular BIOS
issue is. If we can find a system which actually exhibits the bug it
might be possible to reverse-engineer a solution.

> Tejun> Reportedly, commonly used partitioners aren't ready to handle
> Tejun> drives larger than 2 TiB in any configuration and alignment isn't
> Tejun> done properly for drives with 4 KiB physical sectors. 4 KiB
> Tejun> logical sector support is broken in both the kernel
>
> Huh, what? My homedir is on a 4KiB LBS/PBS drive and has been for ~2
> years.

For > 2 TiB drives with 4 KiB logical sectors and MS-DOS partition
tables, it is.
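
For reference, the arithmetic behind the 2 TiB figure: an MS-DOS
partition entry stores the start and length as 32-bit sector counts, so
the on-disk format tops out at 2^32 logical sectors. The sketch below
shows only that format limit. It says nothing about what the kernel or
partitioners actually handle, and the function name is made up.

```python
TIB = 2 ** 40

def msdos_label_limit_bytes(logical_sector_size):
    """Largest byte offset expressible with a 32-bit sector number."""
    return (2 ** 32) * logical_sector_size

for size in (512, 4096):
    print(size, msdos_label_limit_bytes(size) // TIB, "TiB")
```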

> Tejun> Unfortunately, the transition to 4 KiB sector size, physical only
> Tejun> or logical too, is looking fairly ugly. Hopefully, a reasonable
> Tejun> solution can be reached in not too distant future but even with
> Tejun> all the software side updated, it looks like it's gonna cause
> Tejun> significant amount of confusion and frustration.
>
> With regards to XP compatibility I don't think we should go too much out
> of our way to accommodate it. XP has been disowned by its master and I
> think virtualization will take care of the rest.

I think that is wildly optimistic, but I do observe there is a fix
from Microsoft in the article you reference.

> FWIW, recent fdisk has a command line flag that will enable/disable DOS
> compatible layout.

Yes, unfortunately it is still on by default.

-hpa