From: Christoph Anton Mitterer on
Hi.


I'm currently looking into Debian's boot process, with the goal of
allowing the root-fs to live on top of multiple stacked block devices
(e.g. something like disk->md/raid->lvm2->dm-crypt->fs or
disk->md/raid->dm-crypt->lvm2->fs).

For booting, this works quite well in principle via initramfs images
(although it is not yet very configurable and has a fairly fixed
order of the block devices).

However, I stumbled across a problem with shutdown/reboot.
What Debian does is the following (via sysvinit runlevel 0 or 6):
1. cryptdisks stop
2. lvm2 stop
3. umountroot
4. halt/reboot

That 1 and 2 come before 3 is (I guess) because the scripts simply don't
expect the root-fs to be on the stacked block devices, and want to cleanly
close everything else before unmounting the root-fs.

Step 3 is actually a remount,ro of / ... followed by the halt/reboot...
and I guess there is no other way to do this (e.g. doing a real umount).


I guess it's quite obvious, that if the root-fs is e.g. on top of lvm
and/or dm-crypt,... closing of some LV/VG/dm-crypt-devices will fail
(which is what I see and why I wrote here).


Now my question:
Is it strictly guaranteed that, when the mount -o remount,ro / in
umountroot returns, everything the filesystem flushed out has already
gone down through all the different block layers to the disk?

I could imagine that future block layers do some caching, or that
encryption in dm-crypt takes some CPU time, so if the mount returned
before everything had gone through all layers, the halt/reboot would
come "immediately" afterwards and the data would be lost without
being cleanly flushed.

Now I guess with a filesystem that uses barriers it's safe, right? But
what about filesystems that don't?


So I think in the end my question is:
Is it safe by design if I do _NOT_ cleanly disable any of the (possibly
stacked) block layers like lvm/md/dm-crypt/etc. when halting/rebooting
the system, and only do a remount,ro at the end?



Thanks,
Chris.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Anton Mitterer on
On Sat, 2010-06-26 at 18:17 +0200, Milan Broz wrote:
> For the device-mapper device (and this applies to other type devices too),
> you cannot remove device (unload mapping table) when device is still open.
You just mean I cannot remove/close it as long as something above (e.g.
a filesystem) is still open/mounted? Yeah, that was clear (and that's
good, isn't it?!)


> This applies even for active stacked mapping of devices (LVM over LUKS)
> - you cannot remove LUKS device while LVs are active on top of it.
> (even unmounted)
clear clear ... :)


> remount RO will not help here - it still keeps the device open.
of course :)


> With recent kernel and flush (issuing barrier internally) device-mapper
> properly propagates barrier request.
a) What is recent? ;)
b) The barrier thingy,... does it have to be supported by the thing
(e.g. filesystem, LV, etc.) on top? Or is this something generically
implemented for flushing?


> But note that you are running shutdown scripts from device itself
> if it is root-fs script itself produces reads to the device.
> ...
Uhm what exactly do you mean?


> btw block device flush is implemented using barrier too.
So I understand,.. this means it is something "separate"... and
regardless of whether the filesystem on top supports barriers itself,...
I'll have everything flushed out to disk when doing the remount,ro...
even if the block layer devices below are not yet closed.
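
To make the "block device flush" point concrete, here is a minimal
userspace sketch in Python. It assumes (this is not confirmed anywhere in
this thread) that fsync() on a file descriptor makes the kernel write out
the dirty data for that file or block device node and wait until the
layers below report completion. The helper name flush_path is made up for
illustration.

```python
import os

def flush_path(path):
    """Ask the kernel to write out dirty data for `path` and wait for it.

    `path` may be a regular file or a block device node such as a
    device-mapper device (an assumption of this sketch).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # fsync() returns only after the kernel reports the data as written.
        os.fsync(fd)
    finally:
        os.close(fd)
    return True
```

This only shows the call from userspace. Whether the request really
reaches the physical disk still depends on every layer below passing the
flush on, which is exactly the question discussed here.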


> From the data integrity point of view, remounting to RO should probably
> be enough (correct me please if I am wrong:-).
Great :) And I guess you can speak for both lvm and dm-crypt ?! :)
And it should probably also flush through md,... as it's also dm?


> But from the security point of view dm-crypt encryption key remains in memory
> because you cannot properly remove LUKS device thus wipe the key.
>
> Anyone with proper boot image can recover such key from RAM memory using
> so called cold-boot attack.
How long does the key stay in RAM (after poweroff)?
And after reboot.... isn't everything set to 0x0? Otherwise,... booting
e.g. another OS or a compromised Linux could leak the key...


> You have several options how to solve this, but I am afraid all require
> some kind of ramdisk, where are the basic tools are copied before unmounting
> root-fs and unmapping devices and reboot.
I've already feared that... so we need de-initramfs? ;)


> (For non-root devices it is easy, you can even call luksSuspend to wipe
> key on still active device as workaround before reboot.
I guess non-root devices should be cleanly closed, with luksClose, or
not?


> After luksSuspend
> device is frozen - until the key is provided back using luksResume.
> So only some e.g. page cache leaks of plaintext data are possible -
> but not encryption key itself.)
Isn't it possible to patch the kernel so that, whenever it halts or
reboots, it "simply" wipes _ALL_ available dm-crypt keys?
And why/how are plaintext leaks possible?


> I mean something like this on shutdown:
>
> - create ramdisk containing basic utilities
> (mount, sync, lvm, cryptsetup, halt, etc)
> - remount device read-only, iow sync and flush write IO
> - switch to ramdisk, all command now must run from there
> - try to cleanly unmount root-fs, deactivate underlying LV, deactivate LUKS
> - if deactivation fails, fallback to wipe LUKS device key in memory
> using luksSuspend
> (more options here, like trying to dmsetup remove -f do remap to error target,
> which disconnects underlying devices and allows deactivate them,
> but it is quite dangerous)
> - reboot
>
> (sounds like we need shutdownramfs but initramfs can be probably reused here:-)
Already thought about that before, but it seems impossible to me
to convince distros to do that...
And it's quite complex I guess,... given the fact that there are
basically arbitrary ways to stack your block devices...
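
The quoted shutdown steps can be sketched as a small planner. This is only
an illustration under stated assumptions: the stack description, the
device names (cryptroot, vg0/root, /dev/md0) and the function name
teardown_plan are hypothetical, and the sketch only builds the command
lists; it does not run anything or handle the switch to a ramdisk.

```python
# Hypothetical description of one stack, listed bottom-up (disk side first).
EXAMPLE_STACK = [
    ("md", "/dev/md0"),
    ("crypt", "cryptroot"),
    ("lvm", "vg0/root"),
    ("fs", "/"),
]

def teardown_plan(stack):
    """Return (command, fallback) pairs, topmost layer first.

    `fallback` is the command to try if `command` fails, or None.
    """
    plan = []
    for kind, name in reversed(stack):
        if kind == "fs":
            plan.append((["umount", name], None))
        elif kind == "lvm":
            plan.append((["lvchange", "-an", name], None))
        elif kind == "crypt":
            # If the mapping cannot be closed, at least wipe the key.
            plan.append((["cryptsetup", "luksClose", name],
                         ["cryptsetup", "luksSuspend", name]))
        elif kind == "md":
            plan.append((["mdadm", "--stop", name], None))
        else:
            raise ValueError("unknown layer kind: %r" % kind)
    return plan

if __name__ == "__main__":
    for command, fallback in teardown_plan(EXAMPLE_STACK):
        print(" ".join(command), "| fallback:", fallback)
```

Keeping the plan as data makes the "arbitrary stacking" part less scary:
the order is simply the reverse of the order used at boot, and only the
dm-crypt layer needs a fallback.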


Right now when I shut down, I get errors for lvm/dm-crypt/md, as
they all can't close their devices, because the root-fs is just ro-mounted
(ok, the Debian cryptsetup package seems not to display that error, but
it's probably there).
Nevertheless,... what should "we" do now?
- Always seeing a "failed" is rather ugly
- One could simply not call the appropriate initscripts for stopping in
rc0 and rc6.
This would however affect all such devices,... not only those where the
root-fs is on top.
But I guess it's rather complex to find out the correct ones and skip
the error-message only for them...
And it does not solve the crypto-leak issue :(
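
Finding "the correct ones" might be less complex than it sounds. The
sketch below assumes that the kernel lists the devices a block device is
built on under /sys/class/block/<name>/slaves, and walks that tree
recursively. The function name devices_below is made up, and mapping the
root mount to its kernel name (e.g. dm-1) is not shown.

```python
import os

def devices_below(name, sysfs_root="/sys/class/block"):
    """Return the kernel names of all block devices that `name` sits on.

    `sysfs_root` is a parameter so the walk can be tried on a fake tree.
    """
    found = []
    slaves_dir = os.path.join(sysfs_root, name, "slaves")
    if os.path.isdir(slaves_dir):
        for slave in sorted(os.listdir(slaves_dir)):
            if slave not in found:
                found.append(slave)
            # Descend into the layers underneath this slave device.
            for deeper in devices_below(slave, sysfs_root):
                if deeper not in found:
                    found.append(deeper)
    return found
```

A stop script could then skip (or stay quiet about) every device that
shows up in devices_below() for the root device, and still close all the
others normally.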



Cheers,
Chris.

From: Christoph Anton Mitterer on
On Sat, 2010-06-26 at 21:17 +0200, Milan Broz wrote:
> IIRC barriers are fully supported for DM devices since 2.6.31
thx.

> but every fs must properly flush IO on umount or remount RO
> (using whatever method is possible).
But... shouldn't literally every filesystem do that? At least all the
ones that are used as rootfs (ext234,btrfs,reiser34,xfs,jfs,... I guess)


> Just that every command run will issue some reads when it need to load
> libraries etc.
But that shouldn't do any harm if you just remount,ro, as long as all
the different block layers (dm-crypt/lvm2/md/etc.) are still there, which,
as far as I understood you, is definitely the case, since one cannot
remove them as long as something above uses them.


> See http://citp.princeton.edu/memory/
thx =)


> (just FYI: luksSuspend is simple wrapper for dm-crypt wipe key functionality;
> it was written to provide tool to safely wipe all dm-crypt encryption keys
> in memory without need to close device, it is de-initialising cryptoAPI
> modules etc.
Then why can't one do this at the dm-crypt device which has root-fs on
top? If the data flushing is secure as you've said before,... we should
be fine...
Oh... wait... if the key was gone we could not call halt / reboot
anymore, right?


> It was intended to provide something which can help "suspend to RAM" to be
> at least partially immune to cold boot attack on RAM-suspended laptop.)
I've always thought suspend-to-RAM would be generally insecure with full
disk encryption,... and only suspend-to-disk could be made secure (with
encrypted swap)?!


> > I guess non-root devices should be cleanly closed, with luksClose, or
> > not?
> yes. But for security reasons there still should be some fallback if it
> fails - so at least key is properly wiped if all other attempts to unmount fails.
So you mean,... if non-root devices are shut down,.. but the umount
doesn't (still blocking or so)... then we should still clear the keys?
Wouldn't that lead to data corruption?


> That will solve only one problem. Then you will have some device which
> need ioctl to wipe its state (including possible sensitive data in
> its buffers) etc. It is just an example, hack to wipe memory on reboot is
> not proper solution.
But we don't do all this right now either, right?

> I think this should be solved systematically - clean shutdown should keep
> machine in safe state.
Would be great....


> It is always amusing these discussions which super-hyper encryption mode
> to use and then we are not able to properly shutdown it, keeping the root-fs
> encryption key in memory.
> Seriously, this is problem and must be solved if we are serious with
> full disk encryption.
It's especially interesting,.. as with modern checksummed filesystems
fully encrypted systems should also provide some level of
integrity/authenticity... as it's like a MAC.


> > And it's quite complex I guess,... given the fact that there are
> > basically arbitrary ways to stack your block devices...
> It is exactly the same complexity like in initramfs, just steps are reversed.
Yes...


> But I think that solution which 1) flush page cache 2) remount read-only
> and 3) wipe all encryption keys for remaining devices is enough.
What about the "some device which need ioctl to wipe its state
(including possible sensitive data in its buffers)" ?
After step (3) we would no longer be able to invoke the init scripts which
come after "umountroot", e.g. reboot/halt, would we?


> > Right now when I shutdown,... I get errors for lvm/dm-crypt/md,... as
> > they all can't close their devices,... as the root-fs is just ro-mounted
> > (ok the Debian cryptsetup package seems to not display that error,.. but
> > it's probably there).
>
> cryptsetup should print error when device is busy on luksClose,
> report it to upstream (currently probably me) if not:-)
> (It prints "Device <dev> is busy." in that situation and fails.)
>
> Anyway, I am sure that other distro maintainers must have solved a similar problem...
For Debian I was planning to organise a round table bringing together all
the related maintainers (lvm2, cryptsetup, mdadm, perhaps even initscripts
because of the shutdown stuff).

Less for that security-related problem, but more for the question of
how we can provide a generic way for root/non-root devices to be
stacked, so that the initramfs is fitted with only the required tools
(which means we must build a tree starting from the fstab entry down to
the bottom, to really get every device which can hold parts of root),
that the different stuff (e.g. vgchange) is called in the right order in
it, and the same for the non-root devices, which are mounted
outside the initramfs.

And of course the same vice versa when shutting down...
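
The "tree from the fstab entry down to the bottom" idea can be sketched
with a plain dependency map and a topological sort. Everything here is a
hypothetical illustration: the device names, the EXAMPLE_DEPS layout and
the function names are made up, and the sketch assumes Python 3.9+ for
graphlib.

```python
from graphlib import TopologicalSorter

# Hypothetical layout: each entry maps to the devices it is built on.
EXAMPLE_DEPS = {
    "/": ["vg0-root"],
    "/home": ["vg0-home"],
    "vg0-root": ["cryptroot"],
    "vg0-home": ["cryptroot"],
    "cryptroot": ["md0"],
    "md0": ["sda1", "sdb1"],
}

def needed_for(target, deps):
    """Return the part of `deps` that `target` depends on, itself included."""
    sub = {}
    todo = [target]
    while todo:
        node = todo.pop()
        if node not in sub:
            sub[node] = list(deps.get(node, []))
            todo.extend(sub[node])
    return sub

def startup_order(deps):
    """Lower layers first, so each device is set up after what it sits on."""
    return list(TopologicalSorter(deps).static_order())

def shutdown_order(deps):
    """The startup order reversed: topmost layers are torn down first."""
    return list(reversed(startup_order(deps)))
```

Here startup_order(needed_for("/", EXAMPLE_DEPS)) would give what the
initramfs has to set up, while the full map covers the non-root devices
that are handled later by the normal init scripts.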


Which place would you suggest for such discussions (not sure if lkml is
the right place ;) )? I mean it would be great if people from
multiple distros (e.g. you from RedHat) took part from the start. I know
that at least Jonas from Debian is already following this :)
And I guess it's generally great if you take part, as a dm-crypt/lvm
developer :) you easily see if something stupid is done ;)
I mean for some parts of the whole picture I already have some ideas how
things should work, but I guess I have far too little technical knowledge
(as you see from my questions ;) ) to do this (alone).


Best wishes,
Chris.

From: Christoph Anton Mitterer on
FYI:

On Sun, 2010-06-27 at 01:10 +0200, Christoph Anton Mitterer wrote:
> Which place would you suggest for such discussions (not sure if lkml is
> the right place ;) )?
I've put up a wiki site which tries to describe and discuss this and
related issues.

This could serve as a first place, until something better is suggested.

Of course, this does not mean that we have to stop the discussion here at
lkml :)

It's on the Debian wiki, but I guess (as most of them are probably
inter-distro-issues ^^) all people are happily invited to join
discussion :) .

I also tried to separate the generic discussion from everything
Debian-related.


Best wishes,
Chris.

From: Christoph Anton Mitterer on
Argl,... forgot to include the URI ;)

http://wiki.debian.org/AdvancedStartupShutdownWithMultilayeredBlockDevices

Cheers,
Chris.
