From: Massimiliano Galanti on
> using XIP (where possible, e.g. on NORs)? That would not put .data

err... _.ro_data (and .text)


--
Massimiliano
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Massimiliano Galanti on
Well, not quite. See e.g. AXFS or squashfs, which are read-only and
flash-oriented by design.

Anyway, if you're stuck with NTFS/VFAT and can't use NOR...

(Just curious what technology you are relying on for storage.)

On 10/06/2010 21:42, Brian Gordon wrote:
> Sorry, I take it back. This won't work for me because I won't have
> NOR. Also, I only want the "in-place" to apply to read-only pages.
> This looks like all reads and writes get passed to the underlying
> storage and I can't suffer flash page erase/writes to update a
> variable. :) The device will wear out and meaningful work would be
> starved.

--
Massimiliano
From: Brian Gordon on
Storage will probably be something really cheap, so I assume flash:
possibly a USB-stick-type device, or maybe an IDE-based solid-state
storage device.


On Thu, Jun 10, 2010 at 1:52 PM, Massimiliano Galanti
<massiblue(a)libero.it> wrote:
> Well, not quite. See e.g. AXFS or squashfs, which are read-only and
> flash-oriented by design.
>
> Anyway, if you're stuck with NTFS/VFAT and can't use NOR...
>
> (Just curious what technology you are relying on for storage.)
>
> On 10/06/2010 21:42, Brian Gordon wrote:
>>
>> Sorry, I take it back. This won't work for me because I won't have
>> NOR. Also, I only want the "in-place" to apply to read-only pages.
>> This looks like all reads and writes get passed to the underlying
>> storage and I can't suffer flash page erase/writes to update a
>> variable. :) The device will wear out and meaningful work would be
>> starved.
>
> --
> Massimiliano
>
From: Henrique de Moraes Holschuh on
On Thu, 10 Jun 2010, Chris Friesen wrote:
> On 06/10/2010 11:29 AM, Brian Gordon wrote:
> > When these SEU can be detected some action may be taken to improve
> > the behaviour of the system (log a fault and reset in order to
> > refresh things from scratch?). So the first question becomes how to
> > detect an SEU.
>
> I do work in telco stuff. We use ECC RAM, turn on ECC/parity on the
> various buses, enable error-checking in the hardware, etc.

Let's not forget that the hardware had better have unassisted scrubbing
(rewriting cells where a CE, i.e. a correctable error, is detected),
because we don't scrub when we are notified of a CE.

Background scrubbing might also be something to look for (run over all
RAM over a large period of time, to catch dormant CEs and fix them
before they become UEs, i.e. uncorrectable errors).
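
As a rough illustration of the background scrubbing idea, here is a
userspace Python toy, not kernel or memory-controller code. It assumes
a Hamming(7,4) code per 4-bit value as a stand-in for real ECC (real
memory uses wider codes), and the names encode, syndrome and
scrub_slice are made up for this sketch. Each call checks a small slice
of "memory" and rewrites any word with a single flipped bit, so that a
second flip in the same word does not get the chance to make the error
uncorrectable.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # 1-based codeword positions holding data bits
PARITY_POSITIONS = (1, 2, 4)     # 1-based codeword positions holding parity bits


def syndrome(word):
    """XOR of the 1-based positions of all set bits; 0 means no error seen."""
    s = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value into a 7-bit Hamming(7,4) codeword."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (nibble >> i) & 1:
            word |= 1 << (pos - 1)
    # Set parity bits so that the syndrome of the full codeword is zero.
    s = syndrome(word)
    for pos in PARITY_POSITIONS:
        if s & pos:
            word |= 1 << (pos - 1)
    return word


def scrub_slice(memory, start, count):
    """Check `count` words from `start` and rewrite corrected ones in place.

    Returns (next_start, corrected) so that a caller can spread one full
    pass over many short calls.
    """
    if not memory:
        return 0, 0
    corrected = 0
    end = min(start + count, len(memory))
    for i in range(start, end):
        s = syndrome(memory[i])
        if s:
            # For a single-bit error the syndrome is the flipped position.
            memory[i] ^= 1 << (s - 1)
            corrected += 1
    return end % len(memory), corrected
```

A caller would invoke scrub_slice() from a low-priority timer, feeding
next_start back in each time. Note that plain Hamming(7,4) miscorrects
double-bit errors rather than flagging them, which is exactly why the
scrub has to come around before a second flip lands in the same word.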

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
From: Borislav Petkov on
From: Brian Gordon <legerde(a)gmail.com>
Date: Thu, Jun 10, 2010 at 12:38:10PM -0600

Hi,

> > It's also a serious consideration for standard servers.
> Yes. Good point.
>
> > On server class systems with ECC memory hardware does that.
>
> > Normally server class hardware handles this and the kernel then reports
> > memory errors (e.g. through mcelog or through EDAC)
>
> Agreed. EDAC is a good and sane solution and most companies do this.
> Some do not due to naivety or cost reduction. EDAC doesn't cover
> processor registers and I have fairly good solutions on how to deal
> with that in tiny "home-grown" tasking systems.

No, not processor registers, but all cache levels of modern x86
processors have ECC checking capability, so the chance of corrupted
data reaching the core is minimized. Now, if a bit flip is caused by an
SEU while the data is passing through the execution units, then you
lose, I guess. For such cases, some sort of processor redundancy is
needed to compare and validate results, as you say below.
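
The compare-and-validate idea can be mimicked in software, very
loosely, by running the same computation twice and comparing the
outputs. The Python sketch below is only an analogy for the hardware
lockstep scheme: run_lockstep, LockstepMismatch, checksum and
faulty_checksum are hypothetical names, and the "fault" is injected by
hand.

```python
class LockstepMismatch(Exception):
    """Raised when the two redundant executions disagree."""


def run_lockstep(replica_a, replica_b, *args):
    """Run the same computation on two replicas and compare the results."""
    result_a = replica_a(*args)
    result_b = replica_b(*args)
    if result_a != result_b:
        raise LockstepMismatch(f"{result_a!r} != {result_b!r}")
    return result_a


def checksum(data):
    """Example workload: a simple 16-bit additive checksum."""
    return sum(data) & 0xFFFF


def faulty_checksum(data):
    """Same workload with one result bit flipped, to simulate an upset."""
    return checksum(data) ^ 0x0004
```

With only two replicas a mismatch says that something went wrong, not
which side is wrong, so the usual reaction is to log the fault and
retry or reset rather than to pick a winner.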

> On the more exotic end, I have also seen systems that have dual
> redundant processors / memories. Then they add compare logic between
> the redundant processors that compare most pins each clock cycle. If
> any pins are not identical at a clock cycle, then something has gone
> wrong (SEU, hardware failure, etc..)
>
> > Lower end systems which are optimized for cost generally ignore the
> > problem though and any flipped bit in memory will result
> > in a crash (if you're lucky) or silent data corruption (if you're unlucky)
>
> Right! And this is the area that I am interested in. Some people
> insist on lowering the cost of the hardware without considering these
> issues. One thing I want to do is to be as diligent as possible (even
> in these low cost situations) and do the best job I can in spite of
> the low cost hardware.
>
> So, some pages of RAM are going to be read-only and the data in those
> pages came from some source (file system?). Can anyone describe a
> high level strategy to occasionally provide some coverage of this data?
>
> So far I have thought about adding an MD5 hash to the page descriptor
> whenever a read-only page is first "loaded/mapped?", and then a
> background daemon could occasionally verify it.

... and if an SEU corrupts the MD5 hash itself, this should cause a page
reload, right?
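
To make the question concrete, here is a minimal userspace sketch in
Python of the hash-and-verify scheme, assuming a hypothetical PageCache
class and a load_page callback that stands in for the file system. None
of this is an existing kernel interface. The verifier cannot tell
whether the page or the stored digest took the hit; the comparison
fails either way, so in both cases it simply reloads the page and
recomputes the digest.

```python
import hashlib

PAGE_SIZE = 4096


class PageCache:
    """Toy cache of read-only pages, each with a digest taken at load time."""

    def __init__(self, load_page):
        self.load_page = load_page   # callback: page index -> bytes from backing store
        self.pages = {}              # page index -> bytearray (the "RAM" copy)
        self.digests = {}            # page index -> MD5 digest recorded at load

    def map_page(self, index):
        """Load (or reload) a page and record its digest."""
        data = bytearray(self.load_page(index))
        self.pages[index] = data
        self.digests[index] = hashlib.md5(data).digest()
        return data

    def verify_all(self):
        """One pass of the background verifier; returns the reloaded indexes."""
        reloaded = []
        for index in list(self.pages):
            if hashlib.md5(self.pages[index]).digest() != self.digests[index]:
                # Either the page or its digest was corrupted: reload both.
                self.map_page(index)
                reloaded.append(index)
        return reloaded
```

MD5 is used only because it is what was proposed; for detecting random
bit flips a cheaper checksum such as a CRC would serve the same
purpose. The cost of a corrupted digest is one unnecessary reload, not
a missed error.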

--
Regards/Gruss,
Boris.