From: Martin Mokrejs on
Hi,
I bought an external hard drive with FireWire and USB interfaces (IcyBOX IB-250StUE-B).
If I connect it to desktop computer A, I get a kernel crash during boot (see
both attached dmesg-*.txt files).

Further, laptop computer B is connected to A via firewire as well, through the
firewire-net module. I do not understand why, but on computer B I see in dmesg
complaints from firewire_sbp2 about the external drive physically connected to
computer A! Is that a bug or a feature? In any case, host B cannot really
talk to the drive (see the snippet from the 2.6.34.1 kernel on the laptop below
in the body of this email).

Sorry for mixing the two issues into a single email. Maybe they share
similar underlying causes? The desktop has 2 firewire ports and the laptop
also has 2 ports. Both have firewire_net loaded
into the running kernel, and on both machines I see only a firewire0 interface
and no additional firewire1 interface, so I wonder whether the kernel realizes
there are two physical ports on each computer; maybe it mixes together
some data or takes an action on the wrong port. You may think of yesterday's
email as yet another kernel crash and bug in the JuJu firewire stack, under the subject
"2.6.31.14: firewire_net issue in generic_sync_sb_inodes".

Thanks for any clues,
Martin



firewire_core: created device fw0: GUID 00e018000305e5fc, S400
firewire_core: created device fw1: GUID 0011d80001762a80, S400
firewire_core: created device fw2: GUID 001b8c8000000105, S400
firewire_core: refreshed device fw0
firewire_net: firewire0: IPv4 over FireWire on device 00e018000305e5fc
usb 1-1: new low speed USB device using uhci_hcd and address 2
scsi2 : SBP-2 IEEE-1394
usb 1-1: New USB device found, idVendor=0458, idProduct=0036
usb 1-1: New USB device strings: Mfr=2, Product=1, SerialNumber=0
usb 1-1: Product: NetScroll + Mini Traveler
usb 1-1: Manufacturer: Genius
firewire_sbp2: fw2.0: logged in to LUN 0000 (0 retries)
scsi 2:0:0:0: Direct-Access-RBC JMicron HDD PQ: 0 ANSI: 4
sd 2:0:0:0: Attached scsi generic sg2 type 14
sd 2:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 10 00 00 00
sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: sdb1
sd 2:0:0:0: [sdb] Attached SCSI disk
firewire_sbp2: fw2.0: sbp2_scsi_abort
firewire_sbp2: fw2.0: sbp2_scsi_abort
sd 2:0:0:0: Device offlined - not ready after error recovery
sd 2:0:0:0: [sdb] Unhandled error code
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
sd 2:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 20 00
end_request: I/O error, dev sdb, sector 0
Buffer I/O error on device sdb, logical block 0
Buffer I/O error on device sdb, logical block 1
Buffer I/O error on device sdb, logical block 2
Buffer I/O error on device sdb, logical block 3
firewire_ohci: isochronous cycle inconsistent
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: refreshed device fw1
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_ohci: isochronous cycle inconsistent
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
sd 2:0:0:0: [sdb] Synchronizing SCSI cache
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
sd 2:0:0:0: [sdb] Stopping disk
sd 2:0:0:0: [sdb] START_STOP FAILED
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
firewire_sbp2: released fw2.0, target 2:0:0
firewire_ohci: isochronous cycle inconsistent
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
firewire_ohci: isochronous cycle inconsistent
firewire_core: created device fw1: GUID 0011d80001762a80, S400
firewire_core: refreshed device fw1
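
The failing command in the log above can be decoded by hand. The following is a minimal sketch in Python, assuming the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the function name decode_read10 is made up for this illustration and is not part of any kernel tool.

```python
import struct

def decode_read10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI READ(10) command descriptor block."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Layout: opcode, flags, 4-byte big-endian LBA, group number,
    # 2-byte big-endian transfer length (in blocks), control byte.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"opcode": opcode, "lba": lba, "blocks": blocks}

# CDB copied from the "[sdb] CDB:" line in the dmesg snippet.
info = decode_read10("28 00 00 00 00 00 00 00 20 00")
print(info)
```

Under these assumptions the command is a read of 32 blocks starting at LBA 0, which matches the "I/O error, dev sdb, sector 0" line that follows it in the log.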
From: Martin Mokrejs on
Hi Jay,
thank you for your thorough explanation. Let me just briefly rephrase what
I have. The topology as of now is:

A B

VT6306 R5C552
| | | |
| ------------- firewire-net+sbp2--------------- |
| --- unused port
|
------ external drive enclosure (2 FW ports, 1USB port, one PWR port)


In other words, I did not plug two firewire cables into the two sockets on the
external drive enclosure, each coming from a different computer. I am not that
desperate a user. ;) I suspect you thought I had the external drive in between
both computers. No, I don't.

Computer A (desktop) has a VT6306 Fire II IEEE 1394 chip with 3 ports, one connected
to the external hard drive, another to computer B (laptop), used for TCP/IP
networking.

Computer B has a Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist the firewire_sbp2
driver so that the laptop does not try to access the external hard drive.

Yes, I have realized that the old firewire modules take precedence over the new
JuJu stuff. I used only the JuJu driver, but after experiencing problems I decided
to also compile the old drivers as modules. I will reproduce this with the JuJu
drivers alone once again. (I have given up on that for now and use the USB port
to transfer the data - but will retry and re-post.)

Thanks,
Martin


Jay Fenlason wrote:
> On Fri, Jul 23, 2010 at 04:09:21PM +0200, Martin Mokrejs wrote:
>> Hi,
>> I bought a external harddrive with firewire and USB interfaces (IcyBOX IB-250StUE-B).
>> If I connect it to a desktop computer A I get kernel crash during boot (see
>> both attached dmesg-*.txt files).
>>
>> Further, a laptop computer B is connected to A via firewire as well through
>> firewire-net module. I do not understand why but on computer B I see in dmesg
>> complains from firewire_sbp about the external drive physically connected to
>> computer A! Is that a bug or feature? Nevertheless, the host B cannot really
>> talk to the drive (see below snippet from 2.6.34.1 kernel on the laptop below
>> in the body of this email).
>>
>> Sorry for mixing the two issue into a single email. Maybe this is because
>> of similar underlying issues? The desktop has 2 firewire ports and the laptop
>> also 2 ports. While taking into account that both have firewire_net inserted
>> into the running kernel and on both machines I see only firewire0 interface
>> and not additional firewire1 interface I wonder whether the kernels realizes
>> there are two physical ports on each computer and maybe it mixes together
>> some data or takes an action on the wrong port. You may think of my yesterdays
>> email as of yet another kernel crash and bug in JuJu firewire stack under subject
>> "2.6.31.14: firewire_net issue in generic_sync_sb_inodes".
>
> I think you are confused about how firewire works. Firewire is a bus,
> not a point-to-point technology. Any device on a firewire bus may
> talk to any other device on the same bus, whether they are directly
> physically connected or not. Otherwise you would not be able to
> daisy-chain disks, cameras, audio devices, etc. The only way you can
> have multiple firewire busses on a device is to have multiple firewire
> controllers. (You can do this by putting two firewire PCI cards in a
> computer, or by putting a FireWire CardBus card in a laptop with an
> on-board firewire controller, but I don't know of any machines that
> ship with multiple firewire busses.) Each controller can have any
> number (*up to 63, with 1-3 being the most common) of ports on it.
>
> From what you've said above, each of your computers has a single
> firewire controller in it (lspci will tell you for sure). One of the
> computers has two ports on its controller, and the other has three.
> (This is not uncommon on many firewire based systems because the
> commonly used PHY chips support up to three ports.)
>
> Hard disks (and things that emulate them) generally allow only a
> single host to control them at a time. (Ignoring for the moment
> specialized "multi-initiator" capable hardware used for shared storage
> in clustering applications.) This is because if two machines mount
> the same (non clustering-aware) filesystem at the same time, they will
> write over each other's changes to the filesystem and eventually trash
> the filesystem's data structures beyond repair. So when you have
> created a single bus with two computers and a single hard disk on it,
> it's unsurprising that only one of the computers can successfully talk
> to it.
>
> I see in your dmesg that your 2.6.32.16-default computer is using the
> old ieee1394 stack, and not the firewire stack, so it should not
> have loaded firewire-net. It should have loaded eth1394 instead. I'm
> troubled by the traceback in nodemgr, but since the old stack is
> unmaintained and buggy, your first step should be to completely
> eliminate ieee1394, ohci1394, sbp2 and eth1394 from it and replace them
> with firewire-core, firewire-ohci, firewire-sbp2, and firewire-net on
> it. Nobody is going to bother to debug the old stack at this point.
>
> You should then either blacklist firewire-sbp2 on the computer that
> you do not want to use the external disk from, or tell firewire-sbp2
> not to try to attach to it (I believe Stefan Richter wrote directions
> on how to do that a year or two ago. Check the linux1394-devel
> archives). Otherwise both machines will race to connect to it, one of
> them will win, and the other will get errors.
>
> -- JF
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Martin Mokrejs on
Hi Jay,
I removed the old firewire modules from the kernel .config, recompiled and reinstalled
the kernel and modules on computer A, and disabled firewire_sbp2 on host B.
I have power problems with the drive when on firewire, though. It seems the desktop
PC (ASUS P5K WS 1.0004) is not able to feed the WD 1.0TB 2.5" 5200rpm drive. I tried
the firewire ports on the front of the box as well as those on the motherboard. I even
plugged in the USB power "jack", with no luck. If I use the USB port + USB power it works
fine. I just unplugged the device after not even being able to run "fdisk /dev/sdh".
At the moment I have a damaged superblock on the filesystem and will have to restart
from scratch. The attached dmesg talks about "Device offlined - not ready after error
recovery", but I hope this is a temporary issue; I only disconnected the device
at the very end.
BTW, some driver is not ACPI compliant according to dmesg.

The chip in the external IcyBox IB-250StUE-B is a JMicron JMB 353, which does the
USB+FireWire+SATA work for the 2.5" WD drive.
Martin

Jay Fenlason wrote:
> On Fri, Jul 23, 2010 at 08:38:26PM +0200, Martin Mokrejs wrote:
>> Hi Jay,
>> thank you for you thorough explanation. Let me just briefly re-phrase what
>> I have. The topology is as of now:
>>
>> A B
>>
>> VT6306 R5C552
>> | | | |
>> | ------------- firewire-net+sbp2--------------- |
>> | --- unused port
>> |
>> ------ external drive enclosure (2 FW ports, 1USB port, one PWR port)
>>
>>
>> In other words, I did not plugin two firewire cables into the two sockets on the
>> external drive enclosure, each coming from a different computer. I am not that
>> desperate user. ;) I suspect you thought I have the external drive in between
>> both computers. No, I don't.
>
> The firewire bus is very egalitarian (unlike USB). All devices on the
> bus (disk, camera, computer, etc) are devices on the bus. The bus
> doesn't care which device is in the middle of a three-node
> configuration. (Well, unless some of the devices are capable of
> speeds that the other devices can't do, but that's a special case.)
>
>> Computer A (desktop) has VT6306 Fire II IEEE 1394 chip, 3 ports, one connected
>> to the external hard drive, another to computer B (laptop) used for the TCP IP
>> networking.
>>
>> Computer B has Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist firewire_sbp
>> driver so that the laptop does not try to access the external hard drive.
>
> Yes.
>
>> Yes, I have realized that the old firewire modules take precedence over the new
>> JuJu stuff. I used only the JuJu driver but after experiencing problems I decided
>> to compile as modules also the old drivers. I will repoduce this with the JuJu
>> drivers alone once again. (I have given that up meanwhile and I use the USB port
>> to transfer the data now - but will re-try and re-post.)
>
> I'm curious about how firewire-net is doing. I know eth1394 can be
> taken down with a simple ping flood, so I hope it is more resilient
> than that.
From: Stefan Richter on
Martin Mokrejs wrote at LKML:
> Hi Jay,
> Jay Fenlason wrote:
>> On Fri, Jul 23, 2010 at 04:09:21PM +0200, Martin Mokrejs wrote:
>>> Hi,
>>> I bought a external harddrive with firewire and USB interfaces (IcyBOX IB-250StUE-B).
>>> If I connect it to a desktop computer A I get kernel crash during boot (see
>>> both attached dmesg-*.txt files).

The crash which you reported is in sbp2 (of the old ieee1394 stack alias
linux1394), not in firewire-sbp2 (of the new firewire stack alias juju).

>>> Further, a laptop computer B is connected to A via firewire as well through
>>> firewire-net module. I do not understand why but on computer B I see in dmesg
>>> complains from firewire_sbp about the external drive physically connected to
>>> computer A! Is that a bug or feature? Nevertheless, the host B cannot really
>>> talk to the drive (see below snippet from 2.6.34.1 kernel on the laptop below
>>> in the body of this email).

I comment on this further below.

>>> Sorry for mixing the two issue into a single email. Maybe this is because
>>> of similar underlying issues? The desktop has 2 firewire ports and the laptop
>>> also 2 ports. While taking into account that both have firewire_net inserted
>>> into the running kernel and on both machines I see only firewire0 interface
>>> and not additional firewire1 interface I wonder whether the kernels realizes
>>> there are two physical ports on each computer and maybe it mixes together
>>> some data or takes an action on the wrong port. You may think of my yesterdays
>>> email as of yet another kernel crash and bug in JuJu firewire stack under subject
>>> "2.6.31.14: firewire_net issue in generic_sync_sb_inodes".

I missed that thread, and almost missed this one. You could have Cc'd
linux1394-devel. Chances of getting help on specific driver issues on LKML
are slim.

The crashlog from "2.6.31.14: firewire_net issue in
generic_sync_sb_inodes" does not point to firewire-net directly. But
perhaps firewire-net corrupted some memory before that crash.

There was a bugfix for firewire-net in 2.6.33. But I believe that fix
is only necessary on SMP/ multicore machines; your notebook seems to be
a singlecore machine.

>> I think you are confused about how firewire works. Firewire is a bus,
>> not a point-to-point technology. Any device on a firewire bus may
>> talk to any other device on the same bus, whether the are directly
>> physically connected or not. Otherwise you would not be able to
>> daisy-chain disks, cameras, audio devices, etc. The only way you can
>> have multiple firewire busses on a device is to have multiple firewire
>> controllers. (You can do this by putting two firewire PCI cards in a
>> computer, or by putting a FirWire CardBus card in a laptop with an
>> on-board firewire controller, but I don't know of any machines that
>> ship with multiple firewire busses.) Each controller can have any
>> number (*up to 63, with 1-3 being the most comment) of ports on it.
>>
>> From what you've said above, each of your computers has a single
>> firewire controller in it (lspci will tell you for sure). One of the
>> computers has two ports on its controller, and the other has three.
>> (This in not uncommon on many firewire based systems because the
>> commonly used PHY chips support up to three ports.)

Absolutely; FireWire devices (including PCs/ laptops) almost always only
have a single FireWire link-layer interface, even if they have multiple
FireWire physical interfaces. A FireWire device with several ports
repeats all traffic between these ports. (Except in case of speed
capability differences of different bus segments.)

Furthermore, unlike the host-centric USB, FireWire is a peer-to-peer bus
or network. All nodes that are present on one bus see each other and
can communicate with each other regardless of the particular topology.

>> Hard disks (and things that emulate them) generally allow only a
>> single host to control them at a time. (Ignoring for the moment
>> specialized "multi-initiator" capable hardware used for shared storage
>> in clustering applications.) This is because if two machines mount
>> the same (non clustering-aware) filesystem at the same time, they will
>> write over each others changes to the filesystem and eventually trash
>> the filesystem's data structures beyond repair. So when you have
>> created a single bus with two computers and a single hard disk on it,
>> it's unsurprising that only one of the computers can successfully talk
>> to it.
>>
>> I see in your dmesg that your 2.6.32.16-default computer is using the
>> old ieee1394 stack, and not the the firewire stack, so it should not
>> have loaded firewire-net. It should have loaded eth1394 instead.

On Gentoo Linux and many other distributions, eth1394 is blacklisted
(i.e. never automatically loaded). This is because distributors don't
like it when eth1394 messes up the "eth%d" networking interface namespace.

firewire-net on the other hand is not blacklisted (but also won't
intermix with the names of Ethernet interfaces). Hence, if a Linux PC
which has firewire-net installed is plugged into a bus with an
IPv4-over-1394 capable node present, firewire-net will be auto-loaded
regardless whether the FireWire controller is driven by ohci1394 or
firewire-ohci at that time.

If ohci1394 is at the helm at that moment, firewire-net will of course
do nothing but take up space.

>> I'm troubled by the traceback in nodemgr, but since the old stack is
>> unmaintained and buggy, your first step should be to completely
>> eliminate iee1394, ohci1394, sbp2 and eth1394 from it and replace them
>> with firewire-core, firewire-ohci, firewire-sbp2, and firewire-net on
>> it. Nobody is going to bother to debug the old stack at this point.

Exactly. ieee1394, sbp2, ohci1394 etc. are planned to be deleted in
2.6.37(-rc1), which will apparently be in less than 3 months.

While a crash bug is something pretty severe, there are simply no
resources to chase them anymore.

>> You should then either blacklist firewire-sbp2 on the computer that
>> you do not want to use the external disk from, or tell firewire-sbp2
>> not to try to attach to it (I believe Stefan Richter wrote directions
>> on how to do that a year or two ago. Check the linux1394-devel
>> archives).

Did I? Right now I would say, just blacklist firewire-sbp2 (and sbp2)
on the machine that is not supposed to log into the disk.
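
As a rough illustration of the blacklisting step, here is a small Python sketch that renders the two "blacklist" lines in modprobe.d syntax. The helper name blacklist_conf and the target file name are hypothetical; only the "blacklist <module>" directive itself is standard modprobe.d syntax.

```python
def blacklist_conf(modules):
    """Render modprobe.d 'blacklist' directives, one line per module name."""
    return "".join(f"blacklist {name}\n" for name in modules)

# Modules named in this thread for the new and the old stack.
conf_text = blacklist_conf(["firewire-sbp2", "sbp2"])
print(conf_text, end="")

# The text would then be saved as root under /etc/modprobe.d/, for example
# as no-sbp2.conf (hypothetical file name, any *.conf name should do).
```

Note that a blacklist entry only stops automatic loading by alias; an explicit "modprobe firewire-sbp2" still loads the driver.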

>> Otherwise both machines will race to connect to it, one of
>> them will win, and the other will get errors.
>>
>> -- JF

(Which is harmless except for the fact that which of the two initiators
wins the login might not be the one that you wanted.)

> thank you for you thorough explanation. Let me just briefly re-phrase what
> I have. The topology is as of now:
>
> A B
>
> VT6306 R5C552
> | | | |
> | ------------- firewire-net+sbp2--------------- |
> | --- unused port
> |
> ------ external drive enclosure (2 FW ports, 1USB port, one PWR port)
>
>
> In other words, I did not plugin two firewire cables into the two sockets on the
> external drive enclosure, each coming from a different computer. I am not that
> desperate user. ;) I suspect you thought I have the external drive in between
> both computers. No, I don't.
>
> Computer A (desktop) has VT6306 Fire II IEEE 1394 chip, 3 ports, one connected
> to the external hard drive, another to computer B (laptop) used for the TCP IP
> networking.

For IPv4 over 1394 as well as for SBP-2 it does not matter whether the
physical order is disk--A--B or A--disk--B or A--B--disk.
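
A toy model may make this concrete. The Python sketch below assumes only what is said above, namely that every multi-port node repeats traffic between its ports, and it ignores speed differences between bus segments. The function name reachable and the node labels are made up for the illustration.

```python
from collections import deque

def reachable(chain, start):
    """Return every node reachable from `start` on a daisy chain of nodes."""
    # Each node is cabled only to its immediate neighbours in the chain.
    links = {node: set() for node in chain}
    for left, right in zip(chain, chain[1:]):
        links[left].add(right)
        links[right].add(left)
    # Breadth-first walk: a node that repeats traffic forwards to its peers.
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for peer in links[node] - seen:
            seen.add(peer)
            queue.append(peer)
    return seen

orderings = [("disk", "A", "B"), ("A", "disk", "B"), ("A", "B", "disk")]
results = [reachable(chain, "B") for chain in orderings]
for chain, nodes in zip(orderings, results):
    print("--".join(chain), "->", sorted(nodes))
```

In all three orderings node B reaches the same set of nodes, which is why B's firewire-sbp2 sees the disk even though the disk is cabled to A.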

> Computer B has Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist firewire_sbp
> driver so that the laptop does not try to access the external hard drive.
>
> Yes, I have realized that the old firewire modules take precedence over the new
> JuJu stuff. I used only the JuJu driver but after experiencing problems I decided
> to compile as modules also the old drivers. I will repoduce this with the JuJu
> drivers alone once again. (I have given that up meanwhile and I use the USB port
> to transfer the data now - but will re-try and re-post.)

Older dual 1394a + USB 2.0 IcyBoxes were based on the infamous Prolific
PL3507 chip. That one's FireWire part is extremely unreliable under any OS.

Some PL3507 based disks could be made to work /somewhat/ better by
installing the latest firmware from Prolific on it. Have a look at
https://ieee1394.wiki.kernel.org/index.php/Firmware_Downloads .
Prolific's FireWire firmware updater utility works via USB 2.0 and
unsurprisingly only runs on Windows.
--
Stefan Richter
-=====-==-=- -=== =-===
http://arcgraph.de/sr/
From: Stefan Richter on
Martin Mokrejs wrote at LKML:
> Hi Jay,
> I removed the old firewire modules from kernel .config and recompiled&reinstalled
> the kernel file and modules on compiuter A, and disabled the firewire_sbp2 on host B.
> I have power problems with the drive when on firewire, though. It seems the desktop
> PC (ASUS P5K WS 1.0004) is not able to feed the WD 1.0TB 2.5 5200rpm" drive.

It should be able to, unless the ASUS motherboard is miswired. A
standards compliant bus power provider must be able to supply at least
1.5 A. Most PCs wire their FireWire ports up to 12 Volts, which gives
you at least 18 Watts. You can drive several of these WD drives off that.
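
The arithmetic behind the 18 Watts can be written out as a short Python sketch. The current and voltage are the figures quoted above; the per-drive power draw is purely an assumption for illustration and should be replaced with the value from the drive's datasheet.

```python
BUS_CURRENT_A = 1.5   # minimum current a compliant power provider supplies
BUS_VOLTAGE_V = 12.0  # typical voltage on PC FireWire ports
DRIVE_POWER_W = 5.0   # ASSUMPTION: peak draw of one 2.5" drive plus bridge chip

available_w = BUS_CURRENT_A * BUS_VOLTAGE_V        # P = I * U
drives = int(available_w // DRIVE_POWER_W)         # whole drives that fit

print(f"{available_w:.0f} W available, about {drives} drives "
      f"at an assumed {DRIVE_POWER_W:.0f} W each")
```

With the assumed 5 W per drive this prints a budget of 18 W and room for 3 drives, which is in line with "several".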

> I tried
> the firewire ports on the front of the box as well those from the motherboard.

Front panel connectors may be unreliable. Often the connections from
motherboard to front panel are cheap and outside the IEEE 1394
electrical specification.

> Even,
> plugged in the USB power "jack", no luck. If I use the USB port + USB power it works
> fine. I just unplugged the device after not being able to even run "fdisk /dev/sdh".
> At the moment I have screwed superblock on the filesystem and will have to re-start
> from scratch. The attached dmesg talks about "Device offlined - not ready after error
> recovery" but I hope this is a temporary issue, and I just disconnected the device
> at the very end.
> BTW, some driver is not ACPI compliant according to dmesg.
>
> The chip in the external IcyBox IB-250StUE-B is JMicron JMB 353 doing the USB+FireWire+SATA
> work for the 2.5" WD drive.
> Martin

Oh, I haven't heard of that chip before. So far I was only aware of
PCIe--FireWire chips from JMicron. PCIe--FireWire chips are of course
entirely different beasts than SATA--FireWire chips, but those other
FireWire offerings from JMicron do not inspire confidence.

Your dmesg shows multiple bus resets and one of the three nodes
disappearing from the bus from time to time. This points to a problem
at the physical layer (i.e. highly unreliable hardware) which
fundamentally cannot be solved by software.
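
For anyone who wants to look for the same pattern in their own logs, here is a minimal Python sketch. It counts "phy config" messages and looks for GUIDs that were created more than once. The sample lines are copied from the dmesg snippet posted earlier in this thread; the function name summarize and the choice of indicators are assumptions of this sketch, not an established diagnostic.

```python
import re
from collections import Counter

SAMPLE = """\
firewire_core: created device fw1: GUID 0011d80001762a80, S400
firewire_sbp2: fw2.0: sbp2_scsi_abort
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
firewire_core: created device fw1: GUID 0011d80001762a80, S400
"""

def summarize(log):
    """Collect rough indicators of bus instability from firewire log lines."""
    roots = re.findall(r"phy config: card \d+, new root=([0-9a-f]+)", log)
    guids = Counter(re.findall(r"created device fw\d+: GUID ([0-9a-f]+)", log))
    return {
        "phy_config_events": len(roots),
        "distinct_roots": sorted(set(roots)),
        # A GUID created twice means the node went away and came back.
        "reappeared_guids": sorted(g for g, n in guids.items() if n > 1),
        "sbp2_reconnects": log.count("reconnected to LUN"),
    }

summary = summarize(SAMPLE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A root ID that changes between ffc2 and ffc1 suggests that the number of nodes on the bus changed between resets.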
--
Stefan Richter
-=====-==-=- -=== =-===
http://arcgraph.de/sr/