From: Johannes Hirte
On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
> Neither Yan nor I have been able to reproduce this locally, but a few
> people have now hit it. Johannes, are you available to try out a
> debugging kernel to try and track this down?
>
> -chris
>
> On Thu, Jul 08, 2010 at 04:27:23PM +0200, Johannes Hirte wrote:
> > When doing a 'rm -r /var/tmp/portage/sys-devel' I get the following Oops:
> >
> > ------------[ cut here ]------------
> > kernel BUG at fs/btrfs/extent-tree.c:1353!
> > invalid opcode: 0000 [#1] PREEMPT SMP
> > last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT0/charge_full
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl
> > auth_rpcgss sunrpc sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage
> > dm_crypt dm_mod coretemp hwmon acpi_cpufreq mperf snd_hda_codec_realtek
> > uvcvideo iwl3945 snd_hda_intel snd_hda_codec iwlcore videodev r8169
> > snd_hwdep btusb snd_pcm v4l1_compat mac80211 snd_timer bluetooth snd mii
> > cfg80211 soundcore sg rfkill ac i2c_i801 snd_page_alloc uhci_hcd battery
> > [last unloaded: microcode]
> >
> > Pid: 2358, comm: rm Not tainted 2.6.35-rc4 #32 M912/M912
> > EIP: 0060:[<c10c383b>] EFLAGS: 00010202 CPU: 1
> > EIP is at lookup_inline_extent_backref+0xf2/0x406
> > EAX: 00000001 EBX: 00000007 ECX: 00000000 EDX: 00000000
> > ESI: 00000004 EDI: f7268150 EBP: 00000004 ESP: f5aa5d08
> >
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> >
> > Process rm (pid: 2358, ti=f5aa4000 task=f6f0fa70 task.ti=f5aa4000)
> >
> > Stack:
> > f702f8c0 f744e080 f665f380 000000b0 00000000 00000000 ffffffff f6c80f00
> > <0> f744e080 c10ec226 e98acfff f6c98000 00001001 0e987000 00000004 00000000
> > <0> 00000850 040e9870 a8000000 00001000 00000000 00000007 00000000 0e987000
> >
> > Call Trace:
> > [<c10ec226>] ? set_extent_dirty+0x19/0x1d
> > [<c10c5081>] ? __btrfs_free_extent+0xda/0x675
> > [<c10c88bf>] ? run_clustered_refs+0x699/0x6d7
> > [<c10d239f>] ? btrfs_mark_buffer_dirty+0xa3/0xef
> > [<c1101454>] ? btrfs_find_ref_cluster+0xf9/0x13a
> > [<c10c89bc>] ? btrfs_run_delayed_refs+0xbf/0x155
> > [<c10d3a73>] ? __btrfs_end_transaction+0x53/0x16c
> > [<c10db480>] ? btrfs_delete_inode+0x166/0x17e
> > [<c102280d>] ? get_parent_ip+0x8/0x19
> > [<c108fe5c>] ? generic_delete_inode+0x6f/0xbd
> > [<c108f5b3>] ? iput+0x46/0x48
> > [<c10893a8>] ? do_unlinkat+0xc7/0x109
> > [<c102280d>] ? get_parent_ip+0x8/0x19
> > [<c10822e3>] ? fput+0x12/0x15c
> > [<c10a2f30>] ? dnotify_flush+0x41/0xc2
> > [<c107fe85>] ? filp_close+0x4c/0x52
> > [<c107feed>] ? sys_close+0x62/0x9b
> > [<c1002550>] ? sysenter_do_call+0x12/0x26
> >
> > Code: 80 4e 68 02 8d 4c 24 43 89 f8 6a 01 ff 74 24 1c ff 74 24 08 8b 54 24 38 e8 01 c2 ff ff 83 c4 0c 83 f8 00 0f 8c e1 02 00 00 74 02 <0f> 0b 8b 04 24 8b 34 24 8b 00 8b 56 20 89 44 24 08 e8 2e fa ff
> > EIP: [<c10c383b>] lookup_inline_extent_backref+0xf2/0x406 SS:ESP 0068:f5aa5d08
> > ---[ end trace d97601f0b455ca72 ]---
> > note: rm[2358] exited with preempt_count 2
> > BUG: scheduling while atomic: rm/2358/0x10000003
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
> > snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl auth_rpcgss
> > sunrpc sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage dm_crypt
> > dm_mod coretemp hwmon acpi_cpufreq mperf snd_hda_codec_realtek uvcvideo
> > iwl3945 snd_hda_intel snd_hda_codec iwlcore videodev r8169 snd_hwdep
> > btusb snd_pcm v4l1_compat mac80211 snd_timer bluetooth snd mii cfg80211
> > soundcore sg rfkill ac i2c_i801 snd_page_alloc uhci_hcd battery [last
> > unloaded: microcode]
> > Pid: 2358, comm: rm Tainted: G D 2.6.35-rc4 #32
> >
> > Call Trace:
> > [<c12de6b3>] ? schedule+0x88/0x332
> > [<c10237c1>] ? __cond_resched+0xf/0x19
> > [<c12de9e2>] ? _cond_resched+0x12/0x18
> > [<c106ceec>] ? unmap_vmas+0x4e7/0x534
> > [<c1070c8f>] ? exit_mmap+0x64/0xa4
> > [<c1026089>] ? mmput+0x21/0x96
> > [<c102938e>] ? exit_mm+0xe7/0xf0
> > [<c12dfa28>] ? _raw_spin_unlock_irqrestore+0x1a/0x24
> > [<c103aaa1>] ? hrtimer_try_to_cancel+0x31/0x3a
> > [<c102a42e>] ? do_exit+0x17b/0x57d
> > [<c1028e78>] ? kmsg_dump+0x81/0xf9
> > [<c1002d06>] ? do_invalid_op+0x0/0x76
> > [<c1004fa0>] ? oops_end+0x72/0x75
> > [<c1002d6f>] ? do_invalid_op+0x69/0x76
> > [<c10c383b>] ? lookup_inline_extent_backref+0xf2/0x406
> > [<c10bdc9a>] ? generic_bin_search.clone.0+0x145/0x150
> > [<c10bcf30>] ? btrfs_cow_block+0x106/0x112
> > [<c10bdcdc>] ? bin_search+0x37/0x3d
> > [<c10bfe33>] ? btrfs_search_slot+0x405/0x477
> > [<c12e031a>] ? error_code+0x66/0x6c
> > [<c1002d06>] ? do_invalid_op+0x0/0x76
> > [<c10c383b>] ? lookup_inline_extent_backref+0xf2/0x406
> > [<c10ec226>] ? set_extent_dirty+0x19/0x1d
> > [<c10c5081>] ? __btrfs_free_extent+0xda/0x675
> > [<c10c88bf>] ? run_clustered_refs+0x699/0x6d7
> > [<c10d239f>] ? btrfs_mark_buffer_dirty+0xa3/0xef
> > [<c1101454>] ? btrfs_find_ref_cluster+0xf9/0x13a
> > [<c10c89bc>] ? btrfs_run_delayed_refs+0xbf/0x155
> > [<c10d3a73>] ? __btrfs_end_transaction+0x53/0x16c
> > [<c10db480>] ? btrfs_delete_inode+0x166/0x17e
> > [<c102280d>] ? get_parent_ip+0x8/0x19
> > [<c108fe5c>] ? generic_delete_inode+0x6f/0xbd
> > [<c108f5b3>] ? iput+0x46/0x48
> > [<c10893a8>] ? do_unlinkat+0xc7/0x109
> > [<c102280d>] ? get_parent_ip+0x8/0x19
> > [<c10822e3>] ? fput+0x12/0x15c
> > [<c10a2f30>] ? dnotify_flush+0x41/0xc2
> > [<c107fe85>] ? filp_close+0x4c/0x52
> > [<c107feed>] ? sys_close+0x62/0x9b
> > [<c1002550>] ? sysenter_do_call+0x12/0x26

I'm not sure if btrfs is to blame for this error. After the errors I switched
this system to XFS and now got this error:

ls -l .kde4/share/apps/akregator/data/
ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure needs cleaning
total 4
?????????? ? ? ? ? ? feeds.opml
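
For context, "Structure needs cleaning" is the strerror() text for EUCLEAN, the errno value that XFS's internal EFSCORRUPTED maps to when it detects on-disk corruption, and the "?" fields above mean that stat(2) failed on the entry. A minimal sketch of what ls runs into here (plain C, nothing XFS-specific assumed):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int main(void)
{
	struct stat st;
	const char *path = ".kde4/share/apps/akregator/data/feeds.opml";

	if (stat(path, &st) != 0) {
		/* On the corrupt inode above this prints:
		 *   stat failed: Structure needs cleaning   (EUCLEAN) */
		fprintf(stderr, "stat failed: %s\n", strerror(errno));
		return 1;
	}
	printf("nlink=%lu size=%lld\n",
	       (unsigned long)st.st_nlink, (long long)st.st_size);
	return 0;
}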

xfs_check is showing this:

xfs_check /dev/sda3
link count mismatch for inode 219998792 (name ?), nlink 0, counted 1
disconnected inode 220064328, nlink 1
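
For reference, the first line means the on-disk link count of inode 219998792 is 0 even though xfs_check counted one directory entry still referencing it; the second means inode 220064328 is allocated (nlink 1) but reachable from no directory at all. Assuming /dev/sda3 can be unmounted, xfs_repair would normally correct the counts and reattach the disconnected inode under lost+found:

umount /dev/sda3
xfs_repair /dev/sda3   # reconnects orphaned inodes into lost+found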

This is now the second filesystem on which I've suddenly gotten errors, so I
think the problem lies deeper. Adding some CCs for this.


regards,
Johannes
From: Dave Chinner
On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
> I'm not sure if btrfs is to blame for this error. After the errors I switched
> this system to XFS and now got this error:
>
> ls -l .kde4/share/apps/akregator/data/
> ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure needs cleaning
> total 4
> ?????????? ? ? ? ? ? feeds.opml

What is the error reported in dmesg when the XFS filesystem shuts down?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
From: Johannes Hirte
On Thursday 15 July 2010, 02:11:04, Dave Chinner wrote:
> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> > On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
> > I'm not sure if btrfs is to blame for this error. After the errors I
> > switched this system to XFS and now got this error:
> >
> > ls -l .kde4/share/apps/akregator/data/
> > ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure needs cleaning
> > total 4
> > ?????????? ? ? ? ? ? feeds.opml
>
> What is the error reported in dmesg when the XFS filesystem shuts down?

Nothing. I double-checked the logs. There are only the messages from mounting
the filesystem. No errors are reported other than the inaccessible file and
the output from xfs_check.

regards,
Johannes
From: Johannes Hirte
On Thursday 15 July 2010, 20:14:51, Johannes Hirte wrote:
> On Thursday 15 July 2010, 02:11:04, Dave Chinner wrote:
> > On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> > > On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
> > > I'm not sure if btrfs is to blame for this error. After the errors I
> > > switched this system to XFS and now got this error:
> > >
> > > ls -l .kde4/share/apps/akregator/data/
> > > ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure needs cleaning
> > > total 4
> > > ?????????? ? ? ? ? ? feeds.opml
> >
> > What is the error reported in dmesg when the XFS filesystem shuts down?
>
> Nothing. I double-checked the logs. There are only the messages from
> mounting the filesystem. No errors are reported other than the
> inaccessible file and the output from xfs_check.

I'm now running a kernel with more debug options enabled and got this:

[ 6794.810935]
[ 6794.810941] =================================
[ 6794.810955] [ INFO: inconsistent lock state ]
[ 6794.810966] 2.6.35-rc4-btrfs-debug #7
[ 6794.810975] ---------------------------------
[ 6794.810984] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 6794.810996] kswapd0/361 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 6794.811006] (&(&ip->i_iolock)->mr_lock#2){++++?+}, at: [<c10fa82d>] xfs_ilock+0x22/0x67
[ 6794.811039] {RECLAIM_FS-ON-W} state was registered at:
[ 6794.811046] [<c104ebc1>] mark_held_locks+0x42/0x5e
[ 6794.811046] [<c104f1f7>] lockdep_trace_alloc+0x99/0xb0
[ 6794.811046] [<c10740b8>] __alloc_pages_nodemask+0x6a/0x4a1
[ 6794.811046] [<c106edc2>] __page_cache_alloc+0x11/0x13
[ 6794.811046] [<c106fb43>] grab_cache_page_write_begin+0x47/0x81
[ 6794.811046] [<c10b2050>] block_write_begin_newtrunc+0x2e/0x9c
[ 6794.811046] [<c10b233a>] block_write_begin+0x23/0x5d
[ 6794.811046] [<c1114a9d>] xfs_vm_write_begin+0x26/0x28
[ 6794.811046] [<c106f15d>] generic_file_buffered_write+0xb5/0x1bd
[ 6794.811046] [<c1117e31>] xfs_file_aio_write+0x40e/0x66d
[ 6794.811046] [<c10950b4>] do_sync_write+0x8b/0xc6
[ 6794.811046] [<c109568b>] vfs_write+0x77/0xa4
[ 6794.811046] [<c10957f3>] sys_write+0x3c/0x5e
[ 6794.811046] [<c1002690>] sysenter_do_call+0x12/0x36
[ 6794.811046] irq event stamp: 141369
[ 6794.811046] hardirqs last enabled at (141369): [<c13639d2>] _raw_spin_unlock_irqrestore+0x36/0x5b
[ 6794.811046] hardirqs last disabled at (141368): [<c13634c5>] _raw_spin_lock_irqsave+0x14/0x68
[ 6794.811046] softirqs last enabled at (141300): [<c1032d69>] __do_softirq+0xfe/0x10d
[ 6794.811046] softirqs last disabled at (141295): [<c1032da7>] do_softirq+0x2f/0x47
[ 6794.811046]
[ 6794.811046] other info that might help us debug this:
[ 6794.811046] 2 locks held by kswapd0/361:
[ 6794.811046] #0: (shrinker_rwsem){++++..}, at: [<c10774db>] shrink_slab+0x25/0x13f
[ 6794.811046] #1: (&xfs_mount_list_lock){++++.-}, at: [<c111cc78>] xfs_reclaim_inode_shrink+0x2a/0xe8
[ 6794.811046]
[ 6794.811046] stack backtrace:
[ 6794.811046] Pid: 361, comm: kswapd0 Not tainted 2.6.35-rc4-btrfs-debug #7
[ 6794.811046] Call Trace:
[ 6794.811046] [<c13616c0>] ? printk+0xf/0x17
[ 6794.811046] [<c104e988>] valid_state+0x134/0x142
[ 6794.811046] [<c104ea66>] mark_lock+0xd0/0x1e9
[ 6794.811046] [<c104e2a7>] ? check_usage_forwards+0x0/0x5f
[ 6794.811046] [<c105003d>] __lock_acquire+0x374/0xc80
[ 6794.811046] [<c1044942>] ? sched_clock_local+0x12/0x121
[ 6794.811046] [<c1044c0b>] ? sched_clock_cpu+0x122/0x133
[ 6794.811046] [<c1050d4d>] lock_acquire+0x5f/0x76
[ 6794.811046] [<c10fa82d>] ? xfs_ilock+0x22/0x67
[ 6794.811046] [<c1043974>] down_write_nested+0x32/0x63
[ 6794.811046] [<c10fa82d>] ? xfs_ilock+0x22/0x67
[ 6794.811046] [<c10fa82d>] xfs_ilock+0x22/0x67
[ 6794.811046] [<c10faa48>] xfs_ireclaim+0x98/0xbb
[ 6794.811046] [<c1043a1e>] ? up_write+0x16/0x2b
[ 6794.811046] [<c111c78c>] xfs_reclaim_inode+0x1a7/0x1b1
[ 6794.811046] [<c111cafe>] xfs_inode_ag_walk+0x77/0xbc
[ 6794.811046] [<c111c5e5>] ? xfs_reclaim_inode+0x0/0x1b1
[ 6794.811046] [<c111cc07>] xfs_inode_ag_iterator+0x52/0x99
[ 6794.811046] [<c111cc78>] ? xfs_reclaim_inode_shrink+0x2a/0xe8
[ 6794.811046] [<c111c5e5>] ? xfs_reclaim_inode+0x0/0x1b1
[ 6794.811046] [<c111cc99>] xfs_reclaim_inode_shrink+0x4b/0xe8
[ 6794.811046] [<c1077588>] shrink_slab+0xd2/0x13f
[ 6794.811046] [<c1078cef>] kswapd+0x37d/0x4e9
[ 6794.811046] [<c104028f>] ? autoremove_wake_function+0x0/0x2f
[ 6794.811046] [<c1078972>] ? kswapd+0x0/0x4e9
[ 6794.811046] [<c103ffbc>] kthread+0x60/0x65
[ 6794.811046] [<c103ff5c>] ? kthread+0x0/0x65
[ 6794.811046] [<c1002bba>] kernel_thread_helper+0x6/0x10
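
Roughly, what lockdep is flagging (call chains taken from the traces above; a simplified sketch, not the actual locking code):

/*
 * Write path (the registered RECLAIM_FS-ON-W state): the inode's
 * i_iolock is held while a page-cache page is allocated, and that
 * allocation may recurse into memory reclaim:
 *
 *     xfs_file_aio_write()               -- takes xfs_ilock(ip)
 *       generic_file_buffered_write()
 *         grab_cache_page_write_begin()  -- GFP allocation, may
 *                                           enter direct reclaim
 *
 * Reclaim path (IN-RECLAIM_FS-W): kswapd's shrinker takes the same
 * lock class while reclaiming XFS inodes:
 *
 *     shrink_slab()
 *       xfs_reclaim_inode_shrink()
 *         xfs_reclaim_inode()
 *           xfs_ilock(ip)                -- same i_iolock class
 *
 * If an allocation made while holding i_iolock enters reclaim and
 * reclaim then tries to take the same lock class, the task can
 * deadlock on itself -- a potential hang, not corruption.
 */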

I don't know if this is related to the problem.


regards,
Johannes
From: Miao Xie
On Thu, 15 Jul 2010 20:14:51 +0200, Johannes Hirte wrote:
> On Thursday 15 July 2010, 02:11:04, Dave Chinner wrote:
>> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
>>> On Thursday 08 July 2010, 16:31:09, Chris Mason wrote:
>>> I'm not sure if btrfs is to blame for this error. After the errors I
>>> switched this system to XFS and now got this error:
>>>
>>> ls -l .kde4/share/apps/akregator/data/
>>> ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure needs cleaning
>>> total 4
>>> ?????????? ? ? ? ? ? feeds.opml
>>
>> What is the error reported in dmesg when the XFS filesystem shuts down?
>
> Nothing. I double-checked the logs. There are only the messages from mounting
> the filesystem. No errors are reported other than the inaccessible file and
> the output from xfs_check.

Is there anything wrong with your disks or memory? Sometimes bad memory can
corrupt a filesystem; I ran into this kind of problem some time ago.
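
If you have not ruled hardware out yet, an overnight run of memtest86+ plus a
SMART self-test is a reasonable first screen (assuming the disk behind
/dev/sda3 is /dev/sda):

smartctl -t long /dev/sda       # start a long offline self-test
smartctl -l selftest /dev/sda   # read the result once it completes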

If there is no problem with your disk and memory, could you tell us the
parameters you used for mkfs.btrfs and mount?

Thanks
Miao Xie