From: Jens Axboe on
On 2010-06-11 10:32, Ingo Molnar wrote:
>
> * Jens Axboe <jaxboe(a)fusionio.com> wrote:
>
>> On 2010-06-09 03:53, Linus Torvalds wrote:
>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16129
>>>> Subject : BUG: using smp_processor_id() in preemptible [00000000] code: jbd2/sda2
>>>> Submitter : Jan Kreuzer <kontrollator(a)gmx.de>
>>>> Date : 2010-06-05 06:15 (4 days old)
>>>
>>> This seems to have been introduced by
>>>
>>> commit 7cbaef9c83e58bbd4bdd534b09052b6c5ec457d5
>>> Author: Ingo Molnar <mingo(a)elte.hu>
>>> Date: Sat Nov 8 17:05:38 2008 +0100
>>>
>>> sched: optimize sched_clock() a bit
>>>
>>> sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
>>> variant of __cycles_2_ns().
>>>
>>> Most of the time sched_clock() is called with irqs disabled already.
>>> The few places that call it with irqs enabled need to be updated.
>>>
>>> Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
>>>
>>> and this seems to be one of those calling cases that need to be updated.
>>>
>>> Ingo? The call trace is:
>>>
>>> BUG: using smp_processor_id() in preemptible [00000000] code: jbd2/sda2-8/337
>>> caller is native_sched_clock+0x3c/0x68
>>> Pid: 337, comm: jbd2/sda2-8 Not tainted 2.6.35-rc1jan+ #4
>>> Call Trace:
>>> [<ffffffff812362c5>] debug_smp_processor_id+0xc9/0xe4
>>> [<ffffffff8101059d>] native_sched_clock+0x3c/0x68
>>> [<ffffffff8101043d>] sched_clock+0x9/0xd
>>> [<ffffffff81212d7a>] blk_rq_init+0x97/0xa3
>>> [<ffffffff81214d71>] get_request+0x1c4/0x2d0
>>> [<ffffffff81214ea6>] get_request_wait+0x29/0x1a6
>>> [<ffffffff81215537>] __make_request+0x338/0x45b
>>> [<ffffffff812147c2>] generic_make_request+0x2bb/0x330
>>> [<ffffffff81214909>] submit_bio+0xd2/0xef
>>> [<ffffffff811413cb>] submit_bh+0xf4/0x116
>>> [<ffffffff81144853>] block_write_full_page_endio+0x89/0x96
>>> [<ffffffff81144875>] block_write_full_page+0x15/0x17
>>> [<ffffffff8119b00a>] ext4_writepage+0x356/0x36b
>>> [<ffffffff810e1f91>] __writepage+0x1a/0x39
>>> [<ffffffff810e32a6>] write_cache_pages+0x20d/0x346
>>> [<ffffffff810e3406>] generic_writepages+0x27/0x29
>>> [<ffffffff811ca279>] journal_submit_data_buffers+0x110/0x17d
>>> [<ffffffff811ca986>] jbd2_journal_commit_transaction+0x4cb/0x156d
>>> [<ffffffff811d0cba>] kjournald2+0x147/0x37a
>>>
>>> (from the bugzilla thing)
>>
>> This should be fixed by commit 28f4197e which was merged on Friday.
>
> Hm, it's still not entirely fixed, as of 2.6.35-rc2-00131-g7908a9e. With some
> configs I get bad spinlock warnings during bootup:
>
> [ 28.968013] initcall net_olddevs_init+0x0/0x82 returned 0 after 93750 usecs
> [ 28.972003] calling b44_init+0x0/0x55 @ 1
> [ 28.976009] bus: 'pci': add driver b44
> [ 28.976374] sda:
> [ 28.978157] BUG: spinlock bad magic on CPU#1, async/0/117
> [ 28.980000] lock: 7e1c5bbc, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [ 28.980000] Pid: 117, comm: async/0 Not tainted 2.6.35-rc2-tip-01092-g010e7ef-dirty #8183
> [ 28.980000] Call Trace:
> [ 28.980000] [<41ba6d55>] ? printk+0x20/0x24
> [ 28.980000] [<4134b7b7>] spin_bug+0x7c/0x87
> [ 28.980000] [<4134b853>] do_raw_spin_lock+0x1e/0x123
> [ 28.980000] [<41ba92ca>] ? _raw_spin_lock_irqsave+0x12/0x20
> [ 28.980000] [<41ba92d2>] _raw_spin_lock_irqsave+0x1a/0x20
> [ 28.980000] [<4133476f>] blkiocg_update_io_add_stats+0x25/0xfb
> [ 28.980000] [<41335dae>] ? cfq_prio_tree_add+0xb1/0xc1
> [ 28.980000] [<41337bc7>] cfq_insert_request+0x8c/0x425
> [ 28.980000] [<41ba9271>] ? _raw_spin_unlock_irqrestore+0x17/0x23
> [ 28.980000] [<41ba9271>] ? _raw_spin_unlock_irqrestore+0x17/0x23
> [ 28.980000] [<41329225>] elv_insert+0x107/0x1a0
> [ 28.980000] [<41329354>] __elv_add_request+0x96/0x9d
> [ 28.980000] [<4132bb8c>] ? drive_stat_acct+0x9d/0xc6
> [ 28.980000] [<4132dd64>] __make_request+0x335/0x376
> [ 28.980000] [<4132c726>] generic_make_request+0x336/0x39d
> [ 28.980000] [<410ad422>] ? kmem_cache_alloc+0xa1/0x105
> [ 28.980000] [<41089285>] ? mempool_alloc_slab+0xe/0x10
> [ 28.980000] [<41089285>] ? mempool_alloc_slab+0xe/0x10
> [ 28.980000] [<41089285>] ? mempool_alloc_slab+0xe/0x10
> [ 28.980000] [<41089347>] ? mempool_alloc+0x57/0xe2
> [ 28.980000] [<4132c804>] submit_bio+0x77/0x8f
> [ 28.980000] [<410d2cbc>] ? bio_alloc_bioset+0x37/0x94
> [ 28.980000] [<410ceb90>] submit_bh+0xc3/0xe2
> [ 28.980000] [<410d1474>] block_read_full_page+0x249/0x259
> [ 28.980000] [<410d31fb>] ? blkdev_get_block+0x0/0xc6
> [ 28.980000] [<41087bfa>] ? add_to_page_cache_locked+0x94/0xb5
> [ 28.980000] [<410d3d92>] blkdev_readpage+0xf/0x11
> [ 28.980000] [<41088823>] do_read_cache_page+0x7d/0x11a
> [ 28.980000] [<410d3d83>] ? blkdev_readpage+0x0/0x11
> [ 28.980000] [<410888f4>] read_cache_page_async+0x16/0x1b
> [ 28.980000] [<41088904>] read_cache_page+0xb/0x12
> [ 28.980000] [<410e80e1>] read_dev_sector+0x2a/0x63
> [ 28.980000] [<410e92e8>] adfspart_check_ICS+0x2e/0x166
> [ 28.980000] [<41ba6d55>] ? printk+0x20/0x24
> [ 28.980000] [<410e8d23>] rescan_partitions+0x196/0x3e4
> [ 28.980000] [<41ba7dc7>] ? __mutex_unlock_slowpath+0x98/0x9f
> [ 28.980000] [<410e92ba>] ? adfspart_check_ICS+0x0/0x166
> [ 28.980000] [<410d4277>] __blkdev_get+0x1e7/0x292
> [ 28.980000] [<4133a201>] ? kobject_put+0x14/0x16
> [ 28.980000] [<410d432c>] blkdev_get+0xa/0xc
> [ 28.980000] [<410e81fb>] register_disk+0x94/0xe5
> [ 28.980000] [<413326c6>] ? blk_register_region+0x1b/0x20
> [ 28.980000] [<41332815>] add_disk+0x57/0x95
> [ 28.980000] [<41331fc6>] ? exact_match+0x0/0x8
> [ 28.980000] [<4133233f>] ? exact_lock+0x0/0x11
> [ 28.980000] [<41643848>] sd_probe_async+0x108/0x1be
> [ 28.980000] [<41048865>] async_thread+0xf5/0x1e6
> [ 28.980000] [<4102cbcb>] ? default_wake_function+0x0/0xd
> [ 28.980000] [<41048770>] ? async_thread+0x0/0x1e6
> [ 28.980000] [<410433df>] kthread+0x5f/0x64
> [ 28.980000] [<41043380>] ? kthread+0x0/0x64
> [ 28.980000] [<41002cc6>] kernel_thread_helper+0x6/0x10
> [ 29.264071] async/1 used greatest stack depth: 2336 bytes left
> [ 29.267020] bus: 'ssb': add driver b44
> [ 29.267072] initcall b44_init+0x0/0x55 returned 0 after 281250 usecs
> [ 29.267076] calling init_nic+0x0/0x16 @ 1
>
> Caused by the same blkiocg_update_io_add_stats() function. Bootlog and config
> attached. Reproducible on that sha1 and with that config.

I think I see it: the internal CFQ blkg groups are not properly
initialized. Will send a patch shortly.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ingo Molnar on

* Jens Axboe <jaxboe(a)fusionio.com> wrote:

> On 2010-06-11 10:32, Ingo Molnar wrote:
> > Caused by the same blkiocg_update_io_add_stats() function. Bootlog and config
> > attached. Reproducible on that sha1 and with that config.
>
> I think I see it, the internal CFQ blkg groups are not properly
> initialized... Will send a patch shortly.

Cool - I can test it with a short turnaround; the bug is easy to reproduce.

Thanks,

Ingo
From: Jens Axboe on
On 2010-06-11 10:55, Ingo Molnar wrote:
>
> * Jens Axboe <jaxboe(a)fusionio.com> wrote:
>
>> On 2010-06-11 10:32, Ingo Molnar wrote:
>>> Caused by the same blkiocg_update_io_add_stats() function. Bootlog and config
>>> attached. Reproducible on that sha1 and with that config.
>>
>> I think I see it, the internal CFQ blkg groups are not properly
>> initialized... Will send a patch shortly.
>
> Cool - can test it with a short turnaround, the bug is easy to reproduce.

Thanks. I need to work out what the best way to solve it is. The problem
is that if you have BLK_CGROUP set but don't enable the CFQ cgroup
support, then you end up calling the real update functions even though
CFQ has never initialized the groups they operate on.
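The failure mode can be sketched in plain userspace C. With
CONFIG_DEBUG_SPINLOCK, spin_lock_init() stamps a magic value into the
lock, and do_raw_spin_lock() reports "spinlock bad magic" when that
stamp is missing, which is what an update function hitting a
never-initialized blkg's stats lock triggers at boot. Everything below
(struct and helper names, field layout) is an illustrative stand-in,
not the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins, not the kernel's real definitions. */
#define SPINLOCK_MAGIC 0xdead4eadU

struct dbg_spinlock {
	unsigned int magic;	/* stamped by spin_lock_init() */
};

struct blkio_group_sketch {
	struct dbg_spinlock stats_lock;
};

static void sketch_spin_lock_init(struct dbg_spinlock *lock)
{
	lock->magic = SPINLOCK_MAGIC;
}

/* Mirrors the debug check in do_raw_spin_lock(): a lock whose magic
 * was never stamped is flagged as "spinlock bad magic". */
static bool sketch_spin_lock_ok(const struct dbg_spinlock *lock)
{
	return lock->magic == SPINLOCK_MAGIC;
}
```

A group that skipped initialization fails the check on its first lock
attempt, matching the ".magic: 00000000" in the bootlog above.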

--
Jens Axboe

From: Jens Axboe on
On 2010-06-11 10:55, Ingo Molnar wrote:
>>> Caused by the same blkiocg_update_io_add_stats() function. Bootlog and config
>>> attached. Reproducible on that sha1 and with that config.
>>
>> I think I see it, the internal CFQ blkg groups are not properly
>> initialized... Will send a patch shortly.
>
> Cool - can test it with a short turnaround, the bug is easy to reproduce.

Here's a nasty patch that should fix it. Not optimal, since we really
just want empty functions for these when cfq group scheduling is not
defined.
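For what "empty functions" could look like, here is a minimal sketch of
the static-inline stub pattern: when the config option is off, the
wrapper compiles to a no-op, so call sites stay free of #ifdefs. The
FEATURE_CFQ_GROUP_STATS macro and the function names are hypothetical,
chosen only for the demo:

```c
#include <assert.h>

/* Stand-in for the real stats update; counts calls so the effect of
 * the stub is observable. */
static int io_add_stat_calls;

static void real_update_io_add_stats(void)
{
	io_add_stat_calls++;
}

/* Hypothetical config switch: when it is undefined, the wrapper is a
 * no-op static inline and the caller needs no conditional code. */
#ifdef FEATURE_CFQ_GROUP_STATS
static inline void cfq_update_io_add_stats(void)
{
	real_update_io_add_stats();
}
#else
static inline void cfq_update_io_add_stats(void)
{
	/* group stats compiled out: nothing to do */
}
#endif
```

The compiler discards the empty inline entirely, so the disabled
configuration pays no runtime cost.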

CC'ing the guilty parties to come up with a better patch that does NOT
involve ifdefs in cfq-iosched.c. We want blk-cgroup.[ch] fixed up.
Also trimming the CC list a bit.


diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 5ff4f48..7067c97 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -879,7 +879,9 @@ cfq_group_service_tree_del(struct cfq_data *cfqd, struct cfq_group *cfqg)
if (!RB_EMPTY_NODE(&cfqg->rb_node))
cfq_rb_erase(&cfqg->rb_node, st);
cfqg->saved_workload_slice = 0;
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_dequeue_stats(&cfqg->blkg, 1);
+#endif
}

static inline unsigned int cfq_cfqq_slice_usage(struct cfq_queue *cfqq)
@@ -939,8 +941,10 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,

cfq_log_cfqg(cfqd, cfqg, "served: vt=%llu min_vt=%llu", cfqg->vdisktime,
st->min_vdisktime);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_timeslice_used(&cfqg->blkg, used_sl);
blkiocg_set_start_empty_time(&cfqg->blkg);
+#endif
}

#ifdef CONFIG_CFQ_GROUP_IOSCHED
@@ -1421,12 +1425,17 @@ static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq)
{
elv_rb_del(&cfqq->sort_list, rq);
cfqq->queued[rq_is_sync(rq)]--;
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_remove_stats(&(RQ_CFQG(rq))->blkg, rq_data_dir(rq),
rq_is_sync(rq));
+#endif
cfq_add_rq_rb(rq);
+
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_add_stats(&(RQ_CFQG(rq))->blkg,
&cfqq->cfqd->serving_group->blkg, rq_data_dir(rq),
rq_is_sync(rq));
+#endif
}

static struct request *
@@ -1482,8 +1491,10 @@ static void cfq_remove_request(struct request *rq)
cfq_del_rq_rb(rq);

cfqq->cfqd->rq_queued--;
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_remove_stats(&(RQ_CFQG(rq))->blkg, rq_data_dir(rq),
rq_is_sync(rq));
+#endif
if (rq_is_meta(rq)) {
WARN_ON(!cfqq->meta_pending);
cfqq->meta_pending--;
@@ -1518,8 +1529,10 @@ static void cfq_merged_request(struct request_queue *q, struct request *req,
static void cfq_bio_merged(struct request_queue *q, struct request *req,
struct bio *bio)
{
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_merged_stats(&(RQ_CFQG(req))->blkg, bio_data_dir(bio),
cfq_bio_sync(bio));
+#endif
}

static void
@@ -1539,8 +1552,10 @@ cfq_merged_requests(struct request_queue *q, struct request *rq,
if (cfqq->next_rq == next)
cfqq->next_rq = rq;
cfq_remove_request(next);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_merged_stats(&(RQ_CFQG(rq))->blkg, rq_data_dir(next),
rq_is_sync(next));
+#endif
}

static int cfq_allow_merge(struct request_queue *q, struct request *rq,
@@ -1571,7 +1586,9 @@ static int cfq_allow_merge(struct request_queue *q, struct request *rq,
static inline void cfq_del_timer(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
del_timer(&cfqd->idle_slice_timer);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_idle_time_stats(&cfqq->cfqg->blkg);
+#endif
}

static void __cfq_set_active_queue(struct cfq_data *cfqd,
@@ -1580,7 +1597,9 @@ static void __cfq_set_active_queue(struct cfq_data *cfqd,
if (cfqq) {
cfq_log_cfqq(cfqd, cfqq, "set_active wl_prio:%d wl_type:%d",
cfqd->serving_prio, cfqd->serving_type);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_avg_queue_size_stats(&cfqq->cfqg->blkg);
+#endif
cfqq->slice_start = 0;
cfqq->dispatch_start = jiffies;
cfqq->allocated_slice = 0;
@@ -1911,7 +1930,9 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
sl = cfqd->cfq_slice_idle;

mod_timer(&cfqd->idle_slice_timer, jiffies + sl);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_set_idle_time_stats(&cfqq->cfqg->blkg);
+#endif
cfq_log_cfqq(cfqd, cfqq, "arm_idle: %lu", sl);
}

@@ -1931,8 +1952,10 @@ static void cfq_dispatch_insert(struct request_queue *q, struct request *rq)
elv_dispatch_sort(q, rq);

cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]++;
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_dispatch_stats(&cfqq->cfqg->blkg, blk_rq_bytes(rq),
rq_data_dir(rq), rq_is_sync(rq));
+#endif
}

/*
@@ -3248,8 +3271,10 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
cfq_clear_cfqq_wait_request(cfqq);
__blk_run_queue(cfqd->queue);
} else {
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_idle_time_stats(
&cfqq->cfqg->blkg);
+#endif
cfq_mark_cfqq_must_dispatch(cfqq);
}
}
@@ -3276,9 +3301,11 @@ static void cfq_insert_request(struct request_queue *q, struct request *rq)
rq_set_fifo_time(rq, jiffies + cfqd->cfq_fifo_expire[rq_is_sync(rq)]);
list_add_tail(&rq->queuelist, &cfqq->fifo);
cfq_add_rq_rb(rq);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_io_add_stats(&(RQ_CFQG(rq))->blkg,
&cfqd->serving_group->blkg, rq_data_dir(rq),
rq_is_sync(rq));
+#endif
cfq_rq_enqueued(cfqd, cfqq, rq);
}

@@ -3364,9 +3391,11 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
WARN_ON(!cfqq->dispatched);
cfqd->rq_in_driver--;
cfqq->dispatched--;
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_update_completion_stats(&cfqq->cfqg->blkg, rq_start_time_ns(rq),
rq_io_start_time_ns(rq), rq_data_dir(rq),
rq_is_sync(rq));
+#endif

cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]--;

@@ -3730,7 +3759,9 @@ static void cfq_exit_queue(struct elevator_queue *e)

cfq_put_async_queues(cfqd);
cfq_release_cfq_groups(cfqd);
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
blkiocg_del_blkio_group(&cfqd->root_group.blkg);
+#endif

spin_unlock_irq(q->queue_lock);


--
Jens Axboe

From: Vivek Goyal on
On Fri, Jun 11, 2010 at 11:18:47AM +0200, Jens Axboe wrote:
> On 2010-06-11 10:55, Ingo Molnar wrote:
> >>> Caused by the same blkiocg_update_io_add_stats() function. Bootlog and config
> >>> attached. Reproducible on that sha1 and with that config.
> >>
> >> I think I see it, the internal CFQ blkg groups are not properly
> >> initialized... Will send a patch shortly.
> >
> > Cool - can test it with a short turnaround, the bug is easy to reproduce.
>
> Here's a nasty patch that should fix it. Not optimal, since we really
> just want empty functions for these when cfq group scheduling is not
> defined.
>
> CC'ing the guilty parties to come up with a better patch that does NOT
> involve ifdefs in cfq-iosched.c. We want blk-cgroup.[ch] fixed up.
> And trimming the CC list a bit.

Jens, Ingo, I am sorry for this mess.

Jens,

How about introducing "block/cfq.h" and declaring an additional set of
wrapper functions to update the blkiocg stats, making them do nothing if
CFQ_GROUP_IOSCHED=n?

For example, in linux-2.6/block/cfq.h, we can define functions as follows:

#ifdef CONFIG_CFQ_GROUP_IOSCHED
static inline void cfq_blkiocg_update_dequeue_stats(struct blkio_group *blkg,
						unsigned long dequeue)
{
	blkiocg_update_dequeue_stats(blkg, dequeue);
}
#else
static inline void cfq_blkiocg_update_dequeue_stats(struct blkio_group *blkg,
						unsigned long dequeue)
{
}
#endif

Fixing it in blk-cgroup.[ch] might not be best, as BLK_CGROUP is still set.
Secondly, if other IO control policies are added later, they might want
to make use of BLK_CGROUP even while CFQ has group IO scheduling
disabled.

I will prepare a patch and see how it looks.

Thanks
Vivek
