From: Frederic Weisbecker on
On Fri, Jul 23, 2010 at 10:49:20AM -0500, Jason Wessel wrote:
> On 07/23/2010 09:07 AM, Frederic Weisbecker wrote:
> > On Fri, Jul 23, 2010 at 08:19:54AM -0500, Jason Wessel wrote:
> >
> >> On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
> >>
> >> The patch may or may not be the right way to solve the problem. It is
> >> worth noting that early breakpoints are handled separately with a direct
> >> writes to the debug registers so this API does not apply.
> >>
> >
> >
> >
> > But you still need to handle them on the debug exception, right?
> >
> >
> >
>
> Yes, but at that point kgdb is first in line for the notifier so it
> works out of the box.


Ok.



> > Right.
> >
> > Actually NOTIFY_DONE is returned when there is more work to do: handling
> > another exception than breakpoint, or sending a signal. Otherwise yeah,
> > we return NOTIFY_STOP as we assume there is more work to do.
> >
> >
>
> For this specific case the hw_breakpoint handler simply consumed a
> breakpoint which was not intended for it.



Ah right.

But that thing is right:

/*
* Reset the 'i'th TRAP bit in dr6 to denote completion of
* exception handling
*/
(*dr6_p) &= ~(DR_TRAP0 << i);
/*
* bp can be NULL due to lazy debug register switching
* or due to concurrent perf counter removing.
*/
if (!bp) {
rcu_read_unlock();
break;
}


We need to prevent from dr7 lazy switches. It means kgdb must first check
its own breakpoints.


> > So the following alternatives appear to me:
> >
> > - Moving the breakpoint exception handling into the
> > struct perf_event:overflow_handler. In fact I can't find the breakpoint
> > handling in kgdb.c
> >
> >
>
> It is in the generic die notification handler for kgdb (looking at
> 2.6.35-rc6)
>
> arch/x86/kernel/kgdb.c
>
> 516 static int __kgdb_notify(struct die_args *args, unsigned long cmd)
> ...
> 551 case DIE_DEBUG:
> 552 if (atomic_read(&kgdb_cpu_doing_single_step) !=
> -1) {
> 553 if (user_mode(regs))
> 554 return single_step_cont(regs, args);
> 555 break;
> 556 } else if (test_thread_flag(TIF_SINGLESTEP))
> 557 /* This means a user thread is single
> stepping
> 558 * a system call which should be ignored
> 559 */
> 560 return NOTIFY_DONE;
> 561 /* fall through */



But I can't find where the breakpoints are handled there.



>
> > - Have a higher priority in kgdb notifier (which means decreasing the one
> > of hw_breakpoint.c)
> >
>
> kgdb had always been last in line in arch/x86/kernel/kgdb.c:
>
> 608 static struct notifier_block kgdb_notifier = {
> 609 .notifier_call = kgdb_notify,
> 610
> 611 /*
> 612 * Lowest-prio notifier priority, we want to be notified
> last:
> 613 */
> 614 .priority = -INT_MAX,
> 615 };



Why? It seems to me a kernel debugger should have the highest priority
over anything.



>
> > - Always returning NOTIFY_DONE from the breakpoint path.
> >
> >
>
> Without some further investigation, I am not sure what this will do.



Nothing, this NOTIFY_STOP is only an optimization. But now I think that
won't solve the problem. We still clear a dr6 trap bit for a debug
exception due to lazy dr7 switches we have to handle.

This is why kgdb should have the highest priority, or use the overflow
callback.



> We
> don't want to make things worse of course. Because kgdb uses the
> request hw_breakpoint api to request slot reservation having an
> attribute to say don't do anything to this HW breakpoint is certainly
> one way to fix it.
>
> > Is this a regression BTW?
> >
> >
>
> Absolutely this is a regression. No change was made in kgdb related to
> this and the kgdb HW breakpoint regression tests (which come with the
> kernel) stopped working and bisect to the commit I mentioned.


Yep, this new breakpoint layer has been a PITA for kgdb :)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
On Mon, Jul 26, 2010 at 07:13:23PM +0800, DDD wrote:
> Frederic Weisbecker wrote:
>> Why? It seems to me a kernel debugger should have the highest priority
>> over anything.
>
> In my option, the reason of kgdb set the lowest-prio for
> notifier is:
>
> For letting kgdb to keep simple, there is no codes to check the
> breakpoint event was generated by kgdb or not, thus it have to set kgdb
> as lowest priority to notifier.
>
> If the breakpoint event is not generated by kgdb, the source of the
> breakpoint event will consume that event before passing to kgdb's
> routine, so that the breakpoint event of kgdb getting must be generated
> by kgdb itself.



Ok, but that makes it hard to differentiate from a spurious breakpoint
event.




>>
>>
>>
>>>> - Always returning NOTIFY_DONE from the breakpoint path.
>>>>
>>>>
>>> Without some further investigation, I am not sure what this will do.
>>
>>
>>
>> Nothing, this NOTIFY_STOP is only an optimization. But now I think that
>> won't solve the problem. We still clear a dr6 trap bit for a debug
>> exception due to lazy dr7 switches we have to handle.
>>
>> This is why kgdb should have the highest priority, or use the overflow
>> callback.
>
>
> OK, I will try to use the overflow callback to let kgdb works with
> hw_breakpoints. :-)
>
> Thanks,
> Dongdong


Thanks.

kgdb sets breakpoints through arch_install_hw_breakpoint()
So when it triggers, it will be handled by perf through
perf_bp_event(), so it's quite natural it is considered as
handled and then it's bit removed from dr6. The only way
for kgdb to handle it properly is to set an overflow_handler.

Tell me if you encounter any problem.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/