From: Jeremy Allison on
On Thu, Feb 04, 2010 at 10:07:57AM +0100, Joe Ammann wrote:
> Hi all
>
> On a CentOS 5.4 system with Samba 3.0.33 (member server of an AD domain in
> 2003 native mode) I have the problem that certain users can't use the shares
> (can't log on), while others can.
>
> I *think* this is related to the fact that the users who are unable to connect
> are members of a huge number of groups (100+).
>
> We know from experience that this is a problem in Windows itself (need to set
> MaxTokenSize as discussed here http://support.microsoft.com/kb/327825) or with
> Apache mod_auth_kerb (need to set LimitRequestFieldSize in Apache).
>
> Unfortunately, I was unable to find any clear indication that this might also
> be a problem with Samba/Winbind, let alone find a solution for it. And I must
> admit that I don't have any log entries that actually point me in this
> direction, so it's more of a "feeling" :-/
>
> I just wanted to ask whether that (users being members of a huge number of AD
> groups and thus their Kerberos tickets getting really big) can be a problem at
> all with Samba/Winbind, and whether I should investigate more thoroughly along
> this line?

It could be. We depend on the underlying krb5 libraries to
do this right (falling back to TCP to get the ticket if it's
too large for UDP). What error messages do you get in the logs?

Jeremy.
--
To unsubscribe from this list go to the following URL and read the
instructions: https://lists.samba.org/mailman/options/samba
From: Joe Ammann on
Hi all

On Fri, February 5, 2010 00:02, Jeremy Allison wrote:
>> I just wanted to ask whether that (users being members of a huge number of AD
>> groups and thus their Kerberos tickets getting really big) can be a problem at
>> all with Samba/Winbind, and whether I should investigate more thoroughly along
>> this line?
>
> It could be. We depend on the underlying krb5 libraries to
> do this right (falling back to TCP to get the ticket if it's
> too large for UDP). What error messages do you get in the logs?

Sorry for the delay. I tried to reproduce this in a lab setup, but was
unable to. Even with a user that is a member of 1000 groups, access and
permission checks work. So it's probably not an issue with the sheer
number of groups.

So I investigated a bit more in the production environment (the problem
only happens there, of course :-/). I was able to identify two users, one
of whom works while the other doesn't. The failure already occurs in
winbind, so everything downstream is clearly a consequence of it. Here's
what happens:

# wbinfo -n xxxxxx
S-1-5-21-1204043072-522325977-1734762113-122312 User (1)

# wbinfo -n xxxxxxa
S-1-5-21-1204043072-522325977-1734762113-124446 User (1)

# wbinfo -i xxxxxx
xxxxxx:*:1122312:1000513:X X:/home/GLOBAL/xxxxxx:/bin/false

# wbinfo -i xxxxxxa
Could not get info for user xxxxxxa

When I raise the winbind log level to 10, here's what is logged in
wb-GLOBAL.log:

[2010/02/10 15:05:59, 4] nsswitch/winbindd_dual.c:fork_domain_child(1080)
child daemon request 21
[2010/02/10 15:05:59, 10] nsswitch/winbindd_dual.c:child_process_request(478)
process_request: request fn LOOKUPNAME
[2010/02/10 15:05:59, 3] nsswitch/winbindd_async.c:winbindd_dual_lookupname(950)
[28748]: lookupname GLOBAL\oizlama
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470)
refresh_sequence_number: GLOBAL time ok
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504)
refresh_sequence_number: GLOBAL seq number is now 390673974
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:centry_expired(544)
centry_expired: Key NS/GLOBAL/OIZLAMA for domain GLOBAL is good.
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:wcache_fetch(629)
wcache_fetch: returning entry NS/GLOBAL/OIZLAMA for domain GLOBAL
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:name_to_sid(1373)
name_to_sid: [Cached] - cached name for domain GLOBAL status: NT_STATUS_OK
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:cache_store_response(2267)
Storing response for pid 28750, len 3240
[2010/02/10 15:05:59, 4] nsswitch/winbindd_dual.c:fork_domain_child(1080)
child daemon request 60
[2010/02/10 15:05:59, 10] nsswitch/winbindd_dual.c:child_process_request(478)
process_request: request fn DUAL_USERINFO
[2010/02/10 15:05:59, 3] nsswitch/winbindd_user.c:winbindd_dual_userinfo(141)
[28748]: lookupsid S-1-5-21-1204043072-522325977-1734762113-124446
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470)
refresh_sequence_number: GLOBAL time ok
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504)
refresh_sequence_number: GLOBAL seq number is now 390673974
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:query_user(1652)
query_user: [Cached] - doing backend query for info for domain GLOBAL
[2010/02/10 15:05:59, 3] nsswitch/winbindd_ads.c:query_user(453)
ads: query_user
[2010/02/10 15:05:59, 10] nsswitch/winbindd_ads.c:ads_cached_connection(46)
ads_cached_connection
[2010/02/10 15:05:59, 7] nsswitch/winbindd_ads.c:ads_cached_connection(59)
Current tickets expire in 35909 seconds (at 1265846668, time is now 1265810759)
[2010/02/10 15:05:59, 1] nsswitch/winbindd_ads.c:query_user(474)
query_user(sid=S-1-5-21-1204043072-522325977-1734762113-124446): Not found
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470)
refresh_sequence_number: GLOBAL time ok
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504)
refresh_sequence_number: GLOBAL seq number is now 390673974
[2010/02/10 15:05:59, 1] nsswitch/winbindd_user.c:winbindd_dual_userinfo(152)
error getting user info for sid S-1-5-21-1204043072-522325977-1734762113-124446
[2010/02/10 15:05:59, 10] nsswitch/winbindd_cache.c:cache_store_response(2267)
Storing response for pid 28750, len 3240

I can't really see anything going wrong apart from the
"query_user(sid=...): Not found" message.
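
At log level 10 the interesting lines are easy to miss. The following is a minimal sketch, not a Samba tool, that keeps only entries at or below a chosen debug level. It assumes the "[date time, level] file:function(line)" header layout seen in the excerpt, and it tolerates a header whose source location has been wrapped onto the next line. The names parse_entries and important are made up for this illustration.

```python
import re

# Header layout assumed from the log excerpt: "[YYYY/MM/DD HH:MM:SS, LEVEL] location"
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(\d+)\]\s*(.*)$")


def parse_entries(text):
    """Split a winbind log excerpt into entries (time, level, location, message)."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = {
                "time": match.group(1),
                "level": int(match.group(2)),
                "location": match.group(3),
                "message": [],
            }
            entries.append(current)
        elif current is not None:
            if not current["location"]:
                # The location was wrapped onto the line after the header.
                current["location"] = line
            else:
                current["message"].append(line)
    return entries


def important(text, max_level=1):
    """Return only the entries logged at debug level <= max_level."""
    return [e for e in parse_entries(text) if e["level"] <= max_level]


SAMPLE = """\
[2010/02/10 15:05:59, 7] nsswitch/winbindd_ads.c:ads_cached_connection(59)
Current tickets expire in 35909 seconds (at 1265846668, time is now
1265810759)
[2010/02/10 15:05:59, 1] nsswitch/winbindd_ads.c:query_user(474)
query_user(sid=S-1-5-21-1204043072-522325977-1734762113-124446): Not found
[2010/02/10 15:05:59, 10]
nsswitch/winbindd_cache.c:refresh_sequence_number(470)
refresh_sequence_number: GLOBAL time ok
"""

for entry in important(SAMPLE):
    print(entry["level"], entry["location"], " ".join(entry["message"]))
```

Run against the full excerpt, this leaves only the two level 1 entries, both of which concern the failed user lookup for the SID ending in 124446.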

Here's the smb.conf:

[global]
realm = GLOBAL.SZH.LOC
workgroup = GLOBAL

security = ads

local master = no
preferred master = no

template shell = /bin/false
template homedir = /home/%D/%U

idmap domains = GLOBAL
idmap config GLOBAL:backend = rid
idmap config GLOBAL:base_rid = 0
idmap config GLOBAL:range = 1000000 - 1999999

winbind use default domain = Yes
winbind enum users = No
winbind enum groups = No
winbind nested groups = Yes

log level = 1 winbind:10
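
As a sanity check on the idmap settings, the rid backend computes a Unix ID as RID - base_rid + low end of the range. The sketch below applies that formula to the two SIDs from the wbinfo -n output, using the values from the configuration above. The helper names rid_of and rid_to_unix_id are made up for this illustration.

```python
# Values taken from the smb.conf above.
BASE_RID = 0
RANGE_LOW, RANGE_HIGH = 1000000, 1999999


def rid_of(sid):
    """Return the relative identifier, i.e. the last component of the SID."""
    return int(sid.rsplit("-", 1)[1])


def rid_to_unix_id(rid, base_rid=BASE_RID, low=RANGE_LOW, high=RANGE_HIGH):
    """idmap rid formula: unix_id = RID - base_rid + low end of the range."""
    unix_id = rid - base_rid + low
    if not low <= unix_id <= high:
        raise ValueError("RID %d maps outside the range %d-%d" % (rid, low, high))
    return unix_id


working = "S-1-5-21-1204043072-522325977-1734762113-122312"
failing = "S-1-5-21-1204043072-522325977-1734762113-124446"

print(rid_to_unix_id(rid_of(working)))  # 1122312, the UID shown by wbinfo -i
print(rid_to_unix_id(rid_of(failing)))  # 1124446, also inside the range
print(rid_to_unix_id(513))              # 1000513, the GID shown by wbinfo -i
```

Both users map inside the configured range, so the ID mapping itself does not look like the reason the second lookup fails.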

Any hints?

CU, Joe
From: Joe Ammann on
On Wed, February 10, 2010 15:08, Joe Ammann wrote:
> Sorry for the delay. I tried to reproduce this in a lab setup, but was
> unable to. Even with a user that is a member of 1000 groups, access and
> permission checks work. So it's probably not an issue with the sheer
> number of groups.
>
> So I investigated a bit more in the production environment

Some more testing revealed that the group lookups actually seem to work:

For the user that works

# wbinfo --user-domgroups=S-1-5-21-1204043072-522325977-1734762113-122312
S-1-5-21-1204043072-522325977-1734762113-122312
..... and so on, total 97 sids

For the user that does not work

# wbinfo --user-domgroups=S-1-5-21-1204043072-522325977-1734762113-124446
S-1-5-21-1204043072-522325977-1734762113-124446
..... and so on, total 131 sids

Also, wbinfo -r does work for both users:

# wbinfo -r xxxxxx | wc -l
225
# wbinfo -r xxxxxxa | wc -l
313
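
To see whether a particular group is involved, the two SID lists can be compared. This is a minimal sketch that assumes the output of wbinfo --user-domgroups has been saved as text for each user. The function names are made up, and the SIDs in the sample are shortened placeholders, not values from the production domain.

```python
def sids(output):
    """Collect the SID lines from saved `wbinfo --user-domgroups` output."""
    return {
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith("S-1-")
    }


def only_in_failing(working_output, failing_output):
    """Return the SIDs that only the failing user has, in sorted order."""
    return sorted(sids(failing_output) - sids(working_output))


# Placeholder data for illustration only.
working_output = """\
S-1-5-21-1-2-3-1001
S-1-5-21-1-2-3-513
"""
failing_output = """\
S-1-5-21-1-2-3-1002
S-1-5-21-1-2-3-513
S-1-5-21-1-2-3-7777
"""

for sid in only_in_failing(working_output, failing_output):
    print(sid)
```

Each SID left over is either the failing user's own SID or a group that only the failing user belongs to, which narrows down what to look at in AD.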

It really looks like the "only" thing that does not work is the user
information lookup. But I don't understand what could fail there. Besides
the name and the SID (to construct the UID/GID), I can't see what
information is taken from AD.

I'm confused ..
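
One way to check the directory side independently of winbind is to search AD for the object by its SID. The sketch below builds an LDAP filter for that. It assumes that the "ads: query_user" step is an LDAP search on the objectSid attribute, which the log excerpt suggests but does not confirm. The binary SID layout used here (revision, sub-authority count, 6-byte big-endian authority, little-endian 32-bit sub-authorities) is the standard Windows one; the function names are made up for this illustration.

```python
import struct


def sid_to_bytes(sid):
    """Encode a textual SID (S-1-5-21-...) in the binary Windows layout."""
    parts = sid.split("-")
    if len(parts) < 3 or parts[0] != "S":
        raise ValueError("not a SID: %r" % sid)
    revision = int(parts[1])
    authority = int(parts[2])
    subauths = [int(p) for p in parts[3:]]
    return (
        struct.pack("BB", revision, len(subauths))
        + authority.to_bytes(6, "big")
        + b"".join(struct.pack("<I", s) for s in subauths)
    )


def objectsid_filter(sid):
    """Build an LDAP filter matching objectSid, each byte escaped as \\xx."""
    escaped = "".join("\\%02x" % byte for byte in sid_to_bytes(sid))
    return "(objectSid=%s)" % escaped


print(objectsid_filter("S-1-5-21-1204043072-522325977-1734762113-124446"))
```

If a search with this filter returns the object for the working user but nothing for the failing one when bound as the machine account, the problem would be on the AD side (for example, permissions on the user object) rather than in the group count.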

CU, Joe