From: KOSAKI Motohiro on
Hi

> > Hi
> >
> >> void *kvmalloc(size_t size)
> >> {
> >> void *ptr;
> >>
> >> if (size < PAGE_SIZE)
> >> return kmalloc(PAGE_SIZE, GFP_KERNEL);
> >> ptr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
> >
> > Low-order GFP_KERNEL allocations never fail, so this doesn't work
> > as you expected.
>
> Hi, I suppose you mean the kmalloc allocation -- so kmalloc should fail
> iff alloc_pages_exact (unless somebody frees a heap of memory indeed)?

I mean that if the size passed to alloc_pages_exact() is less than 8 pages,
alloc_pages_exact() never fails. See __alloc_pages_slowpath().
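
The retry rule referred to here can be modelled in user space roughly as
follows. This is only a sketch, not the code in __alloc_pages_slowpath():
the MODEL_GFP_* flag values and both function names are hypothetical, and
details such as __GFP_REPEAT and OOM handling are left out. The only value
taken from the kernel is PAGE_ALLOC_COSTLY_ORDER (3, i.e. 8 pages).

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel's threshold for a "costly" allocation: order 3 = 8 pages. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_NORETRY 0x1u
#define MODEL_GFP_NOFAIL  0x2u

/* Smallest order such that (1 << order) pages cover nr_pages. */
static unsigned int order_for_pages(unsigned long nr_pages)
{
    unsigned int order = 0;

    assert(nr_pages > 0);
    while ((1UL << order) < nr_pages)
        order++;
    return order;
}

/* Simplified model of the slow-path decision: retry, or allow failure? */
static bool model_should_retry(unsigned int flags, unsigned int order)
{
    if (flags & MODEL_GFP_NORETRY)
        return false;   /* caller asked for a quick failure */
    if (order <= PAGE_ALLOC_COSTLY_ORDER)
        return true;    /* low order: keep looping until memory appears */
    if (flags & MODEL_GFP_NOFAIL)
        return true;    /* caller cannot handle failure */
    return false;       /* costly order: returning NULL is allowed */
}
```

Under this model a plain request whose order is at most 3 never reaches
the "return NULL" branch, so a fallback placed after it is dead code
unless the caller passes something like a no-retry flag.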

>
> >> if (ptr != NULL)
> >> return ptr;
> >>
> >> return vmalloc(size);
> >
> > On x86, the vmalloc area is only 128MB of address space. It is a much
> > scarcer resource than physical RAM. A vmalloc fallback is not a good idea.
>
> These functions are a replacement for explicit
> if (!(x = kmalloc()))
> x = vmalloc();
> ...
> if (is_vmalloc(x))
> vfree(x);
> else
> kfree(x);
> in the code (like fdtable does this).
>
> The 128M limit on x86_32 for vmalloc is configurable so if drivers in
> sum need more on some specific hardware, it can be increased on the
> command line (I had to do this on one machine in the past).

Right, but 99% of end users don't do this. I don't think this is effective advice.


> Anyway as this is a replacement for explicit tests, it shouldn't change
> the behaviour in any way. Obviously when a user doesn't need virtually
> contiguous space, he shouldn't use this interface at all.

Why can't we make fdtable free of the need for virtually contiguous memory?
Anyway, alloc_fdmem() also doesn't work as its author expected.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jiri Slaby on
On 05/13/2010 11:05 AM, KOSAKI Motohiro wrote:
>>>> void *kvmalloc(size_t size)
>>>> {
>>>> void *ptr;
>>>>
>>>> if (size < PAGE_SIZE)
>>>> return kmalloc(PAGE_SIZE, GFP_KERNEL);
>>>> ptr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
>>>
>>> Low-order GFP_KERNEL allocations never fail, so this doesn't work
>>> as you expected.
>>
>> Hi, I suppose you mean the kmalloc allocation -- so kmalloc should fail
>> iff alloc_pages_exact (unless somebody frees a heap of memory indeed)?
>
> I mean that if the size passed to alloc_pages_exact() is less than 8 pages,
> alloc_pages_exact() never fails. See __alloc_pages_slowpath().

Sorry, I don't see what's the problem with that. I can see only that
alloc_pages_exact is superfluous there as kmalloc "won't fail" earlier.

>>>> if (ptr != NULL)
>>>> return ptr;
>>>>
>>>> return vmalloc(size);
>>>
>>> On x86, the vmalloc area is only 128MB of address space. It is a much
>>> scarcer resource than physical RAM. A vmalloc fallback is not a good idea.
>>
>> These functions are a replacement for explicit
>> if (!(x = kmalloc()))
>> x = vmalloc();
>> ...
>> if (is_vmalloc(x))
>> vfree(x);
>> else
>> kfree(x);
>> in the code (like fdtable does this).
>>
>> The 128M limit on x86_32 for vmalloc is configurable so if drivers in
>> sum need more on some specific hardware, it can be increased on the
>> command line (I had to do this on one machine in the past).
>
> Right, but 99% of end users don't do this. I don't think this is effective advice.

Indeed. I didn't mean that users should change that. They should do so
only if there is some weird hardware with weird drivers.

>> Anyway as this is a replacement for explicit tests, it shouldn't change
>> the behaviour in any way. Obviously when a user doesn't need virtually
>> contiguous space, he shouldn't use this interface at all.
>
> Why can't we make fdtable free of the need for virtually contiguous memory?

This is possible, but the question is why to make the code more complex?

> Anyway, alloc_fdmem() also doesn't work as its author expected.

Pardon my ignorance, why? (There are more similar users:
init_section_page_cgroup, sys_add_key, ext4_fill_flex_info and many others.)

--
js
suse labs
From: KOSAKI Motohiro on
> On 05/13/2010 11:05 AM, KOSAKI Motohiro wrote:
> >>>> void *kvmalloc(size_t size)
> >>>> {
> >>>> void *ptr;
> >>>>
> >>>> if (size < PAGE_SIZE)
> >>>> return kmalloc(PAGE_SIZE, GFP_KERNEL);
> >>>> ptr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
> >>>
> >>> Low-order GFP_KERNEL allocations never fail, so this doesn't work
> >>> as you expected.
> >>
> >> Hi, I suppose you mean the kmalloc allocation -- so kmalloc should fail
> >> iff alloc_pages_exact (unless somebody frees a heap of memory indeed)?
> >
> > I mean that if the size passed to alloc_pages_exact() is less than 8 pages,
> > alloc_pages_exact() never fails. See __alloc_pages_slowpath().
>
> Sorry, I don't see what's the problem with that. I can see only that
> alloc_pages_exact is superfluous there as kmalloc "won't fail" earlier.

I'm not talking about kmalloc; it's OK for that to never fail. But a
low-order alloc_pages_exact() never fails either.
Is this OK? Why?


> >>>> if (ptr != NULL)
> >>>> return ptr;
> >>>>
> >>>> return vmalloc(size);
> >>>
> >>> On x86, the vmalloc area is only 128MB of address space. It is a much
> >>> scarcer resource than physical RAM. A vmalloc fallback is not a good idea.
> >>
> >> These functions are a replacement for explicit
> >> if (!(x = kmalloc()))
> >> x = vmalloc();
> >> ...
> >> if (is_vmalloc(x))
> >> vfree(x);
> >> else
> >> kfree(x);
> >> in the code (like fdtable does this).
> >>
> >> The 128M limit on x86_32 for vmalloc is configurable so if drivers in
> >> sum need more on some specific hardware, it can be increased on the
> >> command line (I had to do this on one machine in the past).
> >
> > Right, but 99% of end users don't do this. I don't think this is effective advice.
>
> Indeed. I didn't mean that users should change that. They should do so
> only if there is some weird hardware with weird drivers.
>
> >> Anyway as this is a replacement for explicit tests, it shouldn't change
> >> the behaviour in any way. Obviously when a user doesn't need virtually
> >> contiguous space, he shouldn't use this interface at all.
> >
> > Why can't we make fdtable free of the need for virtually contiguous memory?
>
> This is possible, but the question is why to make the code more complex?

Because it's broken. Or am I missing something?


> > Anyway, alloc_fdmem() also doesn't work as its author expected.
>
> Pardon my ignorance, why? (There are more similar users:
> init_section_page_cgroup, sys_add_key, ext4_fill_flex_info and many others.)

I think init_section_page_cgroup is OK. It's called at boot time, so we
don't enter endless page reclaim.

But for the other cases, I don't know the reason. I guess they also rely
on specific assumptions. I only said that, generically, it isn't right.



From: Jiri Slaby on
On 05/13/2010 11:40 AM, KOSAKI Motohiro wrote:
>>>> Anyway as this is a replacement for explicit tests, it shouldn't change
>>>> the behaviour in any way. Obviously when a user doesn't need virtually
>>>> contiguous space, he shouldn't use this interface at all.
>>>
>>> Why can't we make fdtable free of the need for virtually contiguous memory?
>>
>> This is possible, but the question is why to make the code more complex?
>
> because it's broken.

Well, could you explain what exactly is broken about
x = kmalloc(size, GFP_KERNEL);
if (!x)
x = vmalloc(size);
? Is it that kmalloc doesn't return until it has the memory to return
when asking for order(size) <= COSTLY_ORDER? I think this is expected.
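
For reference, the fallback-and-free pairing under discussion can be
modelled in user space roughly as follows. This is a sketch under stated
assumptions, not kernel code: every model_* name and both MODEL_* limits
are hypothetical, malloc() stands in for kmalloc(), and a static bump
arena stands in for the vmalloc area so that the free side can decide by
address range, in the spirit of is_vmalloc_addr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical size above which the "contiguous" allocator fails here. */
#define MODEL_CONTIG_LIMIT 4096u
/* Hypothetical size of the arena standing in for the vmalloc range. */
#define MODEL_ARENA_SIZE (64u * 1024u)

static _Alignas(max_align_t) unsigned char model_arena[MODEL_ARENA_SIZE];
static size_t model_arena_used;

/* Stand-in for kmalloc(): succeeds only for small requests. */
static void *model_contig_alloc(size_t size)
{
    if (size == 0 || size > MODEL_CONTIG_LIMIT)
        return NULL;
    return malloc(size);
}

/* Stand-in for vmalloc(): bump allocation from the arena, never reused. */
static void *model_arena_alloc(size_t size)
{
    size_t aligned;
    void *p;

    if (size == 0 || size > MODEL_ARENA_SIZE)
        return NULL;
    aligned = (size + 15u) & ~(size_t)15u;
    assert(model_arena_used <= MODEL_ARENA_SIZE);
    if (aligned > MODEL_ARENA_SIZE - model_arena_used)
        return NULL;
    p = model_arena + model_arena_used;
    model_arena_used += aligned;
    return p;
}

/* Stand-in for is_vmalloc_addr(): does the pointer lie inside the arena? */
static int model_is_arena_addr(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)model_arena;

    return a >= lo && a < lo + MODEL_ARENA_SIZE;
}

/* Try the contiguous allocator first, then fall back to the arena. */
static void *model_kvmalloc(size_t size)
{
    void *p = model_contig_alloc(size);

    if (p == NULL)
        p = model_arena_alloc(size);
    return p;
}

/* Pick the matching release path from the address alone. */
static void model_kvfree(void *p)
{
    if (p == NULL || model_is_arena_addr(p))
        return;     /* bump arena: nothing to release in this model */
    free(p);
}
```

The model makes the first allocator fail on purpose for large sizes. If
it could never fail for the sizes actually requested, the arena path
would simply never run.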

thanks,
--
js
suse labs
From: KOSAKI Motohiro on
> On 05/13/2010 11:40 AM, KOSAKI Motohiro wrote:
> >>>> Anyway as this is a replacement for explicit tests, it shouldn't change
> >>>> the behaviour in any way. Obviously when a user doesn't need virtually
> >>>> contiguous space, he shouldn't use this interface at all.
> >>>
> >>> Why can't we make fdtable free of the need for virtually contiguous memory?
> >>
> >> This is possible, but the question is why to make the code more complex?
> >
> > because it's broken.
>
> Well, could you explain what exactly is broken about
> x = kmalloc(size, GFP_KERNEL);
> if (!x)
> x = vmalloc(size);
> ? Is it that kmalloc doesn't return until it has the memory to return
> when asking for order(size) <= COSTLY_ORDER? I think this is expected.

Well, but fdtable doesn't really need contiguous memory, does it?
Adding an API means we recommend using it, and I don't want to spread this
wrong habit. Killing it seems better instead.
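
One way a pointer table could avoid needing a single large contiguous
block is to split it into page-sized chunks, sketched below in user
space. This is only an illustration and not how fdtable is implemented:
struct model_table, the model_* functions and MODEL_CHUNK_BYTES are all
hypothetical, and calloc() stands in for a single-page allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk size, chosen to match a typical 4KB page. */
#define MODEL_CHUNK_BYTES 4096u
#define MODEL_SLOTS_PER_CHUNK (MODEL_CHUNK_BYTES / sizeof(void *))

struct model_table {
    size_t nr_chunks;
    void ***chunks;     /* chunks[i] points to one page-sized slot array */
};

/* Release every chunk, then the chunk index, then the table itself. */
static void model_table_free(struct model_table *t)
{
    size_t i;

    if (t == NULL)
        return;
    for (i = 0; i < t->nr_chunks; i++)
        free(t->chunks[i]);
    free(t->chunks);
    free(t);
}

/* Build a table of nr_slots pointers out of page-sized chunks only. */
static struct model_table *model_table_alloc(size_t nr_slots)
{
    struct model_table *t = calloc(1, sizeof(*t));
    size_t i;

    if (t == NULL)
        return NULL;
    t->nr_chunks = (nr_slots + MODEL_SLOTS_PER_CHUNK - 1) /
                   MODEL_SLOTS_PER_CHUNK;
    t->chunks = calloc(t->nr_chunks ? t->nr_chunks : 1, sizeof(*t->chunks));
    if (t->chunks == NULL) {
        free(t);
        return NULL;
    }
    for (i = 0; i < t->nr_chunks; i++) {
        t->chunks[i] = calloc(MODEL_SLOTS_PER_CHUNK, sizeof(void *));
        if (t->chunks[i] == NULL) {
            model_table_free(t);    /* unfilled entries are still NULL */
            return NULL;
        }
    }
    return t;
}

/* Two-level lookup: pick the chunk first, then the slot inside it. */
static void **model_table_slot(struct model_table *t, size_t idx)
{
    size_t chunk = idx / MODEL_SLOTS_PER_CHUNK;

    assert(chunk < t->nr_chunks);
    return &t->chunks[chunk][idx % MODEL_SLOTS_PER_CHUNK];
}
```

The trade-off is an extra indirection on every lookup, and callers can
no longer treat the table as one flat array, which is where the added
complexity comes from.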


