From: Minchan Kim on
Hi, Lee.

On Fri, 2010-04-16 at 15:13 -0400, Lee Schermerhorn wrote:
> On Fri, 2010-04-16 at 11:07 -0500, Christoph Lameter wrote:
> > On Thu, 15 Apr 2010, Minchan Kim wrote:
> >
> > > I don't want to remove alloc_pages for UMA system.
> >
> > alloc_pages is the same as alloc_pages_any_node so why have it?
> >
> > > #define alloc_pages alloc_page_sexact_node
> > >
> > > What I want to remove is just alloc_pages_node. :)
> >
> > Why remove it? If you want to get rid of -1 handling then check all the
> > callsites and make sure that they are not using -1.
> >
> > Also could you define a constant for -1? -1 may have various meanings. One
> > is the local node and the other is any node.
>
> NUMA_NO_NODE is #defined as (-1) and can be used for this purpose. '-1'
> has been replaced by this in many cases. It can be interpreted as "No
> node specified" == "any node is acceptable". But, it also has multiple
> meanings. E.g., in the hugetlb sysfs attribute and sysctl functions it
> indicates the global hstates [all nodes] vs a per node hstate. So, I
> suppose one could define a NUMA_ANY_NODE, to make the intention clear at
> the call site.
>
> I believe that all usage of -1 to mean the local node has been removed,
> unless I missed one. Local allocation is now indicated by a mempolicy
> mode flag--MPOL_F_LOCAL. It's treated as a special case of
> MPOL_PREFERRED.

Thanks for good information. :)

--
Kind regards,
Minchan Kim


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tejun Heo on
On 04/19/2010 12:54 AM, Minchan Kim wrote:
>> alloc_pages is the same as alloc_pages_any_node so why have it?
>
> I don't want to force using '_node' postfix on UMA users.
> Maybe they don't care getting page from any node and event don't need to
> know about _NODE_.

Yeah, then, remove alloc_pages_any_node(). I can't really see the
point of any_/exact_node. alloc_pages() and alloc_pages_node() are
fine and in line with other functions. Why change it?

>> Why remove it? If you want to get rid of -1 handling then check all the
>
> alloc_pages_node have multiple meaning as you said. So some of users
> misuses that API. I want to clear intention of user.

The name is fine. Just clean up the users and make the intended usage
clear in documentation and implementation (ie. trigger a big fat
warning) and make all the callers use named constants instead of -1
for special meanings.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Minchan Kim on
Hi, Tejun.

On Mon, Apr 19, 2010 at 6:22 AM, Tejun Heo <tj(a)kernel.org> wrote:
> On 04/19/2010 12:54 AM, Minchan Kim wrote:
>>> alloc_pages is the same as alloc_pages_any_node so why have it?
>>
>> I don't want to force using '_node' postfix on UMA users.
>> Maybe they don't care getting page from any node and event don't need to
>> know about _NODE_.
>
> Yeah, then, remove alloc_pages_any_node().  I can't really see the
> point of any_/exact_node.  alloc_pages() and alloc_pages_node() are
> fine and in line with other functions.  Why change it?
>
>>> Why remove it? If you want to get rid of -1 handling then check all the
>>
>> alloc_pages_node have multiple meaning as you said. So some of users
>> misuses that API. I want to clear intention of user.
>
> The name is fine.  Just clean up the users and make the intended usage
> clear in documentation and implementation (ie. trigger a big fat
> warning) and make all the callers use named constants instead of -1
> for special meanings.
>
> Thanks.

Let's tidy my table.

I made quick patch to show the concept with one example of pci-dma.
(Sorry but I attach patch since web gmail's mangling.)

On UMA, we can change alloc_pages with
alloc_pages_exact_node(numa_node_id(),....)
(Actually, the patch is already merged mmotm)

on NUMA, alloc_pages is some different meaning, so I don't want to change it.
on NUMA, alloc_pages_node means _ANY_NODE_.
So let's remove nid argument and change naming with alloc_pages_any_node.

Then, whole users of alloc_pages_node can be changed between
alloc_pages_exact_node and alloc_pages_any_node.

It was my intention. What's your concern?

Thanks for your interest, Tejun. :)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index a4ac764..dc511cb 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -152,12 +152,21 @@ void *dma_generic_alloc_coherent(struct device
*dev, size_t size,
unsigned long dma_mask;
struct page *page;
dma_addr_t addr;
+ int nid;

dma_mask = dma_alloc_coherent_mask(dev, flag);

flag |= __GFP_ZERO;
again:
- page = alloc_pages_node(dev_to_node(dev), flag, get_order(size));
+ nid = dev_to_node(dev);
+ /*
+ * If pci-dma maintainer makes sure nid never has NUMA_NO_NODE
+ * we can remove this ugly checking.
+ */
+ if (nid == NUMA_NO_NODE)
+ page = alloc_pages_any_node(flag, get_order(size));
+ else
+ page = alloc_pages_exact_node(nid, flag, get_order(size));
if (!page)
return NULL;

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4c6d413..47fba21 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -278,13 +278,10 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order,
return __alloc_pages_nodemask(gfp_mask, order, zonelist, NULL);
}

-static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
+static inline struct page *alloc_pagse_any_node(gfp_t gfp_mask,
unsigned int order)
{
- /* Unknown node is current node */
- if (nid < 0)
- nid = numa_node_id();
-
+ int nid = numa_node_id();
return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
}

@@ -308,7 +305,7 @@ extern struct page *alloc_page_vma(gfp_t gfp_mask,
struct vm_area_struct *vma, unsigned long addr);
#else
#define alloc_pages(gfp_mask, order) \
- alloc_pages_node(numa_node_id(), gfp_mask, order)
+ alloc_pages_exact_node(numa_node_id(), gfp_mask, order)
#define alloc_page_vma(gfp_mask, vma, addr) alloc_pages(gfp_mask, 0)
#endif
#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
~

--
Kind regards,
Minchan Kim
From: Mel Gorman on
On Tue, Apr 20, 2010 at 07:27:09AM +0900, Tejun Heo wrote:
> Hello, Christoph.
>
> On 04/20/2010 02:38 AM, Christoph Lameter wrote:
> > alloc_pages_exact_node results in more confusion because it does suggest
> > that fallback to other nodes is not allowed.
>
> I can't see why alloc_pages_exact_node() exists at all. It's in the
> mainline and if you look at the difference between alloc_pages_node()
> and alloc_pages_exact_node(), it's almost silly. :-(
>

alloc_pages_exact_node() avoids a branch in a hot path that is checking for
something the caller already knows. That's the reason it exists.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Lameter on
On Mon, 19 Apr 2010, Minchan Kim wrote:

> My concern is following as.
>
> alloc_pages_node means any node but it has nid argument.
> Why should user of alloc_pages who want to get page from any node pass
> nid argument? It's rather awkward.

Its not awkward but an optimization. The page can be placed on any node
but the user would prefer a certain node. Most of the NUMA things are
there for optimization purposes and not for correctness. If you must have
an allocation on certain nodes for correctness (like SLAB) then
GFP_THISNODE is used.

> Some of user misunderstood it and used alloc_pages_node instead of
> alloc_pages_exact_node although he already know exact _NID_.
> Of course, it's not a BUG since if nid >= 0 it works well.
>
> But I want to remove such multiple meaning to clear intention of user.

Its not clear to me that this renaming etc helps. You
must use GFP_THISNODE if allocation must occur from a certain node.
alloc_pages_exact_node results in more confusion because it does suggest
that fallback to other nodes is not allowed.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/