From: Andrew Morton on
On Wed, 11 Aug 2010 22:13:45 +0200
Helge Deller <deller(a)gmx.de> wrote:

> The kernel currently provides no functionality to analyze the RSS
> and swap space usage of each individual sysvipc shared memory segment.
>
> This patch adds this info for each existing shm segment by extending
> the output of /proc/sysvipc/shm by two columns for RSS and swap.
>
> Since shmctl(SHM_INFO) already provides a similar calculation (it
> currently sums up all RSS/swap info for all segments), I split
> out a static function which is now used by the /proc/sysvipc/shm
> output and shmctl(SHM_INFO).
>

I suppose that could be useful, although it would be most interesting
to hear why _you_ consider it useful?

But is it useful enough to risk breaking existing code which parses
that file? The risk is not great, but it's there.

>
> ---
>
> shm.c | 63 ++++++++++++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 42 insertions(+), 21 deletions(-)
>
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -108,7 +108,11 @@ void __init shm_init (void)
> {
> shm_init_ns(&init_ipc_ns);
> ipc_init_proc_interface("sysvipc/shm",
> - " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime\n",
> +#if BITS_PER_LONG <= 32
> + " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime RSS swap\n",
> +#else
> + " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime RSS swap\n",

This adds 11 new spaces between "perms" and "size", only on 64-bit
machines. That was unchangelogged and adds another (smaller) risk of
breaking things. Please explain.

This interface is really old and crufty and horrid, but I guess that
there's not a lot we can do about that :(

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andrew Morton on
On Thu, 12 Aug 2010 23:33:29 +0200
Helge Deller <deller(a)gmx.de> wrote:

> On 08/12/2010 10:10 PM, Andrew Morton wrote:
> > On Wed, 11 Aug 2010 22:13:45 +0200
> > Helge Deller <deller(a)gmx.de> wrote:
> >
> >> The kernel currently provides no functionality to analyze the RSS
> >> and swap space usage of each individual sysvipc shared memory segment.
> >>
> >> This patch adds this info for each existing shm segment by extending
> >> the output of /proc/sysvipc/shm by two columns for RSS and swap.
> >>
> >> Since shmctl(SHM_INFO) already provides a similar calculation (it
> >> currently sums up all RSS/swap info for all segments), I split
> >> out a static function which is now used by the /proc/sysvipc/shm
> >> output and shmctl(SHM_INFO).
> >>
> >
> > I suppose that could be useful, although it would be most interesting
> > to hear why _you_ consider it useful?
>
> A reasonable question, and I really should have explained when I
> sent this patch.
>
> In my job I work for SAP in the SAP LinuxLab
> (http://www.sap.com/linux) and take care of the SAP ERP enterprise
> software on Linux.
> SAP products (esp. the SAP Netweaver ABAP Kernel) use lots of big
> shared memory segments (we often have Linux systems with >= 16GB shm
> usage). Sometimes we get customer reports about "slow" system responses
> and while looking into their configurations we often find massive
> swapping activity on the system. With this patch it's now easy to see
> from the command line whether and which shm segments get swapped out
> (and how much), and we can more easily give recommendations for system
> tuning. Without the patch it's currently not possible to do such shm
> analysis at all.

OK, thanks. copied-n-pasted into changelog ;)

> So, my patch actually does fix a real-world problem.
>
> By the way - I found another bug/issue in /proc/<pid>/smaps as well. The
> kernel currently does not add swapped-out shm pages to the swap size
> value correctly. The swap size value always stays zero for shm pages.
> I'm currently preparing a small patch to fix that, which I will send to
> linux-mm for review soon.
>
> > But is it useful enough to risk breaking existing code which parses
> > that file? The risk is not great, but it's there.
>
> Sure. The only positive argument is maybe that I added the new info to
> the end of the lines. IMHO existing applications which parse /proc files
> should always take into account that more text could follow with newer
> Linux kernels...?

Yeah, they'd be pretty dumb if they failed because new columns appear
in later kernels.

But there's some pretty dumb code out there.
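
The parsing concern can be made concrete. What follows is only a sketch of
a parser that tolerates trailing columns, not code from the patch:
struct shm_row and parse_shm_row() are hypothetical names, and the column
order is assumed from the header string quoted in this thread. It reads the
leading fields, picks up the two new columns when they are present, and
ignores anything that follows them.

```c
#include <stdio.h>

/* Hypothetical container for the fields this sketch cares about. */
struct shm_row {
    int key;
    int shmid;
    unsigned int perms;     /* printed in octal by the kernel */
    unsigned long size;
    unsigned long rss;      /* valid only if has_rss_swap is set */
    unsigned long swap;     /* valid only if has_rss_swap is set */
    int has_rss_swap;
};

/*
 * Parse one data line of /proc/sysvipc/shm.
 * Assumed column order: key shmid perms size cpid lpid nattch uid gid
 * cuid cgid atime dtime ctime [rss swap] [anything newer...].
 * Returns 0 on success, -1 if the line is not a data line (for example
 * the header).
 */
int parse_shm_row(const char *line, struct shm_row *row)
{
    /* The ten "%*u" conversions skip cpid through ctime without storing. */
    int n = sscanf(line,
                   "%d %d %o %lu"
                   " %*u %*u %*u %*u %*u %*u %*u %*u %*u %*u"
                   " %lu %lu",
                   &row->key, &row->shmid, &row->perms, &row->size,
                   &row->rss, &row->swap);

    if (n < 4)
        return -1;              /* mandatory leading columns missing */

    /* Older kernels stop after ctime; only trust rss/swap if both parsed. */
    row->has_rss_swap = (n == 6);
    if (!row->has_rss_swap) {
        row->rss = 0;
        row->swap = 0;
    }
    return 0;                   /* extra trailing columns are never read */
}
```

A parser written this way does not care whether the header gains spaces or
the lines gain columns, since sscanf() skips any run of whitespace between
fields and stops after the last conversion it was asked for.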

> > This adds 11 new spaces between "perms" and "size", only on 64-bit
> > machines. That was unchangelogged and adds another (smaller) risk of
> > breaking things. Please explain.
>
> Yes, I added some spaces in front of the "size" field for 64-bit
> kernels to get the columns correct if you cat the contents of the file.
> In sysvipc_shm_proc_show() the kernel prints the size value in
> "SIZE_SPEC" format, which is defined like this:
>
> #if BITS_PER_LONG <= 32
> #define SIZE_SPEC "%10lu"
> #else
> #define SIZE_SPEC "%21lu"
> #endif
>
> So, if the header is not adjusted, the columns are not correctly
> aligned. I actually tested this on 32- and 64-bit and it seems correct now.

<copy, paste>
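
For illustration, the alignment argument can be reproduced in userspace.
This is a simplified sketch and not the kernel code: it assumes ULONG_MAX
as a stand-in for the kernel's BITS_PER_LONG test, it prints only the perms
and size columns, and format_header(), format_row() and SIZE_WIDTH are
hypothetical names introduced here.

```c
#include <limits.h>
#include <stdio.h>

/* Userspace approximation of the kernel's BITS_PER_LONG <= 32 check. */
#if ULONG_MAX <= 0xffffffffUL
#define SIZE_SPEC  "%10lu"
#define SIZE_WIDTH 10
#else
#define SIZE_SPEC  "%21lu"
#define SIZE_WIDTH 21
#endif

/* Header: pad the "size" label to the same width SIZE_SPEC gives values. */
int format_header(char *buf, size_t len)
{
    return snprintf(buf, len, "%5s %*s", "perms", SIZE_WIDTH, "size");
}

/* Data row: perms in octal, size right-aligned by SIZE_SPEC. */
int format_row(char *buf, size_t len, unsigned int perms, unsigned long size)
{
    return snprintf(buf, len, "%5o " SIZE_SPEC, perms, size);
}
```

Because both strings pad the size column to SIZE_WIDTH, the label and the
value end in the same character position on 32-bit and on 64-bit builds.
With a fixed 10-character header and a 21-character value, the columns to
the right of "size" would shift by 11 positions on 64-bit.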


From: Hugh Dickins on
On Thu, 12 Aug 2010, Helge Deller wrote:
> On 08/12/2010 10:10 PM, Andrew Morton wrote:
> > On Wed, 11 Aug 2010 22:13:45 +0200
> > Helge Deller <deller(a)gmx.de> wrote:
> >
> > > The kernel currently provides no functionality to analyze the RSS
> > > and swap space usage of each individual sysvipc shared memory segment.
> > >
> > > This patch adds this info for each existing shm segment by extending
> > > the output of /proc/sysvipc/shm by two columns for RSS and swap.
> > >
> > > Since shmctl(SHM_INFO) already provides a similar calculation (it
> > > currently sums up all RSS/swap info for all segments), I split
> > > out a static function which is now used by the /proc/sysvipc/shm
> > > output and shmctl(SHM_INFO).
> > >
> >
> > I suppose that could be useful, although it would be most interesting
> > to hear why _you_ consider it useful?
>
> A reasonable question, and I really should have explained when I sent this
> patch.
>
> In my job I work for SAP in the SAP LinuxLab (http://www.sap.com/linux) and
> take care of the SAP ERP enterprise software on Linux.
> SAP products (esp. the SAP Netweaver ABAP Kernel) use lots of big shared
> memory segments (we often have Linux systems with >= 16GB shm usage).
> Sometimes we get customer reports about "slow" system responses and while
> looking into their configurations we often find massive swapping activity on
> the system. With this patch it's now easy to see from the command line whether
> and which shm segments get swapped out (and how much), and we can more easily
> give recommendations for system tuning.
> Without the patch it's currently not possible to do such shm analysis at all.
>
> So, my patch actually does fix a real-world problem.

That's good justification, thanks.

>
> By the way - I found another bug/issue in /proc/<pid>/smaps as well. The
> kernel currently does not add swapped-out shm pages to the swap size value
> correctly. The swap size value always stays zero for shm pages. I'm currently
> preparing a small patch to fix that, which I will send to linux-mm for review
> soon.

I certainly wouldn't call smaps's present behaviour on it a bug: but given
your justification above, I can see that it would be more useful to you,
and probably to others, for it to be changed in the way that you suggest,
to reveal the underlying swap.

Hmm, I wonder what that patch is going to look like...

>
> > But is it useful enough to risk breaking existing code which parses
> > that file? The risk is not great, but it's there.
>
> Sure. The only positive argument is maybe that I added the new info to the
> end of the lines. IMHO existing applications which parse /proc files should
> always take into account that more text could follow with newer Linux
> kernels...?

I hope so too. And agree you're right to correct the 64-bit header
alignment, and to show the new fields in bytes rather than pages.
But one little thing in your patch upsets me greatly...

>
> > > ---
> > >
> > > shm.c | 63 ++++++++++++++++++++++++++++++++++++++++++---------------------
> > > 1 file changed, 42 insertions(+), 21 deletions(-)
> > >
> > >
> > > diff --git a/ipc/shm.c b/ipc/shm.c
> > > --- a/ipc/shm.c
> > > +++ b/ipc/shm.c
> > > @@ -108,7 +108,11 @@ void __init shm_init (void)
> > > {
> > > shm_init_ns(&init_ipc_ns);
> > > ipc_init_proc_interface("sysvipc/shm",
> > > - " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime\n",
> > > +#if BITS_PER_LONG <= 32
> > > + " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime RSS swap\n",
> > > +#else
> > > + " key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime RSS swap\n",

... why oh why do you write "RSS" in uppercase, when every other field
is named in lowercase? Please change that to "rss" and then

Acked-by: Hugh Dickins <hughd(a)google.com>