SOLVED : Re: Understanding multi-core CPU usage

theBlueSage tbs-Gb/NUjX2UK8 at public.gmane.org
Wed May 5 12:54:19 EDT 2010


On Tue, 2010-05-04 at 14:22 -0400, Dan Ritter wrote:

> On Tue, May 04, 2010 at 02:12:36PM -0400, theBlueSage wrote:
> > Hi Folks,
> > 
> > summary of aaagh!
> > --------------------
> > I was wondering if there is a tool out there that can tell me, on a
> > 64bit multi-core server, which exact process is hammering a given CPU
> > core ?
> 
> top with option field j shows last processor associated.
> 
> -dsr-
> 


Thanks Dan, but I can't seem to add the option 'j', either as an
addition to the command line or as a key in interactive mode while top
is running :( However, realizing from your response that it _must_ be
possible, I went on a little journey and came up with these lovely
little tools, and thought I should post back in case anyone else runs
into this issue.


[root@server ~]# ps -A -o user,pid,pcpu,s,stime,pmem,comm,psr,pset | grep 'process name'

 -- will show you which cores are running the given process 'process
name' and their usage stats.  Example output when searching for process
name 'ndbd' :

root     26805  0.0 S Apr05  0.0 ndbd              1    -
root     26806 73.3 R Apr05 77.5 ndbd             15    -

where 1 and 15 (the psr column) are the cores being used by the process
'ndbd'.
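
For anyone who wants to start from the core number rather than the
process name, here is a minimal sketch of the same idea. The function
name on_core is made up for illustration, and it assumes the procps
version of ps on Linux with psr as the second column of the output. It
is a filter, so it only reads whatever ps prints:

```shell
# on_core CORE
# Reads ps output on stdin and keeps the header line plus every row
# whose second column (psr) equals CORE.
on_core() {
    awk -v core="$1" 'NR == 1 || $2 == core'
}

# Intended use (not run here): list what is on core 11, busiest first.
#   ps -eo pid,psr,pcpu,comm --sort=-pcpu | on_core 11
```

Keep in mind that psr is a snapshot: the scheduler may move a process to
another core between two runs, so it is worth sampling more than once.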

I also found 
[root@server ~]# mpstat -P ALL 1 1

 -- which showed a breakdown for ALL cores, every 1 second, just once.
Example output looks like this :

Average:     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
Average:     all    5.11    0.00    0.37    3.80    0.06    0.31    0.00   90.34  13834.00
Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1002.00
Average:       1    0.00    0.00    0.00   43.00    0.00    0.00    0.00   57.00    189.00
Average:       2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       4    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       5    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       6    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       7    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      5.00
Average:       8    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       9    0.00    0.00    0.00   19.00    0.00    0.00    0.00   81.00      4.00
Average:      10    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      11   78.22    0.00    4.95    0.00    0.99    3.96    0.00   11.88   8106.00
Average:      12    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      13    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   4342.00
Average:      14    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      15    3.00    0.00    0.00    0.00    0.00    0.00    0.00   97.00    186.00

As you can see from this, my CPU core 11 is a little overworked compared
to the other cores :)
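
With 16 cores it is easy to miss the busy one by eye. As a rough sketch,
the core with the lowest %idle can be picked out of the mpstat summary
with awk. The function name busiest_core is hypothetical, and the parsing
assumes output shaped like the above: rows starting with "Average:" and
a header row that contains a "%idle" column. Other sysstat versions or
locales may print things differently, so treat it as a starting point:

```shell
# busiest_core
# Reads mpstat output on stdin and prints the number of the core with
# the lowest %idle value in the "Average:" section.
busiest_core() {
    awk '
        $1 != "Average:" { next }            # only look at the summary rows
        $2 == "CPU" {                        # header row: find the %idle column
            for (i = 1; i <= NF; i++) if ($i == "%idle") idle = i
            next
        }
        $2 == "all" || !idle { next }        # skip the all-cores aggregate
        !seen || $idle + 0 < min {           # remember the least idle core so far
            min = $idle + 0; core = $2; seen = 1
        }
        END { if (seen) print core }
    '
}

# Intended use (not run here):
#   mpstat -P ALL 1 1 | busiest_core
```

Looking up the %idle column by name rather than by position is deliberate,
since the set of columns is not the same in every mpstat release.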

To check whether it really was the 'ndbd' process that was killing my
box, I ran this as a rough real-time monitor:

[root@server ~]# for i in 1 2 3 4 5 6 7 8 9 10; do mpstat -P ALL 1 1; ps -A -o user,pid,pcpu,s,stime,pmem,comm,psr,pset | grep ndbd; echo; done

I was able to see that the CPU core being slammed was always the one
being used by the 'ndbd' process.

Of course, solving the problem of why this happens is a different task
entirely, but with proof I can move forward.



Richard






