I am trying to compute a formula based on prices observed at different times, P(t) and P(t+5). More specifically, P(t+5) denotes the first price observed at least 5 minutes after the price which is measured.
The following code is used to create the variable that represents P(t+5).
data WANT;
set HAVE nobs=nobs;
do _i = _n_ to nobs until(other_date > date_l_);
set HAVE(
rename=( _ric=other_ric
date_l_= other_date
price = other_price
new_time = other_time)
keep=_ric date_l_ price int1min new_time)
point=_i;
if other_ric=_ric and new_time > new_time+300 and other_date = date_l_ then do;
new_price = other_price;
leave;
end;
end;
drop other_: ;
run;
However, the code does not work correctly in all cases. In the sample below, new_price is correct for some rows but incorrect for others. Could anyone help me solve this problem?
The following is a sample of data.
_RIC Date_L_ Time_L_ Price new_price new_time time
BAG201310900.U 20130715 9:36:19.721 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:19.721 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:22.751 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:22.751 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:24.400 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:24.400 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:28.150 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:28.150 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:45.099 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:45.099 0.27 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:48.929 0.28 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:48.929 0.28 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:49.929 0.28 0.29 9:36 9:41
BAG201310900.U 20130715 9:36:50.899 0.28 0.29 9:36 9:41
BAG201310900.U 20130715 9:37:04.839 0.27 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:04.839 0.27 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:04.848 0.27 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:07.619 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:11.619 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:11.619 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:11.619 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:12.738 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:15.528 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:30.337 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:32.717 0.28 0.29 9:37 9:42
BAG201310900.U 20130715 9:37:58.636 0.29 0.29 9:37 9:42
BAG201310900.U 20130715 9:38:04.016 0.28 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:07.326 0.28 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:07.849 0.28 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:16.005 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:18.055 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:18.055 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:18.055 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:20.025 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:21.235 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:38:25.585 0.3 0.29 9:38 9:43
BAG201310900.U 20130715 9:40:01.475 0.29 0.22 9:40 9:45
BAG201310900.U 20130715 9:45:04.335 0.22 0.27 9:45 9:50
BAG201310900.U 20130715 9:45:04.335 0.22 0.27 9:45 9:50
BAG201310900.U 20130715 9:45:04.335 0.22 0.27 9:45 9:50
BAG201310900.U 20130715 9:45:35.966 0.24 0.27 9:45 9:50
BAG201310900.U 20130715 9:51:13.808 0.27 0.19 9:51 9:56
BAG201310900.U 20130715 9:52:41.409 0.27 0.19 9:52 9:57
BAG201310900.U 20130715 9:53:32.730 0.28 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:33.250 0.29 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:53:36.580 0.26 0.19 9:53 9:58
BAG201310900.U 20130715 9:54:00.601 0.25 0.19 9:54 9:59
BAG201310900.U 20130715 9:54:24.842 0.24 0.19 9:54 9:59
BAG201310900.U 20130715 9:57:42.068 0.19 0.24 9:57 10:02
BAG201310900.U 20130715 9:57:42.068 0.19 0.24 9:57 10:02
BAG201310900.U 20130715 9:57:42.068 0.19 0.24 9:57 10:02
BAG201310900.U 20130715 10:02:36.960 0.24 0.26 10:02 10:07
BAG201310900.U 20130715 10:06:46.735 0.26 0.24 10:06 10:11
BAG201310900.U 20130715 10:08:28.588 0.23 0.24 10:08 10:13
BAG201310900.U 20130715 10:09:13.008 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:13.008 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:13.008 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:13.008 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:13.008 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:13.018 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:22.508 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:22.508 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:22.528 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:09:34.628 0.24 0.24 10:09 10:14
BAG201310900.U 20130715 10:10:03.840 0.24 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:04.939 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:04.960 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:04.989 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:06.079 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:06.090 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:06.090 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:08.850 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:08.899 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:08.920 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:10:10.090 0.25 0.24 10:10 10:15
BAG201310900.U 20130715 10:46:08.210 0.24 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:22.842 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 10:46:25.331 0.23 0.22 10:46 10:51
BAG201310900.U 20130715 11:14:40.903 0.22 0.22 11:14 11:19
BAG201310900.U 20130715 11:26:52.196 0.22 0.25 11:26 11:31
BAG201310900.U 20130715 11:44:43.190 0.25 0.27 11:44 11:49
BAG201310900.U 20130715 11:44:43.211 0.25 0.27 11:44 11:49
BAG201310900.U 20130715 11:44:43.211 0.25 0.27 11:44 11:49
BAG201310900.U 20130715 11:44:43.211 0.25 0.27 11:44 11:49
BAG201310900.U 20130715 11:49:14.152 0.27 0.31 11:49 11:54
BAG201310900.U 20130715 12:09:12.418 0.31 0.3 12:09 12:14
BAG201310900.U 20130715 12:09:12.418 0.31 0.3 12:09 12:14
BAG201310900.U 20130715 12:09:12.418 0.31 0.3 12:09 12:14
BAG201310900.U 20130715 12:13:27.376 0.3 0.3 12:13 12:18
BAG201310900.U 20130715 12:14:48.365 0.3 0.3 12:14 12:19
BAG201310900.U 20130715 12:17:28.263 0.3 0.29 12:17 12:22
BAG201310900.U 20130715 12:17:43.893 0.3 0.29 12:17 12:22
BAG201310900.U 20130715 12:48:50.960 0.29 0.29 12:48 12:53
BAG201310900.U 20130715 12:49:59.878 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.878 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.898 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.898 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.898 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.898 0.29 0.29 12:49 12:54
BAG201310900.U 20130715 12:49:59.898 0.29 0.29 12:49 12:54
I don't think using random access is going to be a good solution here, especially not repeated random access. A better solution is probably to load a hash table with your data for each day (as it looks like you have many rows for each day), then use a hash iterator to find the t+300 row. You don't provide sample data, so I can't really give you full code, but pseudocode is something like:
data want;
set have;
by _ric date_l_;
if _n_=1 then do; *declare hash table that's empty but has the structure of your have dataset; *declare a hash iterator for that table; end;
if first.date_l_ then do; *load the hash table with that date's rows; end;
*find the current row in the hash table;
*now iterate over the hash table from that row until you get to the end or you get a t+300 row;
*if you got t+300 row, then you have what you want, otherwise you're too far in the day and can stop looking - and probably should tell the data step to just skip all of the rest of the records for that day;
if last.date_l_ then do; *empty/delete the hash table; end;
run;
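The lookup described in the pseudocode (position at the current row, then take the first row at least 300 seconds later) can also be illustrated outside SAS. A minimal Python sketch, assuming each day's times are already in seconds and sorted ascending; the function name is hypothetical:

```python
# Sketch of the iterator lookup: for each observation time, find the price at
# the first observation at least `window` seconds later on the same day.
from bisect import bisect_left

def price_after(times, prices, window=300):
    """times must be sorted ascending; returns one looked-up price per row."""
    out = []
    for t in times:
        j = bisect_left(times, t + window)  # first index with times[j] >= t + window
        out.append(prices[j] if j < len(times) else None)
    return out
```

Here bisect_left plays the role of the jump to the threshold row; a SAS hash iterator would advance row by row instead, but the result is the same.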
"More specifically, P(t+5) denotes the first price observed at least 5 minutes after the price which is measured."
This example shows how a reflexive SQL join can acquire and use the row at the earliest future timemark. The approach requires a price stream with distinct timemarks, which the sample data does not have, so the example dedupes it for demonstration purposes.
data have;
attrib
_RIC length=$20
Date_L_ informat=yymmdd10. format=yymmdd10.
Time_L_ informat=time15.3 format=time15.3
price length=8
;
infile datalines missover;
input _RIC Date_L_ Time_L_ Price;
timemark = dhms(date_l_, 0,0,0) + time_l_;
format timemark datetime21.3;
datalines;
BAG201310900.U 20130715 9:36:19.721 0.27
BAG201310900.U 20130715 9:36:19.721 0.27
BAG201310900.U 20130715 9:36:22.751 0.27
BAG201310900.U 20130715 9:36:22.751 0.27
BAG201310900.U 20130715 9:36:24.400 0.27
BAG201310900.U 20130715 9:36:24.400 0.27
BAG201310900.U 20130715 9:36:28.150 0.27
BAG201310900.U 20130715 9:36:28.150 0.27
BAG201310900.U 20130715 9:36:45.099 0.27
BAG201310900.U 20130715 9:36:45.099 0.27
BAG201310900.U 20130715 9:36:48.929 0.28
BAG201310900.U 20130715 9:36:48.929 0.28
BAG201310900.U 20130715 9:36:49.929 0.28
BAG201310900.U 20130715 9:36:50.899 0.28
BAG201310900.U 20130715 9:37:04.839 0.27
BAG201310900.U 20130715 9:37:04.839 0.27
BAG201310900.U 20130715 9:37:04.848 0.27
BAG201310900.U 20130715 9:37:07.619 0.28
BAG201310900.U 20130715 9:37:11.619 0.28
BAG201310900.U 20130715 9:37:11.619 0.28
BAG201310900.U 20130715 9:37:11.619 0.28
BAG201310900.U 20130715 9:37:12.738 0.28
BAG201310900.U 20130715 9:37:15.528 0.28
BAG201310900.U 20130715 9:37:30.337 0.28
BAG201310900.U 20130715 9:37:32.717 0.28
BAG201310900.U 20130715 9:37:58.636 0.29
BAG201310900.U 20130715 9:38:04.016 0.28
BAG201310900.U 20130715 9:38:07.326 0.28
BAG201310900.U 20130715 9:38:07.849 0.28
BAG201310900.U 20130715 9:38:16.005 0.3
BAG201310900.U 20130715 9:38:18.055 0.3
BAG201310900.U 20130715 9:38:18.055 0.3
BAG201310900.U 20130715 9:38:18.055 0.3
BAG201310900.U 20130715 9:38:20.025 0.3
;
run;
Dedupe
proc sort data=have nodupkey;
by _all_;
run;
Reflexive join (aka self-join)
proc sql;
create table want as
select
have._RIC
, have.timemark
, have.price
, future.timemark as timemark_at_5m_threshold
, future.price as price_at_5m_threshold
, future.timemark - have.timemark as interval_at_5m_threshold
from
have
left join
have as future
on
have._RIC = future._RIC
and future.timemark > have.timemark + 50 /* 50 seconds because sample data only covers 2 minutes */
group by
have._RIC, have.timemark
having
/* first of all future matches
* - this is why you want discrete timemarks
* when timemark has dups you would have multiple rows with same min
* and replication in result set
*/
future.timemark = min(future.timemark)
/* NOTE: an expression with a non-aggregate reference and an
* aggregate reference causes Proc SQL to automatically remerge.
* That is a good thing. Log will show
* NOTE: The query requires remerging summary statistics back with the original data.
*/
;
quit;
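For comparison only, the same first-future-row lookup can be sketched in pandas. This is an illustration under the same distinct-timemark assumption, not part of the SQL answer, and it matches on >= rather than the strict > above; merge_asof with direction='forward' picks the first right-hand row whose key is at or past the shifted threshold.

```python
import pandas as pd

# A small hypothetical subset of the deduped price stream.
df = pd.DataFrame({
    "timemark": pd.to_datetime([
        "2013-07-15 09:36:19", "2013-07-15 09:40:01",
        "2013-07-15 09:45:04", "2013-07-15 09:51:13"]),
    "price": [0.27, 0.29, 0.22, 0.27],
})

# Shift each row's timemark by the 5-minute threshold, then find the first
# future row at or after that threshold.
left = df.assign(threshold=df["timemark"] + pd.Timedelta(seconds=300))
want = pd.merge_asof(
    left.sort_values("threshold"),
    df.rename(columns={"timemark": "timemark_at_5m_threshold",
                       "price": "price_at_5m_threshold"})
      .sort_values("timemark_at_5m_threshold"),
    left_on="threshold", right_on="timemark_at_5m_threshold",
    direction="forward",  # first row with key >= threshold
)
```

Rows with no future observation within the day simply get NaN, mirroring the left join.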
I have a text file with a listing as shown below. I want to fill in the missing numbers in the first columns, as shown.
Typical original text:
5 401 6 5.80 0.15 -3.56 0.61 -0.02 0.96
8 -6.11 -0.64 4.07 0.24 0.20 0.38
402 6 -0.33 1.07 0.30 1.29 -0.00 2.04
8 0.02 -0.59 0.21 0.50 0.22 0.79
403 6 3.77 -0.70 -2.74 -0.94 0.20 -1.48
8 -4.08 0.22 2.23 -0.06 -0.19 -0.09
404 6 -2.36 0.22 1.12 -0.26 0.21 -0.41
8 2.05 0.27 -1.63 0.20 -0.16 0.32
16 401 16 -6.30 -0.76 -3.61 0.64 -0.22 -1.01
227 5.99 0.27 4.12 0.47 0.15 -0.74
402 16 -12.50 0.14 -7.52 -0.01 -0.24 0.02
227 12.19 0.35 8.03 0.24 0.13 -0.38
403 16 20.48 0.19 12.84 -0.29 0.03 0.46
227 -20.79 -0.68 -13.35 -0.64 -0.18 1.02
404 16 14.28 1.09 8.93 -0.94 0.01 1.48
227 -14.59 -0.60 -9.44 -0.87 -0.21 1.38
709 401 374 -1.17 -0.99 25.11 0.63 -1.12 -0.11
204 1.05 0.79 -24.91 -0.19 -0.62 0.06
402 374 -1.55 1.09 30.49 -0.90 -1.40 0.14
204 1.43 -0.90 -30.28 0.41 -0.79 -0.09
403 374 1.90 -1.58 0.79 1.65 0.50 -0.21
204 -2.02 1.38 -0.99 -0.93 0.41 0.14
404 374 1.51 0.50 6.16 0.12 0.22 0.04
204 -1.64 -0.31 -6.37 -0.32 0.24 -0.02
How I want it to be:
5 401 6 5.80 0.15 -3.56 0.61 -0.02 0.96
5 401 8 -6.11 -0.64 4.07 0.24 0.20 0.38
5 402 6 -0.33 1.07 0.30 1.29 -0.00 2.04
5 402 8 0.02 -0.59 0.21 0.50 0.22 0.79
5 403 6 3.77 -0.70 -2.74 -0.94 0.20 -1.48
5 403 8 -4.08 0.22 2.23 -0.06 -0.19 -0.09
5 404 6 -2.36 0.22 1.12 -0.26 0.21 -0.41
5 404 8 2.05 0.27 -1.63 0.20 -0.16 0.32
16 401 16 -6.30 -0.76 -3.61 0.64 -0.22 -1.01
16 401 227 5.99 0.27 4.12 0.47 0.15 -0.74
16 402 16 -12.50 0.14 -7.52 -0.01 -0.24 0.02
16 402 227 12.19 0.35 8.03 0.24 0.13 -0.38
16 403 16 20.48 0.19 12.84 -0.29 0.03 0.46
16 403 227 -20.79 -0.68 -13.35 -0.64 -0.18 1.02
16 404 16 14.28 1.09 8.93 -0.94 0.01 1.48
16 404 227 -14.59 -0.60 -9.44 -0.87 -0.21 1.38
709 401 374 -1.17 -0.99 25.11 0.63 -1.12 -0.11
709 401 204 1.05 0.79 -24.91 -0.19 -0.62 0.06
709 402 374 -1.55 1.09 30.49 -0.90 -1.40 0.14
709 402 204 1.43 -0.90 -30.28 0.41 -0.79 -0.09
709 403 374 1.90 -1.58 0.79 1.65 0.50 -0.21
709 403 204 -2.02 1.38 -0.99 -0.93 0.41 0.14
709 404 374 1.51 0.50 6.16 0.12 0.22 0.04
709 404 204 -1.64 -0.31 -6.37 -0.32 0.24 -0.02
I had a similar problem before, where two "cells" were regularly missing (e.g. the 402 to 404 numbers above were also missing). Back then I managed to use this script:
for /F "delims=" %%i in ('type "tmp1.txt"') do (
set row=%%i
set cnt=0
for %%l in (%%i) do set /A cnt+=1
if !cnt! equ 7 (
set row=!header! !row!
) else (
for /F "tokens=1,2" %%j in ("%%i") do set header=%%j %%k
)
echo.!row!
) >> "tmp2.txt"
Any ideas?
Assuming the file is formatted with spaces (no TABs):
@echo off
setlocal enabledelayedexpansion
(for /f "delims=" %%a in (tmp1.txt) do (
set "line=%%a"
set "col1=!line:~0,3!"
set "col2=!line:~3,5!"
set "rest=!line:~8!"
if "!col1!" == " " (
set "col1=!old1!"
) else (
set "old1=!col1!"
)
if "!col2!" == " " (
set "col2=!old2!"
) else (
set "old2=!col2!"
)
echo !col1!!col2!!rest!
))>tmp2.txt
You will notice that I don't split the lines into tokens with for /f; instead I take each line as a whole and "split" it manually to preserve the format (the length of each substring). Then I simply replace empty values with the saved value from the preceding line.
Edit in response to "I have made a mistake when pasting the original text. There are 4 (empty) spaces before all lines.":
Adapt the counting as follows (for the first token, increase the length by 4; for the rest, add 4 to the start position and keep the lengths unchanged):
set "col1=!line:~0,7!"
set "col2=!line:~7,5!"
set "rest=!line:~12!"
and adapt if "!col1!" == "   " ( to if "!col1!" == "       " ( (from three to seven spaces)
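If Python is an option, the same fixed-width fill-down can be sketched as follows; the widths 3 and 5 mirror the batch answer's assumed layout and would need adjusting to the real file:

```python
def fill_down(lines, w1=3, w2=5):
    """Replace blank first/second fixed-width columns with the last seen values."""
    old1 = old2 = ""
    out = []
    for line in lines:
        c1, c2, rest = line[:w1], line[w1:w1 + w2], line[w1 + w2:]
        if c1.strip():
            old1 = c1        # remember a non-blank first column
        else:
            c1 = old1        # fill a blank one from the line before
        if c2.strip():
            old2 = c2
        else:
            c2 = old2
        out.append(c1 + c2 + rest)
    return out
```

Like the batch version, it slices by position rather than tokenizing, so the column alignment of the original file is preserved.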
I have a data frame which houses data for a few individuals in my study. These individuals belong to one of four groups. I would like to plot each individual's curve and compare them to others in that group.
I was hoping to facet by group and then use the units argument to draw lines for each individual in a lineplot.
Here is what I have so far:
g = sns.FacetGrid(data = m, col='Sex', row = 'Group')
g.map(sns.lineplot, 'Time','residual')
The docs say that g.map accepts arguments in the order they appear in lineplot, and units is at the end of a very long parameter list.
How can I facet a line plot and use the units argument?
Here is my data:
Subject Time predicted Concentration Group Sex residual
1 0.5 0.24 0.01 NAFLD Male -0.23
1 1.0 0.4 0.33 NAFLD Male -0.08
1 2.0 0.58 0.8 NAFLD Male 0.22
1 4.0 0.59 0.59 NAFLD Male -0.0
1 6.0 0.47 0.42 NAFLD Male -0.04
1 8.0 0.33 0.23 NAFLD Male -0.1
1 10.0 0.22 0.16 NAFLD Male -0.06
1 12.0 0.15 0.33 NAFLD Male 0.18
3 0.5 0.26 0.08 NAFLD Female -0.18
3 1.0 0.45 0.45 NAFLD Female 0.01
3 2.0 0.66 0.7 NAFLD Female 0.03
3 4.0 0.74 0.76 NAFLD Female 0.02
3 6.0 0.62 0.7 NAFLD Female 0.08
3 8.0 0.46 0.4 NAFLD Female -0.06
3 10.0 0.32 0.27 NAFLD Female -0.05
3 12.0 0.21 0.21 NAFLD Female -0.0
4 0.5 0.52 0.13 NAFLD Female -0.39
4 1.0 0.91 1.18 NAFLD Female 0.27
4 2.0 1.37 1.03 NAFLD Female -0.34
4 4.0 1.55 2.02 NAFLD Female 0.47
4 6.0 1.32 1.19 NAFLD Female -0.13
4 8.0 1.0 0.89 NAFLD Female -0.1
4 10.0 0.71 0.66 NAFLD Female -0.05
4 12.0 0.48 0.5 NAFLD Female 0.02
5 0.5 0.46 0.16 NAFLD Female -0.3
5 1.0 0.76 0.98 NAFLD Female 0.22
5 2.0 1.05 1.03 NAFLD Female -0.02
5 4.0 1.03 1.06 NAFLD Female 0.03
5 6.0 0.8 0.77 NAFLD Female -0.03
5 8.0 0.57 0.5 NAFLD Female -0.07
5 10.0 0.4 0.42 NAFLD Female 0.02
5 12.0 0.27 0.33 NAFLD Female 0.06
6 0.5 1.08 1.02 NAFLD Female -0.06
6 1.0 1.53 1.66 NAFLD Female 0.13
6 2.0 1.67 1.52 NAFLD Female -0.16
6 4.0 1.3 1.44 NAFLD Female 0.14
6 6.0 0.94 0.94 NAFLD Female -0.0
6 8.0 0.68 0.63 NAFLD Female -0.05
6 10.0 0.49 0.36 NAFLD Female -0.13
6 12.0 0.35 0.48 NAFLD Female 0.13
7 0.5 0.5 0.34 Control Female -0.16
7 1.0 0.81 0.84 Control Female 0.04
7 2.0 1.08 1.17 Control Female 0.1
7 4.0 1.0 0.99 Control Female -0.01
7 6.0 0.73 0.65 Control Female -0.08
7 8.0 0.5 0.49 Control Female -0.01
7 10.0 0.33 0.37 Control Female 0.04
7 12.0 0.22 0.25 Control Female 0.03
8 0.5 0.44 0.37 Control Male -0.06
8 1.0 0.67 0.74 Control Male 0.07
8 2.0 0.82 0.8 Control Male -0.03
8 4.0 0.72 0.72 Control Male 0.01
8 6.0 0.54 0.54 Control Male -0.0
8 8.0 0.4 0.38 Control Male -0.02
8 10.0 0.29 0.31 Control Male 0.02
8 12.0 0.21 0.21 Control Male 0.0
9 0.5 0.51 0.26 Control Female -0.25
9 1.0 0.86 0.66 Control Female -0.21
9 2.0 1.23 1.62 Control Female 0.39
9 4.0 1.3 1.26 Control Female -0.03
9 6.0 1.07 0.94 Control Female -0.13
9 8.0 0.81 0.74 Control Female -0.07
9 10.0 0.59 0.62 Control Female 0.03
9 12.0 0.43 0.54 Control Female 0.11
10 0.5 0.81 0.82 Control Female 0.01
10 1.0 1.05 1.03 Control Female -0.02
10 2.0 1.04 1.04 Control Female -0.0
10 4.0 0.77 0.81 Control Female 0.04
10 6.0 0.55 0.52 Control Female -0.03
10 8.0 0.39 0.35 Control Female -0.04
10 10.0 0.28 0.31 Control Female 0.03
10 12.0 0.2 0.21 Control Female 0.01
11 0.5 0.08 0.07 NAFLD Male -0.01
11 1.0 0.15 0.08 NAFLD Male -0.07
11 2.0 0.24 0.13 NAFLD Male -0.11
11 4.0 0.32 0.45 NAFLD Male 0.12
11 6.0 0.33 0.38 NAFLD Male 0.05
11 8.0 0.3 0.28 NAFLD Male -0.02
11 10.0 0.25 0.23 NAFLD Male -0.02
11 12.0 0.2 0.16 NAFLD Male -0.04
12 0.5 0.72 0.75 NAFLD Female 0.03
12 1.0 0.84 0.76 NAFLD Female -0.08
12 2.0 0.8 0.77 NAFLD Female -0.03
12 4.0 0.67 0.74 NAFLD Female 0.07
12 6.0 0.56 0.65 NAFLD Female 0.09
12 8.0 0.46 0.48 NAFLD Female 0.02
12 10.0 0.38 0.34 NAFLD Female -0.05
12 12.0 0.32 0.25 NAFLD Female -0.07
13 0.5 0.28 0.07 Control Female -0.21
13 1.0 0.49 0.38 Control Female -0.1
13 2.0 0.74 0.94 Control Female 0.2
13 4.0 0.88 0.84 Control Female -0.04
13 6.0 0.77 0.79 Control Female 0.02
13 8.0 0.61 0.57 Control Female -0.03
13 10.0 0.45 0.44 Control Female -0.01
13 12.0 0.32 0.32 Control Female 0.01
14 0.5 0.26 0.04 NAFLD Female -0.22
14 1.0 0.44 0.35 NAFLD Female -0.1
14 2.0 0.64 0.84 NAFLD Female 0.19
14 4.0 0.68 0.73 NAFLD Female 0.04
14 6.0 0.54 0.45 NAFLD Female -0.1
14 8.0 0.39 0.34 NAFLD Female -0.05
14 10.0 0.26 0.26 NAFLD Female 0.01
14 12.0 0.16 0.24 NAFLD Female 0.07
15 0.5 0.3 0.11 NAFLD Male -0.19
15 1.0 0.49 0.61 NAFLD Male 0.12
15 2.0 0.67 0.68 NAFLD Male 0.01
15 4.0 0.64 0.67 NAFLD Male 0.03
15 6.0 0.48 0.42 NAFLD Male -0.06
15 8.0 0.33 0.31 NAFLD Male -0.02
15 10.0 0.22 0.26 NAFLD Male 0.04
15 12.0 0.15 0.17 NAFLD Male 0.02
16 0.5 0.16 0.05 NAFLD Male -0.12
16 1.0 0.26 0.35 NAFLD Male 0.1
16 2.0 0.33 0.32 NAFLD Male -0.01
16 4.0 0.28 0.27 NAFLD Male -0.01
16 6.0 0.19 0.17 NAFLD Male -0.02
16 8.0 0.12 0.13 NAFLD Male 0.01
16 10.0 0.07 0.09 NAFLD Male 0.02
16 12.0 0.05 0.05 NAFLD Male 0.0
17 0.5 0.32 0.16 NAFLD Female -0.16
17 1.0 0.54 0.59 NAFLD Female 0.06
17 2.0 0.74 0.78 NAFLD Female 0.04
17 4.0 0.71 0.76 NAFLD Female 0.05
17 6.0 0.53 0.43 NAFLD Female -0.1
17 8.0 0.36 0.35 NAFLD Female -0.01
17 10.0 0.23 0.25 NAFLD Female 0.02
17 12.0 0.15 0.2 NAFLD Female 0.05
18 0.5 0.49 0.18 Control Female -0.31
18 1.0 0.81 0.82 Control Female 0.01
18 2.0 1.1 1.27 Control Female 0.16
18 4.0 1.03 1.06 Control Female 0.03
18 6.0 0.72 0.65 Control Female -0.07
18 8.0 0.45 0.38 Control Female -0.07
18 10.0 0.26 0.28 Control Female 0.02
18 12.0 0.14 0.19 Control Female 0.04
19 0.5 0.15 0.04 NAFLD Female -0.11
19 1.0 0.27 0.21 NAFLD Female -0.06
19 2.0 0.43 0.43 NAFLD Female -0.01
19 4.0 0.56 0.66 NAFLD Female 0.1
19 6.0 0.54 0.52 NAFLD Female -0.02
19 8.0 0.47 0.48 NAFLD Female 0.01
19 10.0 0.38 0.38 NAFLD Female 0.0
19 12.0 0.29 0.24 NAFLD Female -0.05
20 0.5 0.38 0.07 NAFLD Female -0.31
20 1.0 0.6 0.82 NAFLD Female 0.22
20 2.0 0.75 0.79 NAFLD Female 0.04
20 4.0 0.63 0.58 NAFLD Female -0.05
20 6.0 0.44 0.39 NAFLD Female -0.05
20 8.0 0.29 0.27 NAFLD Female -0.02
20 10.0 0.19 0.23 NAFLD Female 0.04
20 12.0 0.13 0.19 NAFLD Female 0.07
21 0.5 0.37 0.28 NAFLD Male -0.09
21 1.0 0.56 0.66 NAFLD Male 0.1
21 2.0 0.68 0.64 NAFLD Male -0.04
21 4.0 0.59 0.62 NAFLD Male 0.02
21 6.0 0.45 0.43 NAFLD Male -0.02
21 8.0 0.34 0.31 NAFLD Male -0.03
21 10.0 0.26 0.29 NAFLD Male 0.03
21 12.0 0.19 0.2 NAFLD Male 0.0
22 0.5 0.28 0.21 Control Male -0.07
22 1.0 0.42 0.5 Control Male 0.08
22 2.0 0.5 0.47 Control Male -0.03
22 4.0 0.42 0.42 Control Male 0.0
22 6.0 0.31 0.32 Control Male 0.01
22 8.0 0.23 0.22 Control Male -0.01
22 10.0 0.16 0.17 Control Male 0.01
22 12.0 0.12 0.11 Control Male -0.01
23 0.5 0.46 0.18 Control Female -0.28
23 1.0 0.75 0.65 Control Female -0.1
23 2.0 1.03 1.23 Control Female 0.2
23 4.0 0.96 1.05 Control Female 0.09
23 6.0 0.67 0.58 Control Female -0.1
23 8.0 0.42 0.36 Control Female -0.06
23 10.0 0.24 0.22 Control Female -0.02
23 12.0 0.14 0.14 Control Female 0.0
24 0.5 0.2 0.14 NAFLD Male -0.06
24 1.0 0.33 0.41 NAFLD Male 0.08
24 2.0 0.44 0.4 NAFLD Male -0.04
24 4.0 0.41 0.42 NAFLD Male 0.01
24 6.0 0.31 0.31 NAFLD Male 0.0
24 8.0 0.22 0.21 NAFLD Male -0.01
24 10.0 0.15 0.17 NAFLD Male 0.02
24 12.0 0.1 0.09 NAFLD Male -0.02
25 0.5 0.28 0.05 NAFLD Female -0.23
25 1.0 0.48 0.43 NAFLD Female -0.05
25 2.0 0.7 0.82 NAFLD Female 0.12
25 4.0 0.75 0.8 NAFLD Female 0.06
25 6.0 0.6 0.56 NAFLD Female -0.03
25 8.0 0.42 0.38 NAFLD Female -0.04
25 10.0 0.28 0.28 NAFLD Female -0.0
25 12.0 0.18 0.18 NAFLD Female -0.0
26 0.5 0.65 0.38 NAFLD Female -0.27
26 1.0 1.0 1.2 NAFLD Female 0.2
26 2.0 1.23 1.26 NAFLD Female 0.03
26 4.0 1.0 0.98 NAFLD Female -0.02
26 6.0 0.67 0.59 NAFLD Female -0.08
26 8.0 0.43 0.42 NAFLD Female -0.01
26 10.0 0.27 0.33 NAFLD Female 0.06
26 12.0 0.17 0.22 NAFLD Female 0.05
27 0.5 0.1 0.07 NAFLD Male -0.02
27 1.0 0.17 0.18 NAFLD Male 0.02
27 2.0 0.24 0.23 NAFLD Male -0.01
27 4.0 0.27 0.3 NAFLD Male 0.02
27 6.0 0.24 0.22 NAFLD Male -0.01
27 8.0 0.19 0.17 NAFLD Male -0.01
27 10.0 0.14 0.16 NAFLD Male 0.01
27 12.0 0.11 0.11 NAFLD Male 0.0
28 0.5 0.23 0.16 Control Female -0.08
28 1.0 0.4 0.39 Control Female -0.01
28 2.0 0.58 0.57 Control Female -0.01
28 4.0 0.62 0.69 Control Female 0.07
28 6.0 0.49 0.46 Control Female -0.04
28 8.0 0.35 0.39 Control Female 0.04
28 10.0 0.23 0.18 Control Female -0.05
28 12.0 0.15 0.12 Control Female -0.03
29 0.5 0.33 0.24 Control Female -0.09
29 1.0 0.55 0.5 Control Female -0.05
29 2.0 0.8 0.86 Control Female 0.06
29 4.0 0.84 0.91 Control Female 0.07
29 6.0 0.66 0.58 Control Female -0.08
29 8.0 0.46 0.43 Control Female -0.03
29 10.0 0.3 0.33 Control Female 0.03
29 12.0 0.19 0.2 Control Female 0.01
30 0.5 0.23 0.19 Control Female -0.04
30 1.0 0.4 0.41 Control Female 0.01
30 2.0 0.6 0.6 Control Female -0.0
30 4.0 0.68 0.71 Control Female 0.03
30 6.0 0.58 0.56 Control Female -0.03
30 8.0 0.45 0.43 Control Female -0.02
30 10.0 0.33 0.36 Control Female 0.02
30 12.0 0.24 0.24 Control Female 0.0
31 0.5 0.36 0.31 Control Female -0.05
31 1.0 0.61 0.66 Control Female 0.05
31 2.0 0.85 0.82 Control Female -0.03
31 4.0 0.86 0.9 Control Female 0.05
31 6.0 0.65 0.62 Control Female -0.03
31 8.0 0.45 0.43 Control Female -0.02
31 10.0 0.3 0.31 Control Female 0.01
31 12.0 0.19 0.21 Control Female 0.02
32 0.5 0.24 0.14 NAFLD Male -0.09
32 1.0 0.4 0.41 NAFLD Male 0.01
32 2.0 0.56 0.61 NAFLD Male 0.04
32 4.0 0.57 0.58 NAFLD Male 0.02
32 6.0 0.43 0.39 NAFLD Male -0.04
32 8.0 0.29 0.28 NAFLD Male -0.01
32 10.0 0.19 0.2 NAFLD Male 0.01
32 12.0 0.12 0.14 NAFLD Male 0.03
33 0.5 0.17 0.05 NAFLD Male -0.12
33 1.0 0.28 0.23 NAFLD Male -0.06
33 2.0 0.42 0.56 NAFLD Male 0.14
33 4.0 0.45 0.42 NAFLD Male -0.03
33 6.0 0.36 0.33 NAFLD Male -0.03
33 8.0 0.26 0.24 NAFLD Male -0.02
33 10.0 0.18 0.21 NAFLD Male 0.03
33 12.0 0.12 0.14 NAFLD Male 0.02
34 0.5 0.09 0.1 NAFLD Male 0.01
34 1.0 0.16 0.19 NAFLD Male 0.03
34 2.0 0.25 0.23 NAFLD Male -0.03
34 4.0 0.32 0.32 NAFLD Male -0.0
34 6.0 0.32 0.3 NAFLD Male -0.02
34 8.0 0.28 0.3 NAFLD Male 0.02
34 10.0 0.24 0.25 NAFLD Male 0.02
34 12.0 0.2 0.18 NAFLD Male -0.02
35 0.5 0.15 0.02 NAFLD Female -0.13
35 1.0 0.27 0.14 NAFLD Female -0.14
35 2.0 0.46 0.38 NAFLD Female -0.08
35 4.0 0.64 0.8 NAFLD Female 0.16
35 6.0 0.67 0.74 NAFLD Female 0.07
35 8.0 0.63 0.61 NAFLD Female -0.02
35 10.0 0.55 0.51 NAFLD Female -0.04
35 12.0 0.46 0.42 NAFLD Female -0.04
36 0.5 0.19 0.12 NAFLD Female -0.07
36 1.0 0.32 0.36 NAFLD Female 0.04
36 2.0 0.47 0.46 NAFLD Female -0.01
36 4.0 0.53 0.57 NAFLD Female 0.04
36 6.0 0.48 0.43 NAFLD Female -0.05
36 8.0 0.41 0.39 NAFLD Female -0.01
36 10.0 0.34 0.38 NAFLD Female 0.04
36 12.0 0.28 0.27 NAFLD Female -0.01
37 0.5 0.1 0.02 NAFLD Male -0.08
37 1.0 0.17 0.1 NAFLD Male -0.08
37 2.0 0.28 0.27 NAFLD Male -0.01
37 4.0 0.36 0.44 NAFLD Male 0.08
37 6.0 0.34 0.37 NAFLD Male 0.03
37 8.0 0.29 0.28 NAFLD Male -0.02
37 10.0 0.23 0.22 NAFLD Male -0.02
37 12.0 0.18 0.15 NAFLD Male -0.03
If you use FacetGrid.map_dataframe, you can pass the arguments almost as if you were calling lineplot directly:
g = sns.FacetGrid(data = m, col='Sex', row='Group')
g.map_dataframe(sns.lineplot, x='Time', y='residual', units='Subject', estimator=None)
A potential workaround is to define a wrapper function:
g = sns.FacetGrid(data=m, col='Sex', row='Group')
def f(x, y, z, **kwargs):
    return sns.lineplot(x=x, y=y, units=z, estimator=None, **kwargs)
g.map(f, 'Time', 'residual', 'Subject')
I want to sort a file based on the values in columns 2-8.
Essentially I want ascending order based on the highest value that appears in any of those fields on each line, ignoring columns 1, 9 and 10. That is, the line containing the highest value should be the last line of the file, the second largest second to last, and so on. If the same next value in ascending order appears on multiple lines (like A/B), I don't care about the order in which they are printed.
I've looked at using sort but can't figure out an easy way to do what I want...
I'm a bit stumped, any ideas?
Input:
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 0.26 0.52 0.78
B 0.00 0.00 0.02 0.26 0.19 0.09 0.20 0.56 0.76
C 0.00 0.00 0.02 0.16 0.20 0.22 2.84 0.60 3.44
D 0.00 0.00 0.02 0.29 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 0.90 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 1.06 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 1.11 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 1.39 0.03 0.04 0.01 0.01 1.47 1.48
I 0.00 0.00 1.68 0.16 0.55 0.24 5.00 2.63 7.63
J 0.00 0.00 6.86 0.52 1.87 0.59 12.79 9.83 22.62
K 0.00 0.00 7.26 0.57 2.00 0.64 11.12 10.47 21.59
Expected output:
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 (0.26) 0.52 0.78
B 0.00 0.00 0.02 (0.26) 0.19 0.09 0.20 0.56 0.76
D 0.00 0.00 0.02 (0.29) 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 (0.90) 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 (1.06) 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 (1.11) 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 (1.39) 0.03 0.04 0.01 0.01 1.47 1.48
C 0.00 0.00 0.02 0.16 0.20 0.22 (2.84) 0.60 3.44
I 0.00 0.00 1.68 0.16 0.55 0.24 (5.00) 2.63 7.63
K 0.00 0.00 7.26 0.57 2.00 0.64 (11.12) 10.47 21.59
J 0.00 0.00 6.86 0.52 1.87 0.59 (12.79) 9.83 22.62
Preprocess the data: print the max of columns 2 through 8 at the start of each line, then sort, then remove the added column:
awk '
NR==1{print "x ", $0}
NR>1{
max = $2;
for( i = 3; i <= 8; i++ )
if( $i > max )
max = $i;
print max, $0
}' OFS=\\t input-file | sort -n | cut -f 2-
Another variant in pure awk (GNU awk, since it relies on PROCINFO["sorted_in"]):
$ awk 'NR==1; # print header
NR>1{ #For other lines,
a=$2;
ai=2;
for(i=3;i<=8;i++){
if($i>a){
a=$i;
ai=i;
}
} # Find the max number in the line
$ai= "(" $ai ")"; # decoration - mark highest with ()
g[$0]=a;
}
function cmp_num_val(i1, v1, i2, v2) {return (v1 - v2);} # sorting function
END{
PROCINFO["sorted_in"]="cmp_num_val"; # assign sorting function
for (a in g) print a; # print
}' sortme.txt | column -t # column -t for formatting.
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 (0.26) 0.52 0.78
B 0.00 0.00 0.02 (0.26) 0.19 0.09 0.20 0.56 0.76
D 0.00 0.00 0.02 (0.29) 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 (0.90) 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 (1.06) 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 (1.11) 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 (1.39) 0.03 0.04 0.01 0.01 1.47 1.48
C 0.00 0.00 0.02 0.16 0.20 0.22 (2.84) 0.60 3.44
I 0.00 0.00 1.68 0.16 0.55 0.24 (5.00) 2.63 7.63
K 0.00 0.00 7.26 0.57 2.00 0.64 (11.12) 10.47 21.59
J 0.00 0.00 6.86 0.52 1.87 0.59 (12.79) 9.83 22.62
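The same decorate-sort-undecorate idea carries over to Python; a short sketch, assuming whitespace-separated fields with the sort keys in fields 2 through 8 (the function name is hypothetical):

```python
def sort_by_row_max(lines):
    """Keep the header first; order the rest by the max of fields 2-8, ascending."""
    header, *rows = lines
    # Decorate each row with its max value, sort on that key, then undecorate.
    keyed = [(max(float(f) for f in row.split()[1:8]), row) for row in rows]
    return [header] + [row for _, row in sorted(keyed, key=lambda kv: kv[0])]
```

Ties keep their input order because Python's sort is stable, matching the "I don't care" requirement for lines like A/B.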
When I require open-uri and either active_support/core_ext/numeric/conversions.rb or active_support/core_ext/big_decimal/conversions.rb, open "http://some.website.com" becomes extremely slow.
How can I avoid this?
Ruby 2.0.0, active_support 4.0.0
EDIT
Here are the profiling results. There are a huge number of calls to Gem::Dependency#matching_specs (among others).
source (with conversions)
require 'open-uri'
require 'active_support/core_ext/numeric/conversions'
open 'http://stackoverflow.com'
result
% cumulative self self total
time seconds seconds calls ms/call ms/call name
21.46 0.56 0.56 22620 0.02 0.11 Gem::Dependency#matching_specs
13.41 0.91 0.35 4567 0.08 0.76 Array#each
5.36 1.05 0.14 1500 0.09 0.15 Gem::Version#<=>
4.98 1.18 0.13 3810 0.03 0.11 Gem::BasicSpecification#contains_requirable_file?
3.83 1.28 0.10 5353 0.02 0.03 Gem::StubSpecification#activated?
3.45 1.37 0.09 27604 0.00 0.00 Gem::StubSpecification#name
3.07 1.45 0.08 1382 0.06 0.33 nil#
3.07 1.53 0.08 2139 0.04 0.25 Gem::Specification#initialize
2.68 1.60 0.07 106 0.66 5.85 Kernel#gem_original_require
2.68 1.67 0.07 21258 0.00 0.00 String#===
...
source (without conversions)
require 'open-uri'
open 'http://stackoverflow.com'
result
% cumulative self self total
time seconds seconds calls ms/call ms/call name
36.36 0.08 0.08 46 1.74 10.65 Kernel#gem_original_require
22.73 0.13 0.05 816 0.06 0.09 nil#
4.55 0.14 0.01 46 0.22 11.09 Kernel#require
4.55 0.15 0.01 22 0.45 22.27 Net::BufferedIO#rbuf_fill
4.55 0.16 0.01 3 3.33 3.33 URI::Parser#split
4.55 0.17 0.01 88 0.11 0.34 Module#module_eval
4.55 0.18 0.01 133 0.08 0.45 Object#DelegateClass
4.55 0.19 0.01 184 0.05 0.11 Gem.find_unresolved_default_spec
4.55 0.20 0.01 1280 0.01 0.01 Integer#chr
4.55 0.21 0.01 1280 0.01 0.01 String#%
4.55 0.22 0.01 1381 0.01 0.01 Module#method_added
...
I have a system with uneven CPU load in an odd pattern. It's serving Apache, Elasticsearch, Redis, and email.
Here's the mpstat output. Notice how %usr for the last 12 cores is well below that of the first 12.
# mpstat -P ALL
Linux 3.5.0-17-generic (<server1>) 02/16/2013 _x86_64_ (24 CPU)
10:21:46 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10:21:46 PM all 17.15 0.00 2.20 0.33 0.00 0.09 0.00 0.00 80.23
10:21:46 PM 0 27.34 0.00 4.08 0.56 0.00 0.53 0.00 0.00 67.48
10:21:46 PM 1 24.51 0.00 3.25 0.53 0.00 0.34 0.00 0.00 71.38
10:21:46 PM 2 26.69 0.00 4.20 0.50 0.00 0.24 0.00 0.00 68.36
10:21:46 PM 3 24.38 0.00 3.04 0.70 0.00 0.23 0.00 0.00 71.65
10:21:46 PM 4 24.50 0.00 4.04 0.57 0.00 0.15 0.00 0.00 70.74
10:21:46 PM 5 21.75 0.00 2.80 0.74 0.00 0.15 0.00 0.00 74.55
10:21:46 PM 6 28.30 0.00 3.75 0.84 0.00 0.04 0.00 0.00 67.07
10:21:46 PM 7 30.20 0.00 3.94 0.16 0.00 0.03 0.00 0.00 65.67
10:21:46 PM 8 30.55 0.00 4.09 0.12 0.00 0.03 0.00 0.00 65.21
10:21:46 PM 9 32.66 0.00 3.40 0.09 0.00 0.03 0.00 0.00 63.81
10:21:46 PM 10 32.20 0.00 3.57 0.08 0.00 0.03 0.00 0.00 64.12
10:21:46 PM 11 32.08 0.00 3.92 0.08 0.00 0.03 0.00 0.00 63.88
10:21:46 PM 12 4.53 0.00 0.41 0.34 0.00 0.04 0.00 0.00 94.68
10:21:46 PM 13 9.14 0.00 1.42 0.32 0.00 0.04 0.00 0.00 89.08
10:21:46 PM 14 5.92 0.00 0.70 0.35 0.00 0.06 0.00 0.00 92.97
10:21:46 PM 15 6.14 0.00 0.66 0.35 0.00 0.04 0.00 0.00 92.81
10:21:46 PM 16 7.39 0.00 0.65 0.34 0.00 0.04 0.00 0.00 91.57
10:21:46 PM 17 6.60 0.00 0.83 0.39 0.00 0.05 0.00 0.00 92.13
10:21:46 PM 18 5.49 0.00 0.54 0.30 0.00 0.01 0.00 0.00 93.65
10:21:46 PM 19 6.78 0.00 0.88 0.21 0.00 0.01 0.00 0.00 92.12
10:21:46 PM 20 6.17 0.00 0.58 0.11 0.00 0.01 0.00 0.00 93.13
10:21:46 PM 21 5.78 0.00 0.82 0.10 0.00 0.01 0.00 0.00 93.29
10:21:46 PM 22 6.29 0.00 0.60 0.10 0.00 0.01 0.00 0.00 93.00
10:21:46 PM 23 6.18 0.00 0.61 0.10 0.00 0.01 0.00 0.00 93.10
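For reference, the gap can be quantified directly from a snapshot like the one above. This is a minimal awk sketch; it assumes the 11-column layout shown (field 3 is the CPU number, field 4 is %usr), and the two sample rows are copied from the output above. With live data you would pipe `mpstat -P ALL` into the same awk program instead of the here-document.

```shell
# Average %usr for cores 0-11 vs cores 12-23 from `mpstat -P ALL` output.
# $3 is the CPU number (the regex skips the "all" row and the headers),
# $4 is %usr.
awk '$3 ~ /^[0-9]+$/ {
       if ($3 + 0 < 12) { lo += $4; nlo++ } else { hi += $4; nhi++ }
     }
     END { printf "cores 0-11 avg: %.2f, cores 12-23 avg: %.2f\n",
           lo / nlo, hi / nhi }' <<'EOF'
10:21:46 PM    0   27.34  0.00  4.08  0.56  0.00  0.53  0.00  0.00  67.48
10:21:46 PM   12    4.53  0.00  0.41  0.34  0.00  0.04  0.00  0.00  94.68
EOF
```

On the two sample rows this prints `cores 0-11 avg: 27.34, cores 12-23 avg: 4.53`.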
I have another system, a database server running MySQL, which shows an even distribution.
# mpstat -P ALL
Linux 3.5.0-17-generic (<server2>) 02/16/2013 _x86_64_ (32 CPU)
10:27:57 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10:27:57 PM all 0.77 0.00 0.07 0.68 0.00 0.00 0.00 0.00 98.47
10:27:57 PM 0 2.31 0.00 0.19 1.86 0.00 0.01 0.00 0.00 95.63
10:27:57 PM 1 1.73 0.00 0.17 1.87 0.00 0.01 0.00 0.00 96.21
10:27:57 PM 2 2.62 0.00 0.25 2.51 0.00 0.01 0.00 0.00 94.62
10:27:57 PM 3 1.60 0.00 0.17 1.99 0.00 0.01 0.00 0.00 96.23
10:27:57 PM 4 1.86 0.00 0.16 1.84 0.00 0.01 0.00 0.00 96.13
10:27:57 PM 5 2.30 0.00 0.25 2.45 0.00 0.01 0.00 0.00 94.99
10:27:57 PM 6 2.05 0.00 0.20 1.89 0.00 0.01 0.00 0.00 95.86
10:27:57 PM 7 2.13 0.00 0.20 2.31 0.00 0.01 0.00 0.00 95.36
10:27:57 PM 8 0.82 0.00 0.11 4.05 0.00 0.03 0.00 0.00 94.99
10:27:57 PM 9 0.70 0.00 0.18 0.06 0.00 0.00 0.00 0.00 99.06
10:27:57 PM 10 0.18 0.00 0.04 0.01 0.00 0.00 0.00 0.00 99.77
10:27:57 PM 11 0.20 0.00 0.01 0.01 0.00 0.00 0.00 0.00 99.78
10:27:57 PM 12 0.13 0.00 0.01 0.01 0.00 0.00 0.00 0.00 99.86
10:27:57 PM 13 0.04 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.95
10:27:57 PM 14 0.03 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.97
10:27:57 PM 15 0.03 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.97
10:27:57 PM 16 0.05 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.94
10:27:57 PM 17 0.41 0.00 0.10 0.04 0.00 0.00 0.00 0.00 99.45
10:27:57 PM 18 2.78 0.00 0.06 0.14 0.00 0.00 0.00 0.00 97.01
10:27:57 PM 19 1.19 0.00 0.08 0.19 0.00 0.00 0.00 0.00 98.53
10:27:57 PM 20 0.48 0.00 0.04 0.30 0.00 0.00 0.00 0.00 99.17
10:27:57 PM 21 0.70 0.00 0.03 0.16 0.00 0.00 0.00 0.00 99.11
10:27:57 PM 22 0.08 0.00 0.01 0.02 0.00 0.00 0.00 0.00 99.90
10:27:57 PM 23 0.30 0.00 0.02 0.06 0.00 0.00 0.00 0.00 99.62
10:27:57 PM 24 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
10:27:57 PM 25 0.04 0.00 0.03 0.00 0.00 0.00 0.00 0.00 99.94
10:27:57 PM 26 0.06 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.93
10:27:57 PM 27 0.01 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.98
10:27:57 PM 28 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.99
10:27:57 PM 29 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
10:27:57 PM 30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
10:27:57 PM 31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.99
Both are dedicated systems running Ubuntu 12.10 (not virtual).
I've read up on setting nice, using taskset, and tweaking the scheduler, but I don't want to make any rash decisions. Also, this system isn't performing badly per se; I just want to ensure all cores are being utilized properly.
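For what it's worth, taskset can be used in a read-only way first. This sketch (assuming util-linux taskset is installed) only prints the current shell's affinity mask; the pinning form is shown as a comment so nothing is actually changed.

```shell
# Print (read-only) the CPU affinity list of the current shell ($$).
taskset -cp $$
# Pinning a command to cores 0-11 would look like:
#   taskset -c 0-11 some_command
```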
Let me know if I can provide additional information. Any suggestions to even out the CPU load on "server1" are greatly appreciated.
This is not a problem until some cores hit 100% while others don't (i.e. nothing in the statistics you've shown suggests the uneven distribution is hurting performance). In your case, you probably have quite a few processes whose load distributes evenly, producing a base load of 6-10% on each core, plus roughly 12 busier threads that each need 10-20% of a core. A single process/thread cannot be split between cores, so each of those lands on one core while the rest stay mostly idle.
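One quick way to check this hypothesis is to look at per-thread rather than per-process CPU usage; if a handful of hot threads account for most of the load, the skew is expected. A sketch, assuming procps `ps` on Linux:

```shell
# Show the 15 busiest threads system-wide (TID = thread ID).
# A dozen threads each eating 10-20% of a core, rather than many small ones,
# would explain why only some cores show high %usr.
ps -eLo pid,tid,pcpu,comm --sort=-pcpu | head -n 15
```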