Change affinity of process with windows script

Change affinity of process with windows script - windows

In Windows, with
START /node 1 /affinity ff cmd /C "app.exe"
I can set the affinity of app.exe (number of cores used by app.exe).
With a windows script, How I can change the affinity of a running process ?

PowerShell can do this task for you
Get Affinity:
PowerShell "Get-Process app | Select-Object ProcessorAffinity"
Set Affinity:
PowerShell "$Process = Get-Process app; $Process.ProcessorAffinity=255"
Example: (8 Core Processor)
Core # = Value = BitMask
Core 1 = 1 = 00000001
Core 2 = 2 = 00000010
Core 3 = 4 = 00000100
Core 4 = 8 = 00001000
Core 5 = 16 = 00010000
Core 6 = 32 = 00100000
Core 7 = 64 = 01000000
Core 8 = 128 = 10000000
Just add the decimal values together for which core you want to use. 255 = All 8 cores.
All Cores = 255 = 11111111
Example Output:
C:\>PowerShell "Get-Process notepad++ | Select-Object ProcessorAffinity"
ProcessorAffinity
-----------------
255
C:\>PowerShell "$Process = Get-Process notepad++; $Process.ProcessorAffinity=13"
C:\>PowerShell "Get-Process notepad++ | Select-Object ProcessorAffinity"
ProcessorAffinity
-----------------
13
C:\>PowerShell "$Process = Get-Process notepad++; $Process.ProcessorAffinity=255"
C:\>
Source:
Here is a nicely detailed post on how to change a process's affinity:
http://www.energizedtech.com/2010/07/powershell-setting-processor-a.html

The accepted answer works, but only for the first process in the list. The solution to that in the comments does not work for me.
To change affinity of all processes with the same name use this:
Powershell "ForEach($PROCESS in GET-PROCESS processname) { $PROCESS.ProcessorAffinity=255}"
Where 255 is the mask as given in the accepted answer.

For anyone else looking for answers to this and not finding any, the solution I found was to use an app called WinAFC (or AffinityChanger). This is a partial GUI, partial command line app that allows you to specify profiles for certain executables, and will poll the process list for them. If it finds matching processes, it will change the affinity of those processes according to the settings in the loaded profile.
There is some documentation here: http://affinitychanger.sourceforge.net/
For my purposes, I created a profile that looked like this:
TestMode = 0
TimeInterval = 1
*\convert.exe := PAIR0+PAIR1
This profile sets any convert.exe process to use the first two CPU core pairs (CPU0, CPU1, CPU2, and CPU3), polling every second. TestMode is a toggle that allows you to see if your profile is working without actually setting affinities.
Hope someone finds this useful!

If you really like enums, you can do it this way. ProcessorAffinity is an IntPtr, so it takes a little extra type casting.
[flags()] Enum Cores {
Core1 = 0x0001
Core2 = 0x0002
Core3 = 0x0004
Core4 = 0x0008
Core5 = 0x0010
Core6 = 0x0020
Core7 = 0x0040
Core8 = 0x0080
}
$a = get-process notepad
[cores][int]$a.Processoraffinity
Core1, Core2, Core3, Core4
$a.ProcessorAffinity = [int][cores]'core1,core2,core3,core4'

wmic process where name="some.exe" call setpriority ProcessIDLevel
I think these are the priority levels .You can also use PID instead of process name.

Related

Limiting the CPU utilization of a user in windows

I want to know if there is any way to limit the number of cpu usage by the user name in windows? For example, there are 8 cores and I want to limit the global cpu usage of a user to 6. So, he can not run more than 6 serial jobs (each use one core).
In Linux, that can be done via scripting. But I haven't see any similar thing even with powershell scripts. Does that mean, it can not be done?

The keyword for this is Affinity.
Affinity starts at 0 being first core.
Affinity is a bitmap
10000000 = first core
01000000 = second core
11000000 = first and second core
00100000 = third core
10100000 = first and third core
11100000 = first second and third core
function Set-Affinity([string]$Username,[int[]]$core){
[int]$affinty = 0
$core | %{ $affinty += [math]::pow(2,$_)}
get-process -IncludeUserName | ?{$_.UserName -eq $Username} | %{
$_.ProcessorAffinity = $affinty
}
}
Set-Affinity -username "TESTDOMAIN\TESTUSER" -core 0,1,2,3

BCM2835 gpio device tree raspberry pi

I was looking to modify my GPIO driver for raspberry pi using device tree support.
First there were 2 files:
I read the device tree file in /arc/arm/boot/dts/bcm2835.dts
and for gpio following section was present:
gpio: gpio {
compatible = "brcm,bcm2835-gpio";
reg = <0x7e200000 0xb4>;
/*
* The GPIO IP block is designed for 3 banks of GPIOs.
* Each bank has a GPIO interrupt for itself.
* There is an overall "any bank" interrupt.
* In order, these are GIC interrupts 17, 18, 19, 20.
* Since the BCM2835 only has 2 banks, the 2nd bank
* interrupt output appears to be mirrored onto the
* 3rd bank's interrupt signal.
* So, a bank0 interrupt shows up on 17, 20, and
* a bank1 interrupt shows up on 18, 19, 20!
*/
interrupts = <2 17>, <2 18>, <2 19>, <2 20>;
gpio-controller;
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
};
From the references on the internet The reg = 0x7e200000 is understood but What is 0xb4.
I read the device tree file in /arch/arm/boot/dts/bcm2835-rpi-b.dts
and for gpio following section was present:
/ {
compatible = "raspberrypi,model-b", "brcm,bcm2835";
model = "Raspberry Pi Model B";
memory {
reg = <0 0x10000000>;
};
leds {
compatible = "gpio-leds";
act {
label = "ACT";
gpios = <&gpio 16 1>;
default-state = "keep";
linux,default-trigger = "heartbeat";
};
};
};
&gpio {
pinctrl-names = "default";
pinctrl-0 = <&alt0 &alt3>;
alt0: alt0 {
brcm,pins = <0 1 2 3 4 5 6 7 8 9 10 11 14 15 40 45>;
brcm,function = <4>; /* alt0 */
};
alt3: alt3 {
brcm,pins = <48 49 50 51 52 53>;
brcm,function = <7>; /* alt3 */
};
};
So, Which one of the dts files should I use, and how to read and interpret those key value pairs, for eg: what is pinctrl. and how does this approach affect on my code.
I know I am asking a lot of stuff here, but this is new and looks interesting and I want to modify my driver using this approach. Please help.
PS: I have made a driver using the standard udev support. So dynamic device node creation is managed.
I am not using platform model.

1.
From the references on the internet The reg = 0x7e200000 is understood but What is 0xb4.
reg = <0x7e200000 0xb4>
Here 0xb4 refers to the length of the register.
"reg : Address and length of the register set for the device"
You can probably checkout this pdf for better info
http://events.linuxfoundation.org/sites/events/files/slides/petazzoni-device-tree-dummies.pdf
2.
So, Which one of the dts files should I use, and how to read and interpret those key value pairs,
I will split the question into two parts. For reading key value pairs:
Every Device tree entry would have an associated binding file that describes how you read the key value pairs.
For example http://lxr.free-electrons.com/source/Documentation/devicetree/bindings/arm/bcm/brcm,bcm11351-cpu-method.txt . You can get the corresponding details .
Regarding which dts files should I use:
Now if u have noticed bcm2835.dtsi. is not a dts file but a dtsi file.
http://lxr.free-electrons.com/source/arch/arm/boot/dts/bcm2835.dtsi
dtsi files can be included into other dts or dtsi files just like we include other libraries like conio.h. or stdio.h in our C code.
Here bcm2835-rpi-b.dts is a dts file and if u notice the file here http://lxr.free-electrons.com/source/arch/arm/boot/dts/bcm2835-rpi-b.dts
it includes the following:
/include/ "bcm2835.dtsi"
This means that all the dt entries in bcm2835.dtsi is imported into bcm2835-rpi-b.dts.
You can choose to leave the nodes as is or modify the properties in rpi-b-dts, but the final entry made in dts file will be the one reflected in the dtb.
3.
for eg: what is pinctrl. and how does this approach affect on my code.
Pinctrl is framework provided in the kernel for accessing PIN's here gpio's. You can probably checkout the documentation used https://www.kernel.org/doc/Documentation/pinctrl.txt

Analyzing readdir() performance

It's bothering me that linux takes so long to list all files for huge directories, so I created a little test script that recursively lists all files of a directory:
#include <stdio.h>
#include <dirent.h>
int list(char *path) {
int i = 0;
DIR *dir = opendir(path);
struct dirent *entry;
char new_path[1024];
while(entry = readdir(dir)) {
if (entry->d_type == DT_DIR) {
if (entry->d_name[0] == '.')
continue;
strcpy(new_path, path);
strcat(new_path, "/");
strcat(new_path, entry->d_name);
i += list(new_path);
}
else
i++;
}
closedir(dir);
return i;
}
int main() {
char *path = "/home";
printf("%i\n", list(path));
return 0;
When compiling it with gcc -O3, the program runs about 15 sec (I ran the programm a few times and it's approximately constant, so the fs cache should not play a role here):
$ /usr/bin/time -f "%CC %DD %EE %FF %II %KK %MM %OO %PP %RR %SS %UU %WW %XX %ZZ %cc %ee %kk %pp %rr %ss %tt %ww %xx" ./a.out
./a.outC 0D 0:14.39E 0F 0I 0K 548M 0O 2%P 178R 0.30S 0.01U 0W 0X 4096Z 7c 14.39e 0k 0p 0r 0s 0t 1692w 0x
So it spends about S=0.3sec in kernelspace and U=0.01sec in userspace and has 7+1692 context switches.
A context switch takes about 2000nsec * (7+1692) = 3.398msec [1]
However, there are more than 10sec left and I would like to find out what the program is doing in this time.
Are there any other tools to investigate what the program is doing all the time?
gprof just tells me the time for the (userspace) call graph and gcov does not list time spent in each line but only how often a time is executed...
[1] http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html

oprofile is a decent sampling profiler which can profile both user and kernel-mode code.
According to your numbers, however, approximately 14.5 seconds of the time is spent asleep, which is not really registered well by oprofile. Perhaps what may be more useful would be ftrace combined with a reading of the kernel code. ftrace provides trace points in the kernel which can log a message and stack trace when they occur. The event that would seem most useful for determining why your process is sleeping would be the sched_switch event. I would recommend that you enable kernel-mode stacks and the sched_switch event, set a buffer large enough to capture the entire lifetime of your process, then run your process and stop tracing immediately after. By reviewing the trace, you will be able to see every time your process went to sleep, whether it was runnable or non-runnable, a high resolution time stamp, and a call stack indicating what put it to sleep.
ftrace is controlled through debugfs. On my system, this is mounted in /sys/kernel/debug, but yours may be different. Here is an example of what I would do to capture this information:
# Enable stack traces
echo "1" > /sys/kernel/debug/tracing/options/stacktrace
# Enable the sched_switch event
echo "1" > /sys/kernel/debug/tracing/events/sched/sched_switch/enable
# Make sure tracing is enabled
echo "1" > /sys/kernel/debug/tracing/tracing_on
# Run the program and disable tracing as quickly as possible
./your_program; echo "0" > /sys/kernel/debug/tracing/tracing_on
# Examine the trace
vi /sys/kernel/debug/tracing/trace
The resulting output will have lines which look like this:
# tracer: nop
#
# entries-in-buffer/entries-written: 22248/3703779 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..3 2113.437500: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=878 next_prio=120
<idle>-0 [000] d..3 2113.437531: <stack trace>
=> __schedule
=> schedule
=> schedule_preempt_disabled
=> cpu_startup_entry
=> rest_init
=> start_kernel
kworker/0:0-878 [000] d..3 2113.437836: sched_switch: prev_comm=kworker/0:0 prev_pid=878 prev_prio=120 prev_state=S ==> next_comm=your_program next_pid=898 next_prio=120
kworker/0:0-878 [000] d..3 2113.437866: <stack trace>
=> __schedule
=> schedule
=> worker_thread
=> kthread
=> ret_from_fork
The lines you will care about will be when your program appears as the prev_comm task, meaning the scheduler is switching away from your program to run something else. prev_state will indicate that your program was still runnable (R) or was blocked (S, U or some other letter, see the ftrace source). If blocked, you can examine the stack trace and the kernel source to figure out why.

VB6 RS232 commands not working

I have the following code:
MSCommProj.CommPort = 6
MSCommProj.RThreshold = 1
MSCommProj.Settings = "19200,N,8,1"
MSCommProj.InputLen = 0
MSCommProj.PortOpen = True
And it opens just fine and connects but when i try sending the command:
MSCommProj.Output = "21 8901 5057 31 0A" & Chr$(13)
and
MSCommProj.Output = "21 89 01 50 57 31 0A" & Chr$(13)
and
MSCommProj.Output = "3F 89 01 50 57 0A" & Chr$(13)
as instructed by the user manual, it does not come on.
Here is the pages in the manual that shows this. Maybe i am just doing it wrong?:

Are you sure that you're meant to be sending character data to the RS232 interface for that? Those look like binary sequences to me.
Rather than:
MSCommProj.Output = "3F 89 01 50 57 0A" & Chr$(13)
I'd be looking at transmitting the binary data thus:
MSCommProj.Output = chr$(63) & chr$(137) & chr$(1) & chr$(80) & chr$(87) & chr(10)
You'll note that there's no chr$(13) at the end, the spec doesn't call for that.
If you want to know what the conversions are for those hex values, start up the Windows calculator, change the view to scientific, switch to hex mode, enter the value, the switch to decimal mode.
Or you can download an ASCII table for this purpose. Or view one of my voluminous essays on the subject here.

You are required to send bytes given.
You instead send string representation of those.
Send actual bytes.
chr$(&h21) & chr$(&h89) & chr$(&h01) & chr$(&h50) etc.

It was because i did not use a cross-over cable... All the rs232 codes were correct. Blah.

Code Golf: Duplicate Character Removal in String

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
The challenge: The shortest code, by character count, that detects and removes duplicate characters in a String. Removal includes ALL instances of the duplicated character (so if you find 3 n's, all three have to go), and original character order needs to be preserved.
Example Input 1:
nbHHkRvrXbvkn
Example Output 1:
RrX
Example Input 2:
nbHHkRbvnrXbvkn
Example Output 2:
RrX
(the second example removes letters that occur three times; some solutions have failed to account for this)
(This is based on my other question where I needed the fastest way to do this in C#, but I think it makes good Code Golf across languages.)

LabVIEW 7.1
ONE character and that is the blue constant '1' in the block diagram.
I swear, the input was copy and paste ;-)
http://i25.tinypic.com/hvc4mp.png
http://i26.tinypic.com/5pnas.png

Perl
21 characters of perl, 31 to invoke, 36 total keystrokes (counting shift and final return):
perl -pe's/$1//gwhile/(.).*\1/'

Ruby — 61 53 51 56 35
61 chars, the ruler says. (Gives me an idea for another code golf...)
puts ((i=gets.split(''))-i.select{|c|i.to_s.count(c)<2}).join
+-------------------------------------------------------------------------+
|| | | | | | | | | | | | | | | |
|0 10 20 30 40 50 60 70 |
| |
+-------------------------------------------------------------------------+
gets.chars{|c|$><<c[$_.count(c)-1]}
... 35 by Nakilon

APL
23 characters:
(((1+ρx)-(ϕx)ιx)=xιx)/x
I'm an APL newbie (learned it yesterday), so be kind -- this is certainly not the most efficient way to do it. I'm ashamed I didn't beat Perl by very much.
Then again, maybe it says something when the most natural way for a newbie to solve this problem in APL was still more concise than any other solution in any language so far.

Python:
s=raw_input()
print filter(lambda c:s.count(c)<2,s)
This is a complete working program, reading from and writing to the console. The one-liner version can be directly used from the command line
python -c 's=raw_input();print filter(lambda c:s.count(c)<2,s)'

J (16 12 characters)
(~.{~[:I.1=#/.~)
Example:
(~.{~[:I.1=#/.~) 'nbHHkRvrXbvkn'
RrX
It only needs the parenthesis to be executed tacitly. If put in a verb, the actual code itself would be 14 characters.
There certainly are smarter ways to do this.
EDIT: The smarter way in question:
(~.#~1=#/.~) 'nbHHkRvrXbvkn'
RrX
12 characters, only 10 if set in a verb. I still hate the fact that it's going through the list twice, once to count (#/.) and another to return uniques (nub or ~.), but even nubcount, a standard verb in the 'misc' library does it twice.

Haskell
There's surely shorter ways to do this in Haskell, but:
Prelude Data.List> let h y=[x|x<-y,(<2).length$filter(==x)y]
Prelude Data.List> h "nbHHkRvrXbvkn"
"RrX"
Ignoring the let, since it's only required for function declarations in GHCi, we have h y=[x|x<-y,(<2).length$filter(==x)y], which is 37 characters (this ties the current "core" Python of "".join(c for c in s if s.count(c)<2), and it's virtually the same code anyway).
If you want to make a whole program out of it,
h y=[x|x<-y,(<2).length$filter(==x)y]
main=interact h
$ echo "nbHHkRvrXbvkn" | runghc tmp.hs
RrX
$ wc -c tmp.hs
54 tmp.hs
Or we can knock off one character this way:
main=interact(\y->[x|x<-y,(<2).length$filter(==x)y])
$ echo "nbHHkRvrXbvkn" | runghc tmp2.hs
RrX
$ wc -c tmp2.hs
53 tmp2.hs
It operates on all of stdin, not line-by-line, but that seems acceptable IMO.

C89 (106 characters)
This one uses a completely different method than my original answer. Interestingly, after writing it and then looking at another answer, I saw the methods were very similar. Credits to caf for coming up with this method before me.
b[256];l;x;main(c){while((c=getchar())>=0)b[c]=b[c]?1:--l;
for(;x-->l;)for(c=256;c;)b[--c]-x?0:putchar(c);}
On one line, it's 58+48 = 106 bytes.
C89 (173 characters)
This was my original answer. As said in the comments, it doesn't work too well...
#include<stdio.h>
main(l,s){char*b,*d;for(b=l=s=0;l==s;s+=fread(b+s,1,9,stdin))b=realloc(b,l+=9)
;d=b;for(l=0;l<s;++d)if(!memchr(b,*d,l)&!memchr(d+1,*d,s-l++-1))putchar(*d);}
On two lines, it's 17+1+78+77 = 173 bytes.

C#
65 Characters:
new String(h.Where(x=>h.IndexOf(x)==h.LastIndexOf(x)).ToArray());
67 Characters with reassignment:
h=new String(h.Where(x=>h.IndexOf(x)==h.LastIndexOf(x)).ToArray());

C#
new string(input.GroupBy(c => c).Where(g => g.Count() == 1).ToArray());
71 characters

PHP (136 characters)
<?PHP
function q($x){return $x<2;}echo implode(array_keys(array_filter(
array_count_values(str_split(stream_get_contents(STDIN))),'q')));
On one line, it's 5+1+65+65 = 136 bytes. Using PHP 5.3 you could save a few bytes making the function anonymous, but I can't test that now. Perhaps something like:
<?PHP
echo implode(array_keys(array_filter(array_count_values(str_split(
stream_get_contents(STDIN))),function($x){return $x<2;})));
That's 5+1+66+59 = 131 bytes.

another APL solution
As a dynamic function (18 charachters)
{(1+=/¨(ω∘∊¨ω))/ω}
line assuming that input is in variable x (16 characters):
(1+=/¨(x∘∊¨x))/x

VB.NET
For Each c In s : s = IIf(s.LastIndexOf(c) <> s.IndexOf(c), s.Replace(CStr(c), Nothing), s) : Next
Granted, VB is not the optimal language to try to save characters, but the line comes out to 98 characters.

PowerShell
61 characters. Where $s="nbHHkRvrXbvkn" and $a is the result.
$h=#{}
($c=[char[]]$s)|%{$h[$_]++}
$c|%{if($h[$_]-eq1){$a+=$_}}
Fully functioning parameterized script:
param($s)
$h=#{}
($c=[char[]]$s)|%{$h[$_]++}
$c|%{if($h[$_]-eq1){$a+=$_}}
$a

C: 83 89 93 99 101 characters
O(n2) time.
Limited to 999 characters.
Only works in 32-bit mode (due to not #include-ing <stdio.h> (costs 18 chars) making the return type of gets being interpreted as an int and chopping off half of the address bits).
Shows a friendly "warning: this program uses gets(), which is unsafe." on Macs.
.
main(){char s[999],*c=gets(s);for(;*c;c++)strchr(s,*c)-strrchr(s,*c)||putchar(*c);}
(and this similar 82-chars version takes input via the command line:
main(char*c,char**S){for(c=*++S;*c;c++)strchr(*S,*c)-strrchr(*S,*c)||putchar(*c);}
)

Golfscript(sym) - 15
.`{\{=}+,,(!}+,
+-------------------------------------------------------------------------+
|| | | | | | | | | | | | | | | |
|0 10 20 30 40 50 60 70 |
| |
+-------------------------------------------------------------------------+

Haskell
(just knocking a few characters off Mark Rushakoff's effort, I'd rather it was posted as a comment on his)
h y=[x|x<-y,[_]<-[filter(==x)y]]
which is better Haskell idiom but maybe harder to follow for non-Haskellers than this:
h y=[z|x<-y,[z]<-[filter(==x)y]]
Edit to add an explanation for hiena and others:
I'll assume you understand Mark's version, so I'll just cover the change. Mark's expression:
(<2).length $ filter (==x) y
filters y to get the list of elements that == x, finds the length of that list and makes sure it's less than two. (in fact it must be length one, but ==1 is longer than <2 ) My version:
[z] <- [filter(==x)y]
does the same filter, then puts the resulting list into a list as the only element. Now the arrow (meant to look like set inclusion!) says "for every element of the RHS list in turn, call that element [z]". [z] is the list containing the single element z, so the element "filter(==x)y" can only be called "[z]" if it contains exactly one element. Otherwise it gets discarded and is never used as a value of z. So the z's (which are returned on the left of the | in the list comprehension) are exactly the x's that make the filter return a list of length one.
That was my second version, my first version returns x instead of z - because they're the same anyway - and renames z to _ which is the Haskell symbol for "this value isn't going to be used so I'm not going to complicate my code by giving it a name".

Javascript 1.8
s.split('').filter(function (o,i,a) a.filter(function(p) o===p).length <2 ).join('');
or alternately- similar to the python example:
[s[c] for (c in s) if (s.split("").filter(function(p) s[c]===p).length <2)].join('');

TCL
123 chars. It might be possible to get it shorter, but this is good enough for me.
proc h {i {r {}}} {foreach c [split $i {}] {if {[llength [split $i $c]]==2} {set r $r$c}}
return $r}
puts [h [gets stdin]]

C
Full program in C, 141 bytes (counting newlines).
#include<stdio.h>
c,n[256],o,i=1;main(){for(;c-EOF;c=getchar())c-EOF?n[c]=n[c]?-1:o++:0;for(;i<o;i++)for(c=0;c<256;c++)n[c]-i?0:putchar(c);}

Scala
54 chars for the method body only, 66 with (statically typed) method declaration:
def s(s:String)=(""/:s)((a,b)=>if(s.filter(c=>c==b).size>1)a else a+b)

Ruby
63 chars.
puts (t=gets.split(//)).map{|i|t.count(i)>1?nil:i}.compact.join

VB.NET / LINQ
96 characters for complete working statement
Dim p=New String((From c In"nbHHkRvrXbvkn"Group c By c Into i=Count Where i=1 Select c).ToArray)
Complete working statement, with original string and the VB Specific "Pretty listing (reformatting of code" turned off, at 96 characters, non-working statement without original string at 84 characters.
(Please make sure your code works before answering. Thank you.)

C
(1st version: 112 characters; 2nd version: 107 characters)
k[256],o[100000],p,c;main(){while((c=getchar())!=-1)++k[o[p++]=c];for(c=0;c<p;c++)if(k[o[c]]==1)putchar(o[c]);}
That's
/* #include <stdio.h> */
/* int */ k[256], o[100000], p, c;
/* int */ main(/* void */) {
while((c=getchar()) != -1/*EOF*/) {
++k[o[p++] = /*(unsigned char)*/c];
}
for(c=0; c<p; c++) {
if(k[o[c]] == 1) {
putchar(o[c]);
}
}
/* return 0; */
}
Because getchar() returns int and putchar accepts int, the #include can 'safely' be removed.
Without the include, EOF is not defined, so I used -1 instead (and gained a char).
This program only works as intended for inputs with less than 100000 characters!
Version 2, with thanks to strager
107 characters
#ifdef NICE_LAYOUT
#include <stdio.h>
/* global variables are initialized to 0 */
int char_count[256]; /* k in the other layout */
int char_order[999999]; /* o ... */
int char_index; /* p */
int main(int ch_n_loop, char **dummy) /* c */
/* variable with 2 uses */
{
(void)dummy; /* make warning about unused variable go away */
while ((ch_n_loop = getchar()) >= 0) /* EOF is, by definition, negative */
{
++char_count[ ( char_order[char_index++] = ch_n_loop ) ];
/* assignment, and increment, inside the array index */
}
/* reuse ch_n_loop */
for (ch_n_loop = 0; ch_n_loop < char_index; ch_n_loop++) {
(char_count[char_order[ch_n_loop]] - 1) ? 0 : putchar(char_order[ch_n_loop]);
}
return 0;
}
#else
k[256],o[999999],p;main(c){while((c=getchar())>=0)++k[o[p++]=c];for(c=0;c<p;c++)k[o[c]]-1?0:putchar(o[c]);}
#endif

Javascript 1.6
s.match(/(.)(?=.*\1)/g).map(function(m){s=s.replace(RegExp(m,'g'),'')})
Shorter than the previously posted Javascript 1.8 solution (71 chars vs 85)

Assembler
Tested with WinXP DOS box (cmd.exe):
xchg cx,bp
std
mov al,2
rep stosb
inc cl
l0: ; to save a byte, I've encoded the instruction to exit the program into the
; low byte of the offset in the following instruction:
lea si,[di+01c3h]
push si
l1: mov dx,bp
mov ah,6
int 21h
jz l2
mov bl,al
shr byte ptr [di+bx],cl
jz l1
inc si
mov [si],bx
jmp l1
l2: pop si
l3: inc si
mov bl,[si]
cmp bl,bh
je l0+2
cmp [di+bx],cl
jne l3
mov dl,bl
mov ah,2
int 21h
jmp l3
Assembles to 53 bytes. Reads standard input and writes results to standard output, eg:
programname < input > output

PHP
118 characters actual code (plus 6 characters for the PHP block tag):
<?php
$s=trim(fgets(STDIN));$x='';while(strlen($s)){$t=str_replace($s[0],'',substr($s,1),$c);$x.=$c?'':$s[0];$s=$t;}echo$x;

C# (53 Characters)
Where s is your input string:
new string(s.Where(c=>s.Count(h=>h==c)<2).ToArray());
Or 59 with re-assignment:
var a=new string(s.Where(c=>s.Count(h=>h==c)<2).ToArray());

Haskell Pointfree
import Data.List
import Control.Monad
import Control.Arrow
main=interact$liftM2(\\)nub$ap(\\)nub
The whole program is 97 characters, but the real meat is just 23 characters. The rest is just imports and bringing the function into the IO monad. In ghci with the modules loaded it's just
(liftM2(\\)nub$ap(\\)nub) "nbHHkRvrXbvkn"
In even more ridiculous pointfree style (pointless style?):
main=interact$liftM2 ap liftM2 ap(\\)nub
It's a bit longer though at 26 chars for the function itself.

Shell/Coreutils, 37 Characters
fold -w1|sort|uniq -u|paste -s -d ''

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio