Why is Intel Pin not able to instrument open syscall? - linux-kernel

I am trying to build a pintool that should be able to instrument an open() syscall that targets a specific file/directory and replace the file path argument with another value.
For example, here is a very simple code that I want to instrument:
#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
using namespace std;
int main(int argc, char **argv)
{
int i = open("/home/preet_derasari/important.txt", O_RDONLY);
cout << "fid: " << i << endl;
}
In this example I want Pin to change the file path from /home/preet_derasari/important.txt to /home/preet_derasari/dummy.txt.
In order to do this I wrote a very simple pintool after referring to some example pintools and Pin APIs:
#include "pin.H"
#include <iostream>
#include <fstream>
#include <syscall.h>
#include <string>
#include <cstring>   // for strcmp()
using namespace std;
INT32 Usage()
{
cout << "This tool prints out the number of dynamically executed " << endl
<< "instructions, basic blocks and threads in the application." << endl
<< endl;
cout << KNOB_BASE::StringKnobSummary() << endl;
return -1;
}
void SyscallEntry(THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, void *v)
{
ADDRINT sysNum = PIN_GetSyscallNumber(ctxt, std);
cout << "entered syscall: " << sysNum << endl;
if(sysNum == SYS_open)
{
cout << "open encountered!" << endl;
char *path = (char *)PIN_GetSyscallArgument(ctxt, std, 0);
cout << "Original File Path: " << path << endl;
int match = strcmp(path, "/home/preet_derasari/important.txt");
if(!match)
{
// The replacement path must outlive this callback, because the kernel
// reads it only when the (modified) syscall actually executes.
static const string pathDummy = "/home/preet_derasari/dummy.txt";
PIN_SetSyscallArgument(ctxt, std, 0, (ADDRINT)pathDummy.c_str());
cout << "Dummy File Path: " << pathDummy << endl;
}
}
}
int main(int argc, char* argv[])
{
cout << "Open Syscall Value: " << SYS_open << endl;
if (PIN_Init(argc, argv))
{
return Usage();
}
cout << "===============================================" << endl;
cout << "This application is instrumented by MyPinTool" << endl;
cout << "===============================================" << endl;
PIN_AddSyscallEntryFunction(SyscallEntry, 0);
// Start the program, never returns
PIN_StartProgram();
return 0;
}
I run the pintool with this command: ../../../pin -t obj-intel64/MY_pin.so -- test, where MY_pin.so is the pintool shared object and test is the sample program shown above.
The output just baffles me because Pin is instrumenting all syscalls except open:
Open Syscall Value: 2
===============================================
This application is instrumented by MyPinTool
===============================================
entered syscall: 12
entered syscall: 158
entered syscall: 21
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 257
entered syscall: 0
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 5
entered syscall: 9
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 3
entered syscall: 158
entered syscall: 10
entered syscall: 10
entered syscall: 10
entered syscall: 11
entered syscall: 12
entered syscall: 12
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 3
As you can see, Pin instruments all syscalls except open, i.e., syscall number 2 on x86_64.
Another interesting observation is that the cout from my test program (cout << "fid: " << i << endl;) never appears, which makes me wonder whether Pin is doing something odd with the open syscall.
Specifications:
Pin version - pin-3.21-98484-e7cd811fd
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
ISA: x86_64
CPU: AMD Ryzen 7 1700X Eight-Core Processor
Can someone please help me understand why this is happening?

strace cat foo shows you that programs don't use the old open(2) system call anymore:
...
openat(AT_FDCWD, "foo", O_RDONLY) = 3
...
__NR_openat is 257, which your pintool observed 3 times. Apparently even the open() libc wrapper function internally uses the openat Linux system call. (The __NR_open = 2 system call does still work; the kernel keeps code to forward its args to the current implementation. I don't know which is more efficient; perhaps it just sets up an AT_FDCWD arg and calls sys_openat(), which has to decode it again, just as glibc does in user space.)
The open(2) man page also documents openat(2).
The dirfd argument is used in conjunction with the pathname argument as follows:
If the pathname given in pathname is absolute, then dirfd is ignored.
If the pathname given in pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open()).
...
openat / linkat and so on, when used with an fd from open(O_DIRECTORY), let programs like find avoid TOCTOU races, and/or let multi-threaded programs avoid having to actually chdir (because there's only one CWD per process, not per thread.)
Using them with AT_FDCWD has no advantage or disadvantage vs. old-style open(2).
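If the goal is still to redirect the path from the pintool, a minimal sketch of the syscall-entry callback that also handles openat (an assumption based on the observation above, not verified against a particular Pin version) would check both syscall numbers and pick the right argument index, since openat carries the pathname as its second argument:

void SyscallEntry(THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, void *v)
{
    ADDRINT sysNum = PIN_GetSyscallNumber(ctxt, std);

    // open(path, ...) carries the path in argument 0;
    // openat(dirfd, path, ...) carries it in argument 1.
    int pathArg = -1;
    if (sysNum == SYS_open)   pathArg = 0;
    if (sysNum == SYS_openat) pathArg = 1;
    if (pathArg < 0) return;

    const char *path = (const char *)PIN_GetSyscallArgument(ctxt, std, pathArg);
    if (path && strcmp(path, "/home/preet_derasari/important.txt") == 0)
    {
        // Static storage so the pointer is still valid when the syscall runs.
        static const string pathDummy = "/home/preet_derasari/dummy.txt";
        PIN_SetSyscallArgument(ctxt, std, pathArg, (ADDRINT)pathDummy.c_str());
    }
}

The rest of the tool (PIN_Init, PIN_AddSyscallEntryFunction, PIN_StartProgram) stays exactly as in the question.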

Related

stdio redirect "fails" when calling waveInOpen, why?

Here's my basic program; it should compile fairly easily with Visual Studio (even Express).
// ConsoleApplication1.cpp : This file contains the 'main' function. Program execution begins and ends there.
//
#include "stdafx.h"
#include <iostream>
#include <windows.h>
#include <mmsystem.h>
#include <stdint.h>
#pragma comment(lib, "winmm.lib")
HWAVEIN hWaveIn;
WAVEFORMATEX WaveFormat;
WAVEHDR WaveHeader;
typedef union
{
uint32_t u32;
struct
{
int16_t iLeft;
int16_t iRight;
};
} audiosample16_t;
#define AUDIORATE (44100*4)
#define SECONDS (13)
audiosample16_t MyBuffer[AUDIORATE*SECONDS];
int _tmain(int argc, _TCHAR* argv[])
{
std::cout << "Hello World!\n";
UINT WaveId = 0;
WaveFormat.wFormatTag = WAVE_FORMAT_PCM; // simple, uncompressed format
WaveFormat.nChannels = 2; // 1=mono, 2=stereo
WaveFormat.nSamplesPerSec = 44100;
WaveFormat.wBitsPerSample = 16; // 16 for high quality, 8 for telephone-grade
WaveFormat.nBlockAlign = WaveFormat.nChannels*WaveFormat.wBitsPerSample/8;
WaveFormat.nAvgBytesPerSec = (WaveFormat.nSamplesPerSec)*(WaveFormat.nChannels)*(WaveFormat.wBitsPerSample)/8;
WaveFormat.cbSize=0;
WaveHeader.lpData = (LPSTR)MyBuffer;
WaveHeader.dwBufferLength = sizeof(MyBuffer);
WaveHeader.dwFlags = 0;
std::cout << "Hello World!\n";
//std::cout << std::flush;
HRESULT hr;
if(argc>1)
hr= waveInOpen(&hWaveIn,WaveId,&WaveFormat,0,0,CALLBACK_NULL);
std::cout << "Hello World!\n";
std::cout << "Hello World!\n";
//std::cout << std::flush;
return 0;
}
If you call it from the command line with no arguments, everything prints out fine (several 'Hello World!'s). If you redirect this to a file (myprog.exe > blah.txt), again, everything works fine and several lines of 'Hello World!' end up in the file as expected.
HOWEVER, if you pass an argument (so that waveInOpen is called), nothing is redirected to the file; the file is empty. If you don't redirect the output, it prints to the command prompt just fine.
UNLESS you uncomment the std::flush lines; then the file isn't empty and everything works fine.
What the heck is going on under the hood that's causing that? Shouldn't stdout be flushed on exit and piped to the file no matter what? What is the waveInOpen() call doing that screws up the stdio buffering like that?
FWIW, this came to light because we're calling this program from TCL and Python to do audio quality measurements on an attached product and nothing was being read back, even though it would print out fine when run from the command line (and not redirected).
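Since the question already establishes that the std::flush lines make the redirection work, here is a minimal, self-contained sketch of that workaround (flushing explicitly, or disabling stdio buffering up front); the waveInOpen call itself is omitted and only indicated by a comment:

#include <cstdio>
#include <iostream>

int main()
{
    // Option 1: disable stdout buffering entirely at program start.
    std::setvbuf(stdout, nullptr, _IONBF, 0);

    std::cout << "Hello World!\n";

    // Option 2: flush explicitly before any call that might interfere
    // with the usual flush-at-exit behaviour.
    std::cout.flush();

    // ... call waveInOpen(...) here in the real program ...

    std::cout << "Hello World!" << std::endl;  // std::endl flushes as well
    return 0;
}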

How to write a shell script (any shell language) that interacts with a program and passes arguments? [duplicate]

This question already has answers here:
Passing arguments to an interactive program non-interactively
(5 answers)
Closed 4 years ago.
#include <iostream>
using namespace std;
int main(void){
int number = 0;
cout << "Please enter a number: ";
cin >> number ;
cout << "the number you enter is " << number << endl;
return 0;
}
This is my program; it reads a number from standard input and prints it back out.
#! /bin/bash
number=1
echo "1" | ./a.out #>> result.txt
This is my bash script that is trying to pass an argument to the program.
Please enter a number: the number you enter is 1
This is result.txt. I want it to look more like this:
Please enter a number: 1
the number you enter is 1
How should I fix it so that the script passes the input more like a human would?
Also, is bash really a good scripting language for this kind of work, or are there better options? (Google says Tcl is better than bash for driving interactive programs.)
Unless I'm misunderstanding the problem, if you'd like to pass parameters to your C++ program, you should add argc and argv to your main function.
Your program.cpp:
#include <iostream>
using namespace std;
int main(int argc, char *argv[]) {
cout << "your number is " << argv[1] << endl;
return 0;
}
The shell script (send-arg.sh):
#!/bin/sh
./program 42
Output:
./send-arg.sh
your number is 42
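If the goal is instead to keep the original program reading from stdin and make the captured transcript look as if a human had typed the value (the Tcl route the question hints at), one common approach is an expect script. A minimal sketch, assuming the compiled program is ./a.out:

#!/usr/bin/expect -f
# Drive ./a.out as if a person were typing at the prompt.
spawn ./a.out
expect "Please enter a number: "
send "1\r"
expect eof

Because expect echoes both the prompt and what it "types", the logged session reads "Please enter a number: 1" followed by the program's reply, which matches the desired result.txt above.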

Linux Character device: userspace (cat) does not stop reading

I made a simple character device and created a node for communicating with it. When I cat /dev/mychrdev, userspace invokes my character device's read function over and over again. The relevant code is:
static ssize_t useless_read(struct file *filp, char __user *buff, size_t count, loff_t *offp) {
int ret, read_count;
read_count = sprintf(message, "Major number: %d\n", MAJOR(useless_chr_dev->cdev_num));
ret = copy_to_user(buff, message, read_count);
if (ret == 0) {
printk(KERN_INFO "userspace read success");
return read_count;
} else
return -EFAULT;
}
After I initiate a read from the terminal, dmesg is filled with:
[ 8966.299554] userspace read success
and cat is giving out lots of
Major number: 242
Major number: 242
Major number: 242
Major number: 242
Why isn't it stopping?
cat does not call your read function just once; it keeps calling read() until you return 0 (i.e., no more bytes to read).
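A minimal sketch of a read handler that eventually reports EOF, using the offset pointer to remember how much has already been handed out (it keeps the names from the excerpt above, uses a local buffer instead of the global message, and is untested):

static ssize_t useless_read(struct file *filp, char __user *buff,
                            size_t count, loff_t *offp)
{
    char message[64];
    int len = sprintf(message, "Major number: %d\n",
                      MAJOR(useless_chr_dev->cdev_num));

    if (*offp >= len)            /* everything was read already: report EOF */
        return 0;

    if (count > len - *offp)     /* clamp to what is left */
        count = len - *offp;

    if (copy_to_user(buff, message + *offp, count))
        return -EFAULT;

    *offp += count;              /* remember how far the reader has got */
    return count;
}

The kernel also ships a helper, simple_read_from_buffer(), that implements this pattern.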

How to determine if there are bytes available to be read from boost:asio:serial_port

I am trying to use Boost to communicate serially between my desktop and an Arduino. On the Arduino side, I can ask the serial port whether there are bytes available before performing a read.
I am having trouble finding the equivalent for boost::asio::serial_port.
While Boost.Asio does not provide direct support for this, one can still accomplish it by using the serial port's native_handle() with system-specific calls. Consult the system's documentation to determine how to query the number of bytes ready to be read; it is typically ioctl(..., FIONREAD, ...) on Linux and ClearCommError() on Windows.
Here is a complete minimal example that uses system-specific calls to get the number of bytes available. The example program will keep querying the serial port until at least 20 bytes are available, at which point it will read all but 5 bytes:
#include <iostream>
#include <vector>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
/// @brief Returns the number of bytes available for reading from a serial
/// port without blocking.
std::size_t get_bytes_available(
boost::asio::serial_port& serial_port,
boost::system::error_code& error)
{
error = boost::system::error_code();
int value = 0;
#if defined(BOOST_ASIO_WINDOWS) || defined(__CYGWIN__)
COMSTAT status;
if (0 != ::ClearCommError(serial_port.lowest_layer().native_handle(),
NULL, &status))
{
value = status.cbInQue;
}
// On error, set the error code.
else
{
error = boost::system::error_code(::GetLastError(),
boost::asio::error::get_system_category());
}
#else // defined(BOOST_ASIO_WINDOWS) || defined(__CYGWIN__)
// On error (ioctl returns -1), set the error code.
if (-1 == ::ioctl(serial_port.lowest_layer().native_handle(),
FIONREAD, &value))
{
error = boost::system::error_code(errno,
boost::asio::error::get_system_category());
}
#endif // defined(BOOST_ASIO_WINDOWS) || defined(__CYGWIN__)
return error ? static_cast<std::size_t>(0)
: static_cast<size_t>(value);
}
/// @brief Returns the number of bytes available for reading from a serial
/// port without blocking. Throws on error.
std::size_t get_bytes_available(boost::asio::serial_port& serial_port)
{
boost::system::error_code error;
std::size_t bytes_available = get_bytes_available(serial_port, error);
if (error)
{
boost::throw_exception((boost::system::system_error(error)));
}
return bytes_available;
}
int main(int argc, char* argv[])
{
if (argc < 2)
{
std::cerr << "Usage: " << argv[0] << " <device_name>" << std::endl;
return 1;
}
// Create all I/O objects.
boost::asio::io_service io_service;
boost::asio::serial_port serial_port(io_service, argv[1]);
// Continue querying the serial port until at least 20 bytes are available
// to be read.
std::size_t bytes_available = 0;
while (bytes_available < 20)
{
bytes_available = get_bytes_available(serial_port);
std::cout << "available: " << bytes_available << std::endl;
boost::this_thread::sleep_for(::boost::chrono::seconds(3));
}
// Read all but 5 available bytes.
std::vector<char> buffer(bytes_available - 5);
std::size_t bytes_transferred =
read(serial_port, boost::asio::buffer(buffer));
bytes_available = get_bytes_available(serial_port);
// Print results.
std::cout << "Read " << bytes_transferred << " bytes\n";
std::cout.write(&buffer[0], bytes_transferred);
std::cout << "\navailable: " << bytes_available << std::endl;
}
Create virtual serial ports with socat:
$ socat -d -d PTY: PTY
2015/02/01 21:12:31 socat[3056] N PTY is /dev/pts/2
2015/02/01 21:12:31 socat[3056] N PTY is /dev/pts/3
2015/02/01 21:12:31 socat[3056] N starting data transfer loop with FDs [3,3] and [5,5]
After starting the program in one terminal, I write to /dev/pts/3 in another terminal:
$ echo -n "This is" > /dev/pts/3
$ echo -n " an example" > /dev/pts/3
$ echo -n " with asio." > /dev/pts/3
And the resulting output from the program:
$ ./a.out /dev/pts/2
available: 0
available: 7
available: 18
available: 29
Read 24 bytes
This is an example with
available: 5
I don't know of such a thing in asio, but as comments above have already stated, you don't really need it. I have an example of how to use boost asio serial at:
https://github.com/cdesjardins/ComBomb/blob/master/TargetConnection/TgtSerialConnection.cpp
It uses async_read_some to fill a buffer with serial data; the buffered data is then queued up for other parts of the program to process.
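For reference, a minimal sketch of that async_read_some pattern (assumptions: a plain io_service run loop, the device path given on the command line, and a handler that simply prints what arrives and re-arms the read):

#include <boost/asio.hpp>
#include <iostream>

boost::asio::io_service io_service;
boost::asio::serial_port port(io_service);
char buffer[512];

void start_read()
{
    port.async_read_some(boost::asio::buffer(buffer),
        [](const boost::system::error_code& ec, std::size_t n)
        {
            if (ec) return;                 // port closed or error
            std::cout.write(buffer, n);     // hand the data off
            start_read();                   // queue the next read
        });
}

int main(int argc, char* argv[])
{
    if (argc < 2) return 1;
    port.open(argv[1]);
    start_read();
    io_service.run();   // all processing happens in the handler above
}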

Different behavior of boost::serialization of strings on text archive

I'm having an issue serializing a std::string with boost::serialization on a text_oarchive. AFAICT, I have two identical pieces of code that behave differently in two different programs.
This is the program that I believe is behaving correctly:
#include <iostream>
#include <string>
#include <sstream>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
template <typename T>
void serialize_deserialize(const T & src, T & dst)
{
std::string serialized_data_str;
std::cout << "original data: " << src << std::endl;
std::ostringstream archive_ostream;
boost::archive::text_oarchive oarchive(archive_ostream);
oarchive << src;
serialized_data_str = archive_ostream.str();
std::cout << "serialized data: " << serialized_data_str << std::endl;
std::istringstream archive_istream(serialized_data_str);
boost::archive::text_iarchive iarchive(archive_istream);
iarchive >> dst;
}
int main()
{
std::string archived_data_str = "abcd";
std::string restored_data_str;
serialize_deserialize<std::string>(archived_data_str, restored_data_str);
std::cout << "restored data: " << restored_data_str << std::endl;
return 0;
}
And this is its output:
original data: abcd
serialized data: 22 serialization::archive 10 4 abcd
restored data: abcd
(You can compile it with: g++ boost-serialization-string.cpp -o boost-serialization-string -lboost_serialization)
This one, on the other hand, is an excerpt of the program I'm writing (derived from boost_asio/example/serialization/connection.hpp), and it ends up serializing the std::string data as one number per character:
/// Asynchronously write a data structure to the socket.
template <typename T, typename Handler>
void async_write(const T& t, Handler handler)
{
// Serialize the data first so we know how large it is.
std::cout << "original data: " << t << std::endl;
std::ostringstream archive_stream;
boost::archive::text_oarchive archive(archive_stream);
archive << t;
outbound_data_ = archive_stream.str();
std::cout << "serialized data: " << outbound_data_ << std::endl;
[...]
And this is an excerpt of its output:
original data: abcd
serialized data: 22 serialization::archive 10 5 97 98 99 100 0
The version (10) is the same, right? So that should be the proof that I'm using the same serialization library in both programs.
However, I really can't figure out what's going on here. I've been trying to solve this puzzle for almost an entire work day now, and I'm out of ideas.
For anyone that may want to reproduce this result, it should be sufficient to download the Boost serialization example, add the following line
connection_.async_write("abcd", boost::bind(&client::handle_write, this, boost::asio::placeholders::error));
at line 50 of client.cpp, add the following member function in client.cpp
/// Handle completion of a write operation.
void handle_write(const boost::system::error_code& e)
{
// Nothing to do. The socket will be closed automatically when the last
// reference to the connection object goes away.
}
add this cout:
std::cout << "serialized data: " << outbound_data_ << std::endl;
at connection.hpp:59
and compile with:
g++ -O0 -g3 client.cpp -o client -lboost_serialization -lboost_system
g++ -O0 -g3 server.cpp -o server -lboost_serialization -lboost_system
I'm using g++ 4.8.1 under Ubuntu 13.04 64bit with Boost 1.53
Any help would be greatly appreciated.
P.S. I'm posting this because the deserialization of the std::strings isn't working at all! :)
I see two possible causes of this behavior.
1. The compiler does not convert "abcd" from const char * to std::string, so the serialization library handles it as an array of "bytes" rather than as an ASCII string. Changing the call to connection_.async_write(std::string("abcd"), boost::bind(&client::handle_write, this, boost::asio::placeholders::error)); should fix the problem.
2. Possibly, the string type passed as the t argument of the async_write template method is not std::string but std::wstring, and it is serialized not as an ASCII string ("abcd") but as an unsigned short vector, where 97 98 99 100 is the decimal representation of the ASCII characters a, b, c and d.
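A minimal sketch that illustrates the first cause in isolation (assumptions: only Boost.Serialization is involved; the string literal is deliberately kept as a char array so the archive writes one number per element, while an explicit std::string is written as length plus text):

#include <iostream>
#include <sstream>
#include <string>
#include <boost/archive/text_oarchive.hpp>

int main()
{
    std::ostringstream as_array, as_string;
    {
        boost::archive::text_oarchive oa(as_array);
        const char raw[] = "abcd";        // deduced as char[5], including '\0'
        oa << raw;                        // element count + one number per char
    }
    {
        boost::archive::text_oarchive oa(as_string);
        const std::string str = "abcd";   // explicit std::string
        oa << str;                        // length + the text itself
    }
    std::cout << "char array:  " << as_array.str() << '\n';
    std::cout << "std::string: " << as_string.str() << '\n';
}

Comparing the two printed archives should show the same difference as the two programs above, which is why wrapping the literal in std::string("abcd") at the async_write call site changes the serialized form.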
