Count lines in pipe stream without consume - c++11

I open a pipe stream with given cmd command:
FILE* fp = popen(cmd.c_str(), "r");
How to count its lines without consume?
I tried:
char* line = NULL;
size_t len = 0;
unsigned int lines = 0;
while(getline(&line, &len, fp) != -1){
++lines;
}
But it consumes fp pipe stream.

I guess you are on Linux or some other POSIX system.
You basically cannot process the data from a pipe(7) (internally used by popen(3) ...) without consuming it, since pipes are non-seekable (lseek(2) would fail with ESPIPE, mmap(2) would fail with EACCESS)
You could either redirect the command to some temporary file (using lower level fork,dup2,execve syscalls(2), as explained in Advanced Linux Programming) then process that file and rewind it (and/or resend it elsewhere) or read all the data from the pipe into memory (so the available memory is a limiting factor).

Related

How to replace read function for procfs entry that returned EOF and byte count read both?

I am working on updating our kernel drivers to work with linux kernel 4.4.0 on Ubuntu 16.0.4. The drivers last worked with linux kernel 3.9.2.
In one of the modules, we have a procfs entries created to read/write the on-board fan monitoring values. Fan monitoring is used to read/write the CPU or GPU temperature/modulation,etc. values.
The module is using the following api to create procfs entries:
struct proc_dir_entry *create_proc_entry(const char *name, umode_t
mode,struct proc_dir_entry *parent);
Something like:
struct proc_dir_entry * proc_entry =
create_proc_entry("fmon_gpu_temp",0644,proc_dir);
proc_entry->read_proc = read_proc;
proc_entry->write_proc = write_proc;
Now, the read_proc is implemented something in this way:
static int read_value(char *buf, char **start, off_t offset, int count, int *eof, void *data) {
int len = 0;
int idx = (int)data;
if(idx == TEMP_FANCTL)
len = sprintf (buf, "%d.%02d\n", fmon_readings[idx] / TEMP_SAMPLES,
fmon_readings[idx] % TEMP_SAMPLES * 100 / TEMP_SAMPLES);
else if(idx == TEMP_CPU) {
int i;
len = sprintf (buf, "%d", fmon_readings[idx]);
for( i=0; i < FCTL_MAX_CPUS && fmon_cpu_temps[i]; i++ ) {
len += sprintf (buf+len, " CPU%d=%d",i,fmon_cpu_temps[i]);
}
len += sprintf (buf+len, "\n");
}
else if(idx >= 0 && idx < READINGS_MAX)
len = sprintf (buf, "%d\n", fmon_readings[idx]);
*eof = 1;
return len;
}
This read function definitely assumes that the user has provided enough buffer space to store the temperature value. This is correctly handled in userspace program. Also, for every call to this function the read value is in totality and therefore there is no support/need for subsequent reads for same temperature value.
Plus, if I use "cat" program on this procfs entry from shell, the 'cat' program correctly displays the value. This is supported, I think, by the setting of EOF to true and returning read bytes count.
New linux kernels do not support this API anymore.
My question is:
How can I change this API to new procfs API structure keeping the functionality same as: every read should return the value, program 'cat' should also work fine and not go into infinite loop ?
The primary user interface for read files on Linux is read(2). Its pair in kernel space is .read function in struct file_operations.
Every other mechanism for read file in kernel space (read_proc, seq_file, etc.) is actually an (parametrized) implementation of .read function.
The only way for kernel to return EOF indicator to user space is returning 0 as number of bytes read.
Even read_proc implementation you have for 3.9 kernel actually implements eof flag as returning 0 on next invocation. And cat actually perfoms the second invocation of read for find that file is end.
(Moreover, cat performs more than 2 invocations of read: first with 1 as count, second with count equal to page size minus 1, and the last with remaining count.)
The simplest way for "one-shot" read implementation is using seq_file in single_open() mode.

win32 using while loop with ReadFile/WriteFile in C to write from one program to a new one

I've written a program that reads text from one file and copies it to a new file. Using a while loop and the ReadFile/Writefile functions, my program works...but my program won't stop running unless I force stop it. I'm guessing that I'm not closing my handles properly or that my while loop may be set up wrong. Once I force stop my program, the file is successfully copied over to the new location with a new name.
int n = 0;
while(n=ReadFile(hFileSource, buffer, 23, &dwBytesRead, NULL)){
WriteFile(hFileNew, buffer, dwBytesRead, &dwBytesWritten, NULL);
}
CloseHandle(hFileSource);
CloseHandle(hFileNew);
return 0;
You're not correctly testing for the end-of-file. ReadFile doesn't return failure for EOF, it returns success but with 0 bytes read. To correctly check for EOF:
while (ReadFile(hFileSource, buffer, 23, &dwBytesRead, NULL))
{
if (dwBytesRead == 0)
break;
// write data etc
}
Is there any reason you're only reading/writing 23 bytes at a time? This will be rather inefficient.

How Should I read a File line-by-line in C?

I'd like to read a file line-by-line. I have fgets() working okay, but am not sure what to do if a line is longer than the buffer sizes I've passed to fgets()? And furthermore, since fgets() doesn't seem to be Unicode-aware, and I want to allow UTF-8 files, it might miss line endings and read the whole file, no?
Then I thought I'd use getline(). However, I'm on Mac OS X, and while getline() is specified in /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/usr/include/stdio.h, it's not in /usr/include/stdio, so gcc doesn't find it in the shell. And it's not particularly portable, obviously, and I'd like the library I'm developing to be generally useful.
So what's the best way to read a file line-by-line in C?
First of all, it's very unlikely that you need to worry about non-standard line terminators like U+2028. Normal text files are not expected to contain them, and the very overwhelming majority of all existing software that reads normal text files doesn't support them. You mention getline() which is available in glibc but not in MacOS's libc, and it would surprise me if getline() did support such fancy line terminators. It's almost a certainly that you can get away with just supporting LF (U+000A) and maybe also CR+LF (U+000D U+000A). To do that, you don't need to care about UTF-8. That's the beauty of UTF-8's ASCII compatibility and is by design.
As for supporting lines that are longer than the buffer you pass to fgets(), you can do this with a little extra logic around fgets. In pseudocode:
while true {
fgets(buffer, size, stream);
dynamically_allocated_string = strdup(buffer);
while the last char (before the terminating NUL) in the buffer is not '\n' {
concatenate the contents of buffer to the dynamically allocated string
/* the current line is not finished. read more of it */
fgets(buffer, size, stream);
}
process the whole line, as found in the dynamically allocated string
}
But again, I think you will find that there's really quite a lot of software out there that simply doesn't bother with that, from software that parses system config files like /etc/passwd to (some) scripting languages. Depending on your use case, it may very well be good enough to use a "big enough" buffer (e.g. 4096 bytes) and declare that you don't support lines longer than that. You can even call it a security feature (a line length limit is protection against resource exhaustion attacks from a crafted input file).
Based on this answer, here's what I've come up with:
#define LINE_BUF_SIZE 1024
char * getline_from(FILE *fp) {
char * line = malloc(LINE_BUF_SIZE), * linep = line;
size_t lenmax = LINE_BUF_SIZE, len = lenmax;
int c;
if(line == NULL)
return NULL;
for(;;) {
c = fgetc(fp);
if(c == EOF)
break;
if(--len == 0) {
len = lenmax;
char * linen = realloc(linep, lenmax *= 2);
if(linen == NULL) {
// Fail.
free(linep);
return NULL;
}
line = linen + (line - linep);
linep = linen;
}
if((*line++ = c) == '\n')
break;
}
*line = '\0';
return linep;
}
To read stdin:
char *line;
while ( line = getline_from(stdin) ) {
// do stuff
free(line);
}
To read some other file, I first open it with fopen():
FILE *fp;
fp = fopen ( filename, "rb" );
if (!fp) {
fprintf(stderr, "Cannot open %s: ", argv[1]);
perror(NULL);
exit(1);
}
char *line;
while ( line = getline_from(fp) ) {
// do stuff
free(line);
}
This works very nicely for me. I'd love to see an alternative that uses fgets() as suggested by #paul-tomblin, but I don't have the energy to figure it out tonight.

Windows fopen and the N flag

I'm reading some code that uses fopen to open files for writing. The code needs to be able to close and rename these files from time to time (it's a rotating file logger). The author says that for this to happen the child processes must not inherit these FILE handles. (On Windows, that is; on Unix it's OK.) So the author writes a special subroutine that duplicates the handle as non-inheritable and closes the original handle:
if (!(log->file = fopen(log->path, mode)))
return ERROR;
#ifdef _WIN32
sf = _fileno(log->file);
sh = (HANDLE)_get_osfhandle(sf);
if (!DuplicateHandle(GetCurrentProcess(), sh, GetCurrentProcess(),
&th, 0, FALSE, DUPLICATE_SAME_ACCESS)) {
fclose(log->file);
return ERROR;
}
fclose(log->file);
flags = (*mode == 'a') ? _O_APPEND : 0;
tf = _open_osfhandle((intptr_t)th, _O_TEXT | flags);
if (!(log->file = _fdopen(tf, "at"))) {
_close(tf);
return ERROR;
}
#endif
Now, I'm also reading MSDN docs on fopen and see that their version of fopen has a Microsoft-specific flag that seems to do the same: the N flag:
N: Specifies that the file is not inherited by child processes.
Question: do I understand it correctly that I can get rid of that piece above and replace it (on Windows) with an additional N in the mode parameter?
Yes, you can.
fopen("myfile", "rbN") creates a non-inheritable file handle.
The N flag is not mentioned anywhere in Linux documentation for fopen, so the solution will be most probably not portable, but for MS VC it works fine.

Duplex named pipe hangs on a certain write

I have a C++ pipe server app and a C# pipe client app communicating via Windows named pipe (duplex, message mode, wait/blocking in separate read thread).
It all works fine (both sending and receiving data via the pipe) until I try and write to the pipe from the client in response to a forms 'textchanged' event. When I do this, the client hangs on the pipe write call (or flush call if autoflush is off). Breaking into the server app reveals it's also waiting on the pipe ReadFile call and not returning.
I tried running the client write on another thread -- same result.
Suspect some sort of deadlock or race condition but can't see where... don't think I'm writing to the pipe simultaneously.
Update1: tried pipes in byte mode instead of message mode - same lockup.
Update2: Strangely, if (and only if) I pump lots of data from the server to the client, it cures the lockup!?
Server code:
DWORD ReadMsg(char* aBuff, int aBuffLen, int& aBytesRead)
{
DWORD byteCount;
if (ReadFile(mPipe, aBuff, aBuffLen, &byteCount, NULL))
{
aBytesRead = (int)byteCount;
aBuff[byteCount] = 0;
return ERROR_SUCCESS;
}
return GetLastError();
}
DWORD SendMsg(const char* aBuff, unsigned int aBuffLen)
{
DWORD byteCount;
if (WriteFile(mPipe, aBuff, aBuffLen, &byteCount, NULL))
{
return ERROR_SUCCESS;
}
mClientConnected = false;
return GetLastError();
}
DWORD CommsThread()
{
while (1)
{
std::string fullPipeName = std::string("\\\\.\\pipe\\") + mPipeName;
mPipe = CreateNamedPipeA(fullPipeName.c_str(),
PIPE_ACCESS_DUPLEX,
PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
PIPE_UNLIMITED_INSTANCES,
KTxBuffSize, // output buffer size
KRxBuffSize, // input buffer size
5000, // client time-out ms
NULL); // no security attribute
if (mPipe == INVALID_HANDLE_VALUE)
return 1;
mClientConnected = ConnectNamedPipe(mPipe, NULL) ? TRUE : (GetLastError() == ERROR_PIPE_CONNECTED);
if (!mClientConnected)
return 1;
char rxBuff[KRxBuffSize+1];
DWORD error=0;
while (mClientConnected)
{
Sleep(1);
int bytesRead = 0;
error = ReadMsg(rxBuff, KRxBuffSize, bytesRead);
if (error == ERROR_SUCCESS)
{
rxBuff[bytesRead] = 0; // terminate string.
if (mMsgCallback && bytesRead>0)
mMsgCallback(rxBuff, bytesRead, mCallbackContext);
}
else
{
mClientConnected = false;
}
}
Close();
Sleep(1000);
}
return 0;
}
client code:
public void Start(string aPipeName)
{
mPipeName = aPipeName;
mPipeStream = new NamedPipeClientStream(".", mPipeName, PipeDirection.InOut, PipeOptions.None);
Console.Write("Attempting to connect to pipe...");
mPipeStream.Connect();
Console.WriteLine("Connected to pipe '{0}' ({1} server instances open)", mPipeName, mPipeStream.NumberOfServerInstances);
mPipeStream.ReadMode = PipeTransmissionMode.Message;
mPipeWriter = new StreamWriter(mPipeStream);
mPipeWriter.AutoFlush = true;
mReadThread = new Thread(new ThreadStart(ReadThread));
mReadThread.IsBackground = true;
mReadThread.Start();
if (mConnectionEventCallback != null)
{
mConnectionEventCallback(true);
}
}
private void ReadThread()
{
byte[] buffer = new byte[1024 * 400];
while (true)
{
int len = 0;
do
{
len += mPipeStream.Read(buffer, len, buffer.Length);
} while (len>0 && !mPipeStream.IsMessageComplete);
if (len==0)
{
OnPipeBroken();
return;
}
if (mMessageCallback != null)
{
mMessageCallback(buffer, len);
}
Thread.Sleep(1);
}
}
public void Write(string aMsg)
{
try
{
mPipeWriter.Write(aMsg);
mPipeWriter.Flush();
}
catch (Exception)
{
OnPipeBroken();
}
}
If you are using separate threads you will be unable to read from the pipe at the same time you write to it. For example, if you are doing a blocking read from the pipe then a subsequent blocking write (from a different thread) then the write call will wait/block until the read call has completed and in many cases if this is unexpected behavior your program will become deadlocked.
I have not tested overlapped I/O, but it MAY be able to resolve this issue. However, if you are determined to use synchronous calls then the following models below may help you to solve the problem.
Master/Slave
You could implement a master/slave model in which the client or the server is the master and the other end only responds which is generally what you will find the MSDN examples to be.
In some cases you may find this problematic in the event the slave periodically needs to send data to the master. You must either use an external signaling mechanism (outside of the pipe) or have the master periodically query/poll the slave or you can swap the roles where the client is the master and the server is the slave.
Writer/Reader
You could use a writer/reader model where you use two different pipes. However, you must associate those two pipes somehow if you have multiple clients since each pipe will have a different handle. You could do this by having the client send a unique identifier value on connection to each pipe which would then let the server associate the two pipes. This number could be the current system time or even a unique identifier that is global or local.
Threads
If you are determined to use the synchronous API you can use threads with the master/slave model if you do not want to be blocked while waiting for a message on the slave side. You will however want to lock the reader after it reads a message (or encounters the end of a series of message) then write the response (as the slave should) and finally unlock the reader. You can lock and unlock the reader using locking mechanisms that put the thread to sleep as these would be most efficient.
Security Problem With TCP
The loss going with TCP instead of named pipes is also the biggest possible problem. A TCP stream does not contain any security natively. So if security is a concern you will have to implement that and you have the possibility of creating a security hole since you would have to handle authentication yourself. The named pipe can provide security if you properly set the parameters. Also, to note again more clearly: security is no simple matter and generally you will want to use existing facilities that have been designed to provide it.
I think you may be running into problems with named pipes message mode. In this mode, each write to the kernel pipe handle constitutes a message. This doesn't necessarily correspond with what your application regards a Message to be, and a message may be bigger than your read buffer.
This means that your pipe reading code needs two loops, the inner reading until the current [named pipe] message has been completely received, and the outer looping until your [application level] message has been received.
Your C# client code does have a correct inner loop, reading again if IsMessageComplete is false:
do
{
len += mPipeStream.Read(buffer, len, buffer.Length);
} while (len>0 && !mPipeStream.IsMessageComplete);
Your C++ server code doesn't have such a loop - the equivalent at the Win32 API level is testing for the return code ERROR_MORE_DATA.
My guess is that somehow this is leading to the client waiting for the server to read on one pipe instance, whilst the server is waiting for the client to write on another pipe instance.
It seems to me that what you are trying to do will rather not work as expected.
Some time ago I was trying to do something that looked like your code and got similar results, the pipe just hanged
and it was difficult to establish what had gone wrong.
I would rather suggest to use client in very simple way:
CreateFile
Write request
Read answer
Close pipe.
If you want to have two way communication with clients which are also able to receive unrequested data from server you should
rather implement two servers. This was the workaround I used: here you can find sources.

Resources