Performing conditional change in .ods file in bash - bash

How can I do a conditional change in an .ods document? I have two columns. One of them stores a string and the second a value. I want to search the document with a particular string that I have, say "xyz". If this matches any of the strings that are shown in the first column, I would like a value of 1 to be deducted from the cell in the same row, but from the second column. The data in the .ods document are separated by the different adjoining cells (so a tab?)
As an example, consider the following:
xyz 23
xxy 42
xzz 76
If I have the string "xxy", I would like the bash script to update the .ods file such that it looks as so:
xyz 23
xxy 41
xzz 76
Now, the strings that I am searching for are stored in a seperate .txt file. I would like to iterate over all of the strings in the .txt file and repeatedly perform the described operation in the .ods file. There can be cases where the are multiple occurrences of the same string. Any helps with this?

This should be a comment, but its a bit long
am searching for are stored in a text file.
No. A MS Excel files is not a text file. Its not even a file but rather an embedded filesystem where content is encapsuleted in OLE, or more recently as an xml tree. While there are both OLE and XML parsers available on Unix (I assume you want to run this on Linux/Unix/Posix since you've flagged this with bash, awk and sed) that just gets you access to where the data is stored. You still need a detailled understanding of the file format to be able to make changes. While it may be possible to do this in bash, it would be a lot easier in a dedicated programming language. Several do come with libraries for processing Excel files but vary in their support for file formats. Alternatively you could load it up in openoffice using its UNO API.

Related

How to extract specific lines from a huge data file?

I have a very large data file, about 32GB. The file is made up of about 130k lines, each of which mainly contains numbers, but also has few characters.
The task I need to perform is very clear: I have to extract 20 lines and write them to a new text file.
I know the exact line number for each of the 20 lines that I want to copy.
So the question is: how can I extract the content at a specific line number from the large file? I am on Windows. Is there a tool that can do such sort of operations, or I need to write some code?
If there is no direct way of doing that, I was thinking that a possible approach is to first extract small blocks of the original file (so that each block contains one or more lines to extract) and then use a standard editor to find the lines within each block. In this case, the question would be: how can I split a large file in blocks by line on windows? I use a tool named HJ-Split which works very well with large files, but it can only split by size, not by line.
Install[1] Babun Shell (or Cygwin, but I recommend the Babun), and then use sed command as described here: How can I extract a predetermined range of lines from a text file on Unix?
[1] Installing Babun means actually just unzipping it somewhere, so you don't have to have the Administrator rights on the server.

Is there a part of a windows file that can't be modified?

I'm trying to accomplish something that will let a user download a file from a web application onto their system. The file will contain a unique five digit code. Using this unique five digit code the users can search for a file in their file system.
I'm wondering where is the best place to put this five digit code in a file so that users can easily search for the file. The simplest approach would be to put it in the name of the file, however, users can change the name of the file easily.
I'm looking for a filed where I can put the code so that users won't be able to modify it but will still be able to search for it. Is this possible?
If you say File.. what kind of file format do you mean. I'm asking because a file is just a pile of bytes and you can append your 5 digit code every where in the file, if it is your own file format. But if you tell us which file format you use, probably there are some fields which can be used to search for it. As example Tiff has many tags. Images have other meta data. etc

how to create a BAT file to be used in monetdb bulk load from a Java program

I have a file with a list of string (one cloumn).
File example
sdfsdfsdf
hfhfhfghf
dfgdggdfg
pookokkoo
base on the documentation on monetdb web site, I have to create a BAT file.
How do I convert my file with strings into a BAT file ready to be imported in monetdb?
How do I do this from Java?
Thanks,
monetdb site doc
http://www.monetdb.org/Documentation/Cookbooks/SQLrecipies/BinaryBulkLoad
From the Documentation:
For variable length strings, the file should have one C-based string value per line, terminated by a newline, and it is processed without escape character conversion.
Since this is exactly the format that you have, you can use it directly.
A technical note on the side: Strings in MonetDB are dictionary compressed which makes it really hard to generate the binary representation "by hand".
The MonetDB's documentation says: "Copying binary files: Migration of tables between MonetDB/SQL instances can be sped up using the binary COPY INTO/FROM format. See the recipe for this functionality.", however, it does not say how I can dump the data in binary format. How can I do it? I expected something like: "copy binary from TABLE to "file(s)"; and then "copy binary into TABLE from ('file(s)');

Section Delimiting for Flat Files

I need to generate a flat file with several different sections, each with different record structures. All data is delimited text, single line per record. What would be a good delimiting sequence, or mechanism to differentiate sections, given that records can contain line feeds etc. within quoted text fields?
Well, provided you are free of the necessity of editing the file with an inferior text editor or that the file be human-readable, you could use one of the four C0 control codes which are ASCII characters 28–31, which are meant for delimiting text records. They just never caught on because of the first two points.
You haven't specified what you intend to do with the file. What language do you use, etc.
If the file will contain different sections, and different structures, I would suggest using the YAML structure.
There are many libraries that allow for to reading/writing using YAML.

What is the best way to edit the middle of an existing flat file?

I have tool that creates variables for a simulation. The current workflow involves hand copying those variables into the simulation input file. The input file is a standard flat file, i.e. not binary or XML. I would like to automate the addition of the variables to the flat input file.
The variables copy over existing variables in the file, e.g.
New Variables:
Length 10
Height 20
Depth 30
Old Variables:
...
Weight 100
Age 20
Length 10
Height 20
Depth 30
...
Would like to have the old variables copy over the new variable. They are 200 lines into the flat input file.
Thanks for any insights.
P.S. This is on Windows.
If you're stuck using flat, then you're stuck using the old fashioned way of updating them: read from original, write to temp file, either write the original row or change the data and then write that. To add data, write it to the temp file at the appropriate point; to delete data, simply don't copy it from the original file.
Finally, close both files and rename the temp file to the original file name.
Alternatively, it might be time to think about a little database.
For something like this I'd be looking at a simple template engine. You'd have a base template with predefined marker tokens instead of variable values and then just pass the values required to your engine along with the template and it will spit out the resultant file, all present and correct. There are a number of Open Source template engines available in Java that would meet your needs, I imagine such things are also available in your language of choice. You could even roll your own without too much difficulty.
Note that under Unix, one would probably look at using mmap() because you can then use functions such as memmove() to move the data around and add new data or truncate() the result if the file is then smaller (you may also want to use truncate() to grow the file).
Under MS-Windows, you have the MapViewOfFileEx() function to do the same thing. The API is different, though,
and I'm not exactly sure what happens or how to grow/shrink the file (MSDN also includes a truncate()-like function and maybe that works).
Of course, it's important to use memcpy() or memmove() properly to not overwrite the wrong data or go outside the buffer.

Resources