How can I edit the columns of multiple files in the same folder and then overwrite them? - macos

I am at the edge of frustration, which is why my way of explaining myself might be rough.
I am trying to remove a column from tab-delimited text files. I have 10 files whose names follow the same pattern (example below), so with commands like cut I get frustrated pressing Tab many times.
So my questions:
1) Is there an easier method of basically copying the file name for overwriting?
cut -f 1,3 oldfile.txt > oldfile.txt
By the way, when I use the method above, all my lines get erased for some reason.
2) Since I don't know regular expressions well enough, is there a way to avoid awk? That language seems pretty complicated to me.
3) What is the most memory-efficient method for this task? Please tell me a method and I will learn it, even if it's as hard as learning Chinese.
I am a frustrated, mediocre Unix user. Sorry for my bad English.
Best regards,
file name example: KKK12398801_normal_912839.txt
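
For what it's worth, `cut -f 1,3 oldfile.txt > oldfile.txt` empties the file because the shell truncates the output file before cut ever reads it. One way around both that and awk is to write to a temporary file and then swap it into place. Below is a minimal sketch in Python rather than shell. The function names, the `*_normal_*.txt` pattern (guessed from the file name example), and the choice of fields 1 and 3 are all assumptions for illustration. It reads one line at a time, so memory use should stay small whatever the file size.

```python
import os
import tempfile
from pathlib import Path


def keep_columns(path, keep=(1, 3), sep="\t"):
    """Rewrite `path` keeping only the 1-based fields in `keep` (like cut -f)."""
    path = Path(path)
    # Write to a temporary file in the same folder, then swap it into place,
    # so the original is never truncated while it is still being read.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as out, open(path, newline="") as src:
            for line in src:  # one line at a time
                fields = line.rstrip("\r\n").split(sep)
                picked = [fields[i - 1] for i in keep if i <= len(fields)]
                out.write(sep.join(picked) + "\n")
        os.replace(tmp_name, path)  # the new file takes the old file's name
    except BaseException:
        os.unlink(tmp_name)  # clean up the temporary file on any failure
        raise


def keep_columns_in_folder(folder, pattern="*_normal_*.txt", keep=(1, 3)):
    """Apply keep_columns to every file in `folder` whose name matches `pattern`."""
    files = sorted(Path(folder).glob(pattern))
    for path in files:
        keep_columns(path, keep)
    return files
```

Calling `keep_columns_in_folder(".")` from the folder that holds the files would rewrite all matching files, so it is worth trying on a copy of the folder first.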

Related

Possible to copy/move/etc multiple files of same base name via Windows CMD/.BAT?

I am wondering if it is possible to accomplish the following, given some context and example.
I have files in "Server\Share\Folder\File##.ext"
Sometimes the "File##.ext" can be "File01.ext" through "File20.ext", and other times it may be "File01.ext" through "File40.ext"
Sometimes there are fewer of these files, sometimes there are more.
I want a batch file to take the files from "Server\Share\Folder\File##.ext" and move them to "Server\Share\OtherFolder\File##.ext". I know I can accomplish this easily with:
copy /y "Server\Share\Folder\File01.ext" "Server\Share\OtherFolder\File01.ext"
Then just add another line for each extra "File02.ext", "File03.ext", etc., but I am wondering if it is possible to make it so that any file that resembles "File##.ext" is included, so that no matter how many ## files I have, it always works without issue.
Thanks in advance for any and all advice!
EDIT
Someone mentioned using wildcards, but my question with that is: let's say those files are File01.ext through File05.ext, will it match what it finds to the newly moved file? That is, will it find File01 from File?? at the source and make it File01 from File?? at the destination?
You can accomplish this task with a FOR loop in a batch file.
You can also loop through the commands using a : label and a variable name.
Combining these two would help you get what you want.
We can help you with ideas and a little bit of the coding, but the effort must be yours, so that you learn programming better.
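
To show the shape of such a loop, here is a minimal sketch in Python rather than batch. The function name and the `File[0-9][0-9].ext` pattern are assumptions based on the example paths in the question. Each matched file is copied under its own name, so File01.ext at the source ends up as File01.ext at the destination, however many files there are.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, dst_dir, pattern="File[0-9][0-9].ext"):
    """Copy every file in src_dir that matches pattern into dst_dir, keeping names."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(src_dir.glob(pattern)):
        if src.is_file():
            # Overwrites an existing destination file, similar to copy /y.
            shutil.copy2(src, dst_dir / src.name)
            copied.append(src.name)
    return copied
```

The function returns the names it copied, which makes it easy to log or check how many files were picked up on a given run.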

How to make this simple GUI in Python?

I'm totally new to programming, but I have made some scripts for extracting data from .txt files etc. Now I am making a simple script for work, but I need a simple GUI so people can use it efficiently. The script is really simple, and consists of 4 dictionaries and a list with the keys for the values that I want to print from one of the dictionaries. What I need is a GUI that looks like the one posted. There will be 4 buttons, one for each dictionary, and the user can only pick one. On the left will be the keys, and the keys transferred to the right will be put in a list, which will be used to write the values to a .txt file. This is probably really simple, but I have no idea where to start with GUIs, so I hope that someone can give me some ideas. In advance, thank you :)
Example: https://ci.apache.org/projects/wicket/guide/6.x/img/multi-select-transfer-component.png
It's cool that you are getting into GUI programming. Try tkinter:
https://www.tutorialspoint.com/python/python_gui_programming.htm

How to find foreign language used in "C comments"

I have a large code base where most of the documentation and source code comments are in English. But one of the minor contributors wrote comments in a different language, spread over various places.
Is there a simple trick that will let me find them? I imagine first a way to extract all comments from the code and generate a single text file (possibly with source file / line number info), then pipe this through some language detection app.
If that matters, I'm on Linux and the current compiler on this project is Clang.
The only thing that comes to mind is to go through all of the code manually and check it yourself. If it's a similar language that doesn't contain foreign letters, consider using something with a spellchecker. This way, the text that isn't recognized will get underlined and be easy to spot.
Other than that, I don't see an easy way to go through with this.
You could make a program that reads the files and only prints the comments out to another output file, which you then spell check, but this would seem to be a waste of time, as you would easily be able to spot the comments yourself.
If you do make a program for that, however, keep in mind that there are three things to check for:
If comment starts with /*, make sure it stops reading when encountering */
If comment starts with //, only read one line - unless:
If line starting with // ends with \, read next line as well
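
A minimal sketch of such a program, in Python, covering the three cases above. It also skips string and character literals so that a `//` inside a string is not taken for a comment. The function names are made up for illustration, and flagging comments that contain non-ASCII characters is only a crude stand-in for real language detection (it will miss a foreign language written in plain ASCII letters).

```python
def extract_comments(source):
    """Return a list of (line_number, text) for each comment in C source text."""
    comments = []
    i, line, n = 0, 1, len(source)
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == "/*":
            # Block comment: read up to the closing */ (or end of input).
            end = source.find("*/", i + 2)
            end = n if end == -1 else end + 2
            comments.append((line, source[i:end]))
            line += source.count("\n", i, end)
            i = end
        elif two == "//":
            # Line comment: read to end of line, continuing after a trailing \.
            start, j = line, i
            while j < n:
                if source[j] == "\n":
                    line += 1
                    if source[j - 1] != "\\":
                        break
                j += 1
            comments.append((start, source[i:j]))
            i = j + 1
        elif ch in "\"'":
            # String or character literal: skip it, honouring backslash escapes.
            j = i + 1
            while j < n and source[j] not in (ch, "\n"):
                j += 2 if source[j] == "\\" else 1
            j = min(j + 1, n)
            line += source.count("\n", i, j)
            i = j
        else:
            if ch == "\n":
                line += 1
            i += 1
    return comments


def non_ascii_comments(source):
    """Keep only the comments that contain at least one non-ASCII character."""
    return [(ln, text) for ln, text in extract_comments(source)
            if any(ord(c) > 127 for c in text)]
```

Running `non_ascii_comments(open(path, encoding="utf-8").read())` over each source file would give line numbers to jump to. The sketch ignores corner cases such as trigraphs and comments hidden behind preprocessor tricks.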
While it is possible to detect a language from a string automatically, you need way more words than fit in a usual comment to do so.
Solution: Use your own eyes and your own brain...

How to split a large CSV file into multiple files in Go?

I am a novice Go programmer trying to learn Go's features. I want to split a large CSV file into multiple files in Go, each file containing the header. How do I do this? I have searched everywhere but couldn't find the right solution. Any help in this regard will be greatly appreciated.
Also, please suggest a good book for reference.
Thanking you
Depending on your shell fu this problem might be better suited for common shell utilities but you specifically mentioned go.
Let's think through the problem.
How big is this csv file? Are we talking 100 lines or is it 5G ?
If it's smallish I typically use this:
http://golang.org/pkg/io/ioutil/#ReadFile
However, this package also exists:
http://golang.org/pkg/encoding/csv/
Regardless - let's return to the abstraction of the problem. You have a header (which is the first line) and then the rest of the document.
So what we probably want to do (if ignoring csv for the moment) is to read in our file.
Then we want to split the file body by all the newlines in it.
You can use this to do so:
http://golang.org/pkg/strings/#Split
You didn't mention but do you know how many files you want to split by or would you rather split by the line count or byte count? What's the actual limitation here?
Generally it's not going to be file count but if we pretend it is we simply want to divide our line count by our expected file count to give lines/file.
Now we can take slices of the appropriate size and write the file back out via:
http://golang.org/pkg/io/ioutil/#WriteFile
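
The flow described so far (keep the header, cut the body into groups of lines, write each group out with the header on top) could be sketched as follows. This is in Python rather than Go, purely as an illustration of the steps. It splits by row count and streams rows instead of reading the whole file, which is an assumption in case the file is closer to 5G than to 100 lines. The function name and the `_partN.csv` naming are hypothetical. In Go, the Reader and Writer in encoding/csv would play the roles that `csv.reader` and `csv.writer` play here.

```python
import csv
from pathlib import Path


def split_csv(path, rows_per_file, out_dir):
    """Split a CSV into numbered parts, repeating the header row in each part."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem
    parts = []
    out = None
    try:
        with open(path, newline="") as src:
            reader = csv.reader(src)
            header = next(reader, None)  # first row is the header
            if header is None:
                return parts  # empty input, nothing to write
            writer = None
            for index, row in enumerate(reader):
                if index % rows_per_file == 0:
                    # Start a new part: close the previous one, write the header.
                    if out is not None:
                        out.close()
                    part = out_dir / f"{stem}_part{len(parts) + 1}.csv"
                    out = open(part, "w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(part)
                writer.writerow(row)
    finally:
        if out is not None:
            out.close()
    return parts
```

Splitting by byte count or by a fixed number of output files would only change the condition that decides when to start a new part.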
A trick I sometimes use to help me think through these things is to write down the mission statement.
"I want to split a large csv file into multiple files in go"
Then I start breaking that up into pieces but take the divide/conquer approach - don't try to solve the entire problem in one go - just break it up to where you can think about it.
Also - make gratuitous use of pseudo-code until you can comfortably write the real code itself. Sometimes it helps to just write a short comment inline with how you think the code should flow and then get it down to the smallest portion that you can code and work from there.
By the way - many of the golang.org packages have example links where you can literally run in your browser the example code and cut/paste that to your own local environment.
Also, I know I'll catch some haters with this - but as for books - imo - you are going to learn a lot faster just by trying to get things working rather than reading. Action trumps passivity always. Don't be afraid to fail.
Here is a package that might help. You can set the chunk size you need in bytes, and a file will be split into the appropriate number of chunks.

Eliminating code duplication in a single file

Sadly, a project that I have been working on lately has a large amount of copy-and-paste code, even within single files. Are there any tools or techniques that can detect duplication or near-duplication within a single file? I have Beyond Compare 3 and it works well for comparing separate files, but I am at a loss for comparing single files.
Thanks in advance.
Edit:
Thanks for all the great tools! I'll definitely check them out.
This project is an ASP.NET/C# project, but I work with a variety of languages including Java; I'm interested in what tools are best (for any language) to remove duplication.
Check out Atomiq. It finds duplicate code that is ripe for extracting to one location.
http://www.getatomiq.com/
If you're using Eclipse, you can use the copy paste detector (CPD) https://olex.openlogic.com/packages/cpd.
You don't say what language you are using, which is going to affect what tools you can use.
For Python there is CloneDigger. It also supports Java but I have not tried that. It can find code duplication both within a single file and between files, and gives you the result as a diff-like report in HTML.
See SD CloneDR, a tool for detecting copy-paste-edit code within and across multiple files. It detects exact copies, copies that have been reformatted, and near-miss copies with different identifiers, literals, and even different sequences of statements.
The CloneDR handles many languages, including Java (1.4, 1.5, 1.6) and C# up to C# 4.0. You can see sample clone detection reports at the website, including one for C#.
ReSharper does this automagically - it suggests when it thinks code should be extracted into a method, and will do the extraction for you.
Check out PMD. Once you have configured it (which is fairly simple), you can run its copy paste detector to find duplicate code.
Someone with some Office skills can do the following sequence in a minute:
use an ordinary formatter to unify the code style, preferably without line wrapping
feed the code text into Microsoft Excel as a single column
search and replace all double spaces with single ones and do other replacements
sort the column
At this point the keywords for duplicates will already be easy to detect. But to go further:
add a comparator formula to the 2nd column and a counter to the 3rd
copy and paste values again, sort, and see the most repetitive lines
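
The same sort-and-count idea can be sketched in a few lines of Python instead of Excel. The function name and the `min_count` and `min_length` thresholds are assumptions for illustration. Like the spreadsheet approach, it only finds exact repeats after whitespace is normalized, not near-duplicates.

```python
from collections import Counter


def repeated_lines(text, min_count=2, min_length=10):
    """Return (line, count) pairs for lines that repeat, most frequent first."""
    # Collapse runs of whitespace so formatting differences do not hide repeats.
    normalized = (" ".join(line.split()) for line in text.splitlines())
    # Ignore very short lines (braces, blank lines), which repeat everywhere.
    counts = Counter(line for line in normalized if len(line) >= min_length)
    return [(line, n) for line, n in counts.most_common() if n >= min_count]
```

Feeding it the contents of a single source file gives the most repeated lines at the top, which is roughly where the spreadsheet steps end up.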
There is an analysis tool, called Simian, which I haven't yet tried. Supposedly it can be run on any kind of text and point out duplicated items. It can be used via a command line interface.
Another option similar to those above, but with a different tool chain: https://www.npmjs.com/package/jscpd

Resources