Merge numbered files with variable names [closed] - bash

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 11 months ago.
The community is reviewing whether to reopen this question as of 10 months ago.
I have a number of numbered files, e.g.:
alpha_01.txt alpha_02.txt beta_01.txt beta_02.txt
I want to run a single-line bash command that merges the files according to their variable name prefix (e.g. alpha, beta, ...), producing alpha.txt, beta.txt and so on.
I can do so for a single file:
cat alpha_*.txt(n) >>alpha.txt 2>/dev/null
But I don't know the names before _*.txt in advance.
Can I use a wildcard here? Or what would be the best solution?

If you want to concatenate all the alpha_xxx.txt files then you cannot have beta_xxx.txt in the arguments of cat.
As @tripleee said, the easiest way would be to use a for loop in which you list all the prefixes:
for name in alpha beta
do
    cat "$name"_*.txt > "$name".txt
done
Now, if you don't know the prefixes in advance then you can always work something out with awk:
awk '
    BEGIN {
        # Map every input file named <prefix>_<digits>.<ext> to <prefix>.<ext>
        for (i = 1; i < ARGC; i++) {
            filename = ARGV[i]
            if (filename !~ /^(.*\/)?[^\/]+_[0-9]+\.[^\/.]+$/)
                continue
            match(filename, /^(.*\/)?[^\/]+_/)
            prefix = substr(filename, RSTART, RLENGTH-1)
            match(filename, /\.[^.\/]+$/)
            suffix = substr(filename, RSTART, RLENGTH)
            outfile[filename] = prefix suffix
        }
    }
    FILENAME in outfile { print $0 > outfile[FILENAME] }
' ./*.txt
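The same grouping can also be sketched with a plain shell loop and parameter expansion. This is only a sketch under assumptions: the files sit in the current directory, their names follow the prefix_NN.txt pattern from the question, and the temporary demo directory and file contents below are made up for illustration.

```shell
#!/bin/sh
# Sketch: merge prefix_NN.txt files into prefix.txt without knowing the prefixes.
# The demo directory and the file contents are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
printf 'a1\n' > alpha_01.txt
printf 'a2\n' > alpha_02.txt
printf 'b1\n' > beta_01.txt
printf 'b2\n' > beta_02.txt

# Pass 1: empty each target once, so that rerunning does not append twice.
for f in ./*_[0-9]*.txt; do
    [ -e "$f" ] || continue      # the glob matched nothing
    prefix=${f%_*}               # drop the trailing "_NN.txt" part
    : > "$prefix.txt"
done

# Pass 2: append each numbered file to its target; globs expand in sorted order.
for f in ./*_[0-9]*.txt; do
    [ -e "$f" ] || continue
    prefix=${f%_*}
    cat "$f" >> "$prefix.txt"
done

merged_alpha=$(cat alpha.txt)
merged_beta=$(cat beta.txt)
printf '%s\n' "$merged_alpha" "$merged_beta"

cd /
rm -rf "$demo"
```

The targets alpha.txt and beta.txt do not match the `*_[0-9]*.txt` glob, so they are never fed back in as inputs.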

Related

printf "%6.1f" 12.3456 printf "|%6s%8.2f|" hello 12.2456) explanation [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
The following commands return the following values, respectively:
printf "%6.1f" 12.3456 printf "|%6s%8.2f|" hello 12.2456)
_ _ 12.3 |_hello_ _ _ 12.25|
The question is what does each character of the code mean and why does it return these values?
This looks like homework, and the code is... Read this short tutorial, which should help you solve it: https://linuxconfig.org/bash-printf-syntax-basics-with-examples
As a hint: printf "%5.2f" 100.1555 Which means:
printf function that prints formatted text to the console
% starts the format of one argument
5.2f format of the argument
5 minimum width of the whole number (integer part, dot and decimals together); if the result has fewer than 5 characters it is filled with spaces on the left, if it has more it will NOT be cut
. separates the width from the precision
2 print the decimal part with 2 digits; if the value has fewer than 2 decimals it is filled with zeros, if it has more it is rounded to 2 decimals
f the argument is printed as a floating-point number. For more info https://www.le.ac.uk/users/rjm1/cotter/page_30.htm
100.1555 argument
The result would be: 100.16 (6 characters in total, which is already more than the minimum width of 5, so no spaces are added)
For printf "%6.2f" 100.1555 result would be: 100.16 (exactly 6 characters, so no padding)
For printf "%6.3f" 100.1555 the result has 3 decimals and 7 characters, so again there is no padding; the last digit may print as 5 or 6, because 100.1555 sits right on a rounding boundary and cannot be represented exactly in binary
For printf "%1.3f" 100.1555 the result would be the same as for %6.3f (the width never cuts the number)
In your first command, "%6.1f" turns 12.3456 into 12.3 (4 characters) and pads it with 2 spaces on the left to reach a width of 6.
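To see the width and precision rules in action, here is a small sketch that runs the two commands from the question plus one extra case. The surrounding brackets, the variable names and the third command are only there for illustration, and LC_ALL=C is set on the assumption that you want "." as the decimal separator.

```shell
#!/bin/sh
# Sketch: the number before the dot is the minimum TOTAL width,
# the number after the dot is the count of decimals (rounded).
LC_ALL=C; export LC_ALL   # force "." as the decimal separator

a=$(printf '%6.1f' 12.3456)             # "12.3" is 4 characters, padded to 6
b=$(printf '|%6s%8.2f|' hello 12.2456)  # "hello" padded to 6, "12.25" padded to 8
c=$(printf '%1.2f' 12.3456)             # width 1 is too small, nothing is cut

# Brackets make the leading spaces visible.
printf '[%s]\n' "$a" "$b" "$c"
```

The brackets show that padding is added on the left only, and that a width that is too small is simply ignored.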

Awk: Match column values from 2 files if their numerical values are close [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
Following my first question here (Awk: Length of column number)
My data:
File 1
8.193506084253E+06 1.900521460E+01
8.193538509494E+06 1.899919490E+01
8.193540934736E+06 1.899317535E+01
8.193543359977E+06 1.898720476E+01
8.193546406105E+06 1.897934066E+01
File 2
8.193505938557E+06 1.572155163E+01
8.193509618041E+06 1.573016361E+01
8.193513297526E+06 1.573874442E+01
8.193516977010E+06 1.574725969E+01
I want to take $1 from File 2 and search File 1 for the closest* value in $1, in order to get an output like this example:
8.193505938557E+06 1.572155163E+01 1.900521460E+01
In this case only the first value of column $1 in File 2 has a match; the other values of $1 from File 2 are not close enough (according to some condition) to any value of $1 in File 1.
Note that the number of rows is different.
*closest= where the difference between the two numbers is smaller than some threshold
To my understanding, according to your description the result should be:
1235.34 d a
3457.23 e b
7589.34 f b
i.e. including a line for "f" which is closest to "b".
This can be done using the following script:
ARGIND == 1 {
    haystack[$1] = $2;
}
ARGIND == 2 {
    bestdiff=-1;
    for (v in haystack)
        if (bestdiff < 0 || (v-$1)**2 < bestdiff) {
            bestkey=haystack[v];
            bestdiff=(v-$1)**2;
        }
    print $1, $2, bestkey;
}
(I'm using squaring via **2 as a substitute for taking the absolute value.)
If you want to suppress results if the difference is for example greater than 10, to get the result you quoted, use something like this:
if (bestdiff < 10**2)
    print $1, $2, bestkey;
Edit: The OP changed the example input and output in the question. Here are the original example files for reference. File 1:
1234.34 a
3456.23 b
2325.89 c
2326.20 c2
File 2:
1235.34 d
3457.23 e
7589.34 f
Output:
1235.34 d a
3457.23 e b
Note: ARGIND and ** are GNU extensions. See comment from mklement0 below for details.
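For awk implementations without those extensions, a portable variant can be sketched as follows. It swaps ARGIND for the common NR == FNR idiom and squaring for a small abs helper. The threshold of 10, the variable names and the temporary files are assumptions made for illustration; the sample data is the original example quoted above.

```shell
#!/bin/sh
# Sketch: POSIX awk version of the closest-match lookup (no ARGIND, no **).
# The threshold (max=10) and the temporary files are hypothetical.
set -eu
LC_ALL=C; export LC_ALL
tmp=$(mktemp -d)
printf '%s\n' '1234.34 a' '3456.23 b' '2325.89 c' '2326.20 c2' > "$tmp/file1"
printf '%s\n' '1235.34 d' '3457.23 e' '7589.34 f' > "$tmp/file2"

result=$(awk -v max=10 '
    function abs(x) { return x < 0 ? -x : x }
    NR == FNR { val[$1] = $2; next }   # first file: remember column 2 by key
    {
        best = ""; bestdiff = -1
        for (k in val) {
            d = abs(k - $1)
            if (bestdiff < 0 || d < bestdiff) { bestdiff = d; best = val[k] }
        }
        # Print only when the closest key is within the threshold.
        if (bestdiff >= 0 && bestdiff <= max)
            print $1, $2, best
    }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this data the line for "f" is dropped, because its closest key in File 1 is more than 10 away.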
Load the first-column values of file1 into an array.
Then, for each line of file2, compare the differences using an abs function.
The script outputs only the single closest match overall (closest is a superlative, so there is just one).
awk 'BEGIN{closestVal=9999}
function abs(x){return ((x < 0.0) ? -x : x)}
{
    if (NR==FNR) { f1col2[NR]=$2; v[NR]=$1; next }
    for (n in v)
    {
        if (abs(v[n] - $1) < closestVal)
        {
            closestVal = abs(v[n] - $1)
            closestLine = $0 " " f1col2[n]
        }
    }
}
END {print closestLine}' file1 file2

How to match exactly one string when there are multiple matches [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I am trying to extract from a file with 3-4 entries only the first journal reference. Any ideas on how to get only the first occurrence of a match?
Here is what I have done so far. I can extract the references, but I am getting all of them:
if file_line =~ /^ JOURNAL \*?(.*)/
captured_journal = $1
To be more clear, this is some of the file I am trying to extract only the first JOURNAL entry from:
JOURNAL Genomics 33 (2), 229-246 (1996)
PUBMED 8660972
REFERENCE 2 (bases 1 to 17009)
AUTHORS Lopez,J.V.
TITLE Direct Submission
JOURNAL Submitted (07-FEB-1995) Jose V. Lopez, Laboratory of Viral
Carcinogenesis, PRI/DynCorp, Biological Carcinogenesis and
Development Prog, Bldg 560, Room 11-21, NCI-Frederick Cancer
Research and Development Center, Frederick, MD 21702-1201, USA
I only want "Genomics 33 (2), 229-246 (1996)" but I am also getting the next JOURNAL entries.
It is hard to answer your question because your example does not show the complete code.
One possibility: Your if file_line is inside a loop. Then you could leave the loop:
captured_journal = nil  # define it outside the block so it is visible after the loop
filecontent.each_line{|file_line|
  if file_line =~ /^ JOURNAL \*?(.*)/
    captured_journal = $1
    break
  end
}
Alternatively, you could check whether you have already found an entry:
captured_journal = nil
filecontent.each_line{|file_line|
  if file_line =~ /^ JOURNAL \*?(.*)/
    captured_journal = $1 unless captured_journal
  end
}
But maybe you are not in a loop and the file content is stored in a String (e.g. with File.read). Then you could use a simple regex:
filecontent =~ /^ JOURNAL \*?(.*)/
captured_journal = $1
or
/^ JOURNAL \*?(.*)/.match(filecontent)[1]
Correction after you posted more details:
You could use the regex /^\s*JOURNAL\s+(.*)/. Your regexp uses a fixed number of spaces. With \s+ the number of spaces is flexible.

ruby regex replace the corresponding string [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
str = "1627207:132069:color:green;20518:28421:size:62cm"
aliastr = "20518:28421:S;20518:28358:L;20518:28357:M;1627207:132069:red"
How can I dynamically transform str into "1627207:132069:color:red;20518:28421:size:S"?
It was a pretty unclear question, but I think I got it now. Your aliastr contains mappings which control the replacements, i.e., the key '20518:28421:' should map to value 'S' and the key '1627207:132069:' should map to 'red'. Then you want to search for those keys in str and replace their current value with that new value. This does that:
str = "1627207:132069:color:green;20518:28421:size:62cm"
aliastr = "20518:28421:S;20518:28358:L;20518:28357:M;1627207:132069:red"
mapping = Hash[aliastr.scan(/(\d+:\d+:)(.*?)(?:;|$)/)]
# mapping = {"20518:28421:"=>"S", "20518:28358:"=>"L", "20518:28357:"=>"M", "1627207:132069:"=>"red"}
replaced = str.gsub(/(\d+:\d+:)(\w+:)(.*?)(;|$)/) do
  key = $1
  # Keep the current value when aliastr has no entry for this key
  value = mapping.fetch(key, $3)
  key + $2 + value + $4
end
p replaced
# => "1627207:132069:color:red;20518:28421:size:S"
Your question is not very clear, and probably contains an error ("color:red" in your wanted result vs. "red" in aliastr).
You may try something like this:
str = "1627207:132069:color:green;20518:28421:size:62cm"
aliastr = "20518:28421:S;20518:28358:L;20518:28357:M;1627207:132069:red"
replacements = aliastr.split(";").map{|s| parts=s.split(":"); [/#{parts[0]}:#{parts[1]}:.*/,s]}
src = str.split(";")
src.map{|s| replacements.each{|r| s.sub!(r[0],r[1])}; s }.join(";")

Not able to find a particular regex in ruby [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
This is my regex: /(.+)(\.|::)(\S+)\Z/.
When the user enters function.[], the output is $1 as function and $3 as [].
When the user enters function[], the output is nil.
The desired output is $1 as function and $3 as [].
Any ideas how I can alter the above regex to do this?
Call the match method to set $1 and $3:
/(\w+)(\.|::)?(\S+)\Z/.match('mongo.[]')
$1 # => mongo
$3 # => []
/(\w+)(\.|::)?(\S+)\Z/.match('mongo[]')
$1 # => mongo
$3 # => []
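Are you looking for /(.+)(\.|::)*(\S+)\Z/?
The asterisk that I added means zero or more.
Or /(.+)(\.|::)?(\S+)\Z/.
The question mark means zero or one.
Note that once the separator is optional, the greedy (.+) swallows as much as it can: for function.[] both variants give $1 as function.[ and $3 as ]. Narrowing the first group, for example to (\w+), avoids this.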
