ls command, default (alphabetical) sorting order - sorting

Here is a piece of code:
$> ls
` = _ ; ? ( ] @ \ % 1 4 7 a B d E g H J l M o P r S u V x Y
^ > - : ' ) { $ & + 2 5 8 A c D f G i k L n O q R t U w X z
< | , ! " [ } * # 0 3 6 9 b C e F h I K m N p Q s T v W y Z
I'm printing all ASCII characters; each element is a folder, and I'm trying to understand the default sorting order of the ls command.
I understand that there is a case-insensitive comparison to sort alphabetic characters, with digits coming first.
I have some trouble understanding how special characters are sorted, and I can't find a clear explanation. I thought it could be related to the ASCII table, but the order shown here really makes no sense with it... Where does this order come from?
Thanks
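For comparison, here is a minimal Python sketch of what plain code-point (ASCII) order would look like for a few of those single-character names. The sample names are an arbitrary subset chosen for illustration; the sketch only shows why the listing above does not follow the ASCII table, not how ls itself sorts.

```python
# Sort a handful of single-character names by raw code point (ASCII order).
# The sample is an arbitrary subset of the names in the listing above.
names = ["a", "B", "1", "_", "$", "z", "Z", "=", "`"]

ascii_order = sorted(names)  # Python compares str values by code point
print(" ".join(ascii_order))
```

In code-point order all uppercase letters come before all lowercase ones and the punctuation is scattered between digits and letters, which is clearly not the order ls printed.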

How do I change an array of numbers into the corresponding letters of the alphabet

I have an array called variable that contains the numbers 1-26. I am trying to use a for loop in bash to go through each number in the array and associate it with a letter of the alphabet, as tr only lets me translate the first few letters of the alphabet. An example of my code is:
Note: I am using bash.
#!/bin/bash
for p1 in "${variable[@]}"; do
if (( $p1 == 1 )); then
newvar+='a'
elif (( $p1 == 2 )); then
newvar+='b'
...... and so on down to z
I am trying to create the string newvar, which contains these translated letters. However, when I try to run this, it only shows me a, which is the very first number translated. Why doesn't this work?
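# Convert each number 1-26 to the character code of the matching lowercase letter
for p1 in "${variable[@]}"; do
    chars+=( $((p1 + 96)) )
done
# Print the codes as octal escapes and let %b expand them into characters
printf '%b' $(printf '\\%03o' "${chars[@]}")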
Maybe:
# alphabet=(a b c d e f g h i j k l m n o p q r s t u v w x y z)
alphabet=({a..z})
letters=(8 5 12 12 15 23 15 18 12 4)
phrase=''
for i in "${letters[@]}"; do
    phrase+="${alphabet[i-1]}"
done
echo "$phrase"
helloworld
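The same lookup idea can be sketched in Python for comparison. This is only an illustration of the index-into-the-alphabet approach from the answer above; the function name numbers_to_letters is hypothetical and not part of the original code.

```python
import string


def numbers_to_letters(numbers):
    """Map each number 1..26 to 'a'..'z' by indexing into the alphabet."""
    result = []
    for n in numbers:
        if not 1 <= n <= 26:
            raise ValueError(f"expected a number from 1 to 26, got {n}")
        # Index 0 is 'a', so number n lives at index n - 1
        result.append(string.ascii_lowercase[n - 1])
    return "".join(result)


letters = [8, 5, 12, 12, 15, 23, 15, 18, 12, 4]
phrase = numbers_to_letters(letters)
print(phrase)  # helloworld
```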

destructure sequence into lexical variables

I have a sequence with a known number of elements (from a pcre match) and would like to map this into lexical variables.
I can probably loop over the sequence and put every element onto the stack and then :> ( a b c d ), but is there an idiomatic way to do this?
Oh, and my sequence has more than 4 elements, so first4 doesn't cut it, although I could obviously use first4 and then first3 on a subset of the sequence.
If you are sure that's what you really want to do, you could use firstn from sequences.generalizations:
SYMBOLS: a b c d e f g h ;
[let
{ 1 2 3 4 5 6 7 8 }
8 firstn :> ( a b c d e f g h )
a b c d e f g h . . . . . . . . ]
But it sounds like a bad idea. It's tricky because the lexical variables are not "real" variables; the compiler converts them into stack shuffling. That's why they don't play nicely with macros and why :> can't be called like a regular word.
If you use dynamic variables it's easier:
SYMBOLS: a b c d e f g h ;
{ 1 2 3 4 5 6 7 8 }
{ a b c d e f g h } [ set ] 2each
{ a b c d e f g h } [ get . ] each

Processing data swapped over files BASH

First, I would like to apologize for my extremely basic knowledge of coding. I hope that I will be able to express my issue correctly. Do not hesitate to ask for further clarification or anything else...
I'm encountering troubles postprocessing data...
My goal is to recombine data which were swapped.
EDIT: here is a .rar archive containing my test example, which works, and the one that I am trying to get working... (do not be put off by the time it takes to process the data)
https://drive.google.com/file/d/1AEPUc8haT5_Z3LR3jnZZlpyfxhdDwwo6/view?usp=sharing
EDIT 2: Here is what I expect on paper (it's my TestReorder3OK folder in my rar archive)
EDIT 3 : MINIMAL COMPLETE EXAMPLE
Script :
#!/bin/bash
# Define the number of replicas
NP=3
NP1=$[NP-1]
rm torder*
for repl in `seq 0 $NP1`
do
echo $repl
# paste column 2 of the .lammps file into a file rep_0, then in the second iteration column 3 into rep_1, etc.
awk -v rep=$repl '{r2=rep+2;print $r2}' < log.lammps > rep_$repl
i=0
j=0
# create a loop inside the loop
for a in `cat rep_$repl`
do
i=$[i+1]
j=$[j+3]
head -$i screen.$repl.temp | tail -1 >> torder.$a
head -$j ccccd2_H_${repl}_col.bak2 | tail -3 >> ccccd2_H_${a}_temp_col.bak2
done
done
log.lammps file
1 0 1 2
2 1 0 2
3 1 2 0
Starting at column 2, this file contains the numbers associated with the inputs below. Here is an expanded explanation:
Column 2 has three values: 0, 1 and 1. The 0 is associated with the first three lines of the file ccccd2_H_0_col.bak2, the next three lines are associated with the 1, and the last three again with the value 1.
Column 3 also has three values: 1, 0 and 2. The 1 is associated with the first three lines of the file ccccd2_H_1_col.bak2, the next three lines are associated with the 0, and the last three with the value 2.
Same story for column 4.
What I want is for every set of three lines associated with the value 0 to go into a single file, every set associated with the value 1 to go into another file, and the sets associated with the value 2 into a last file.
Inputs :
ccccd2_H_0_col.bak2
blank line
N a b c
C d e f
N g h i
C j k l
N m n o
C p q r
ccccd2_H_1_col.bak2
blank line
N s t u
C v w x
N y z a
C b c d
N e f g
C h i j
ccccd2_H_2_col.bak2
blank line
N k l m
C n o p
N q r s
C t u v
N w x y
C z a b
Outputs: these are the desired outputs, and also what I get for simple test files
ccccd2_H_0_temp_col
blank line
N a b c
C d e f
N y z a
C b c d
N w x y
C z a b
ccccd2_H_1_temp_col
blank line
N g h i
C j k l
N m n o
C p q r
N s t u
C v w x
ccccd2_H_2_temp_col
blank line
N e f g
C h i j
N k l m
C n o p
N q r s
C t u v
This works fine on small test files (as shown here), but not on my real system. For my real system, the log.lammps file contains 14 columns and 10,001 lines, and my input files contain 121,121 lines (so 10,001 * block of 121 lines). The script creates files 10 times larger, with more data than they should contain.
Can you enlighten me about my issue? I think it is linked to the difference in line count between my files containing a single column and the files containing Cartesian coordinates, but I really don't understand the link or how to solve it...
Thank you in advance...
I think I understand what you're trying to do now, and this GNU awk script (for ARGIND, ENDFILE and inbuilt open file management) will do it:
$ cat ../tst.awk
ARGIND == 1 {
    for (inFileNr=2; inFileNr<=NF; inFileNr++) {
        outFileNrs[inFileNr,NR] = $inFileNr
    }
    next
}
ENDFILE { RS = "" }
{ print ORS $0 > ("ccccd2_H_" outFileNrs[ARGIND,FNR] "_temp_col") }
Look:
INPUT:
$ ls
ccccd2_H_0_col.bak2 ccccd2_H_1_col.bak2 ccccd2_H_2_col.bak2 log.lammps
$ cat log.lammps
1 0 1 2
2 1 0 2
3 1 2 0
$ paste ccccd2_H_0_col.bak2 ccccd2_H_1_col.bak2 ccccd2_H_2_col.bak2 | sed 's/\t/\t\t/g'
N a b c N s t u N k l m
C d e f C v w x C n o p
N g h i N y z a N q r s
C j k l C b c d C t u v
N m n o N e f g N w x y
C p q r C h i j C z a b
SCRIPT EXECUTION:
$ awk -f ../tst.awk log.lammps ccccd2_H_0_col.bak2 ccccd2_H_1_col.bak2 ccccd2_H_2_col.bak2
OUTPUT:
$ ls
ccccd2_H_0_col.bak2 ccccd2_H_1_col.bak2 ccccd2_H_2_col.bak2 log.lammps
ccccd2_H_0_temp_col ccccd2_H_1_temp_col ccccd2_H_2_temp_col
$ paste ccccd2_H_0_temp_col ccccd2_H_1_temp_col ccccd2_H_2_temp_col | sed 's/\t/\t\t/g'
N a b c N g h i N e f g
C d e f C j k l C h i j
N y z a N m n o N k l m
C b c d C p q r C n o p
N w x y N s t u N q r s
C z a b C v w x C t u v
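The routing that the awk script performs can also be sketched in Python on in-memory data. This is only an illustration under assumptions: the blocks are taken to be already split into strings, and the names regroup, log_rows and inputs are hypothetical. Like the awk script, it walks the input files one after another, so each output collects its blocks in file order.

```python
def regroup(log_rows, inputs):
    """Route block s of input file k to the output named in log_rows[s][k + 1].

    log_rows: rows of the log file as lists of ints (first entry is the step).
    inputs:   inputs[k][s] is block number s of input file k.
    Returns a dict mapping output number -> list of blocks.
    """
    outputs = {}
    for file_nr, blocks in enumerate(inputs):
        for step, block in enumerate(blocks):
            # Column 0 of the log is the step counter, so file k uses column k + 1
            target = log_rows[step][file_nr + 1]
            outputs.setdefault(target, []).append(block)
    return outputs


# Sample data taken from the question (each block is two lines here)
log_rows = [[1, 0, 1, 2], [2, 1, 0, 2], [3, 1, 2, 0]]
inputs = [
    ["N a b c\nC d e f", "N g h i\nC j k l", "N m n o\nC p q r"],
    ["N s t u\nC v w x", "N y z a\nC b c d", "N e f g\nC h i j"],
    ["N k l m\nC n o p", "N q r s\nC t u v", "N w x y\nC z a b"],
]

outputs = regroup(log_rows, inputs)
for number in sorted(outputs):
    print(f"output {number}:")
    print("\n".join(outputs[number]))
```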

Finding the largest power of a number that divides a factorial in haskell

So I am writing a Haskell program to calculate the largest power of a number that divides a factorial.
largestPower :: Int -> Int -> Int
Here largestPower a b has to find the largest power of b that divides a!.
Now I understand the math behind it, the way to find the answer is to repeatedly divide a (just a) by b, ignore the remainder and finally add all the quotients. So if we have something like
largestPower 10 2
we should get 8, because 10/2=5, 5/2=2, 2/2=1, and we add 5+2+1=8.
However, I am unable to figure out how to implement this as a function: do I use arrays or just a simple recursive function?
I am gravitating towards it being just a normal function, though I guess it can be done by storing quotients in an array and adding them.
Recursion without an accumulator
You can simply write a recursive algorithm and sum up the result of each call. Here we have two cases:
a is less than b, in which case the largest power is 0. So:
largestPower a b | a < b = 0
a is greater than or equal to b, in which case we divide a by b, calculate largestPower for that quotient, and add the quotient to the result. Like:
                 | otherwise = d + largestPower d b
    where d = (div a b)
Or putting it together:
largestPower a b | a < b = 0
                 | otherwise = d + largestPower d b
    where d = (div a b)
Recursion with an accumulator
You can also use recursion with an accumulator: a variable you pass through the recursion and update accordingly. At the end, you return that accumulator (or a function called on that accumulator).
Here the accumulator would of course be the running sum of the quotients, so:
largestPower = largestPower' 0
So we will define a function largestPower' (mind the prime) with an accumulator as first argument that is initialized as 0.
Now in the recursion, there are two cases:
a is less than b, we simply return the accumulator:
largestPower' r a b | a < b = r
otherwise we add the quotient to our accumulator, and pass the quotient to largestPower' with a recursive call:
                    | otherwise = largestPower' (d+r) d b
    where d = (div a b)
Or the full version:
largestPower = largestPower' 0
largestPower' r a b | a < b = r
                    | otherwise = largestPower' (d+r) d b
    where d = (div a b)
Naive correct algorithm
The algorithm is not correct. A "naive" algorithm would be to count how many times b divides each number from a down to 2 and add up those counts, like:
largestPower 1 _ = 0
largestPower a b = sumPower a + largestPower (a-1) b
    where sumPower n | n `mod` b == 0 = 1 + sumPower (div n b)
                     | otherwise = 0
So this means that for the largestPower 4 2, this can be written as:
largestPower 4 2 = sumPower 4 + sumPower 3 + sumPower 2
and:
sumPower 4 = 1 + sumPower 2
           = 1 + 1 + sumPower 1
           = 1 + 1 + 0
           = 2
sumPower 3 = 0
sumPower 2 = 1 + sumPower 1
           = 1 + 0
           = 1
So the total is 2 + 0 + 1 = 3.
The algorithm as stated can be implemented quite simply:
largestPower :: Int -> Int -> Int
largestPower 0 b = 0
largestPower a b = d + largestPower d b where d = a `div` b
However, the algorithm is not correct for composite b. For example, largestPower 10 6 with this algorithm yields 1, but in fact the correct answer is 4. The problem is that this algorithm ignores multiples of 2 and 3 that are not multiples of 6. How you fix the algorithm is a completely separate question, though.
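The two figures quoted above can be checked with a small Python sketch. It compares the repeated-division algorithm with a brute-force count on the actual factorial; the function names quotient_sum and brute_force are hypothetical, and b is assumed to be at least 2. It illustrates the discrepancy only and is not a fix for composite b.

```python
from math import factorial


def quotient_sum(a, b):
    """Repeatedly divide a by b and add up the quotients (assumes b >= 2)."""
    total = 0
    while a >= b:
        a //= b
        total += a
    return total


def brute_force(a, b):
    """Count how many times b divides a! by dividing the factorial itself."""
    n = factorial(a)
    power = 0
    while n % b == 0:
        n //= b
        power += 1
    return power


print(quotient_sum(10, 2), brute_force(10, 2))  # both agree for the prime 2
print(quotient_sum(10, 6), brute_force(10, 6))  # they differ for the composite 6
```

For b = 2 both functions return 8, while for b = 6 the quotient sum gives 1 and the brute-force count gives 4, which matches the numbers in the answer.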

Replacing PIPE (|) symbol in hive

Hello, I have a text containing the pipe (|) symbol and I want to replace it with a space. This is the text in the column description:
|TrueCricketLover|M€$$!|
PTI|Capricorn|No DM|#TeamIK|#shaneRWatson33 ❤
Boom Boom❤
Striving to be a better human!
I have tried the regexp_replace function like this
regexp_replace(description,'|',' ')
This command returns this value
| T r u e C r i c k e t L o v e r | M € $ $ ! |
P T I | C a p r i c o r n | N o D M | # T e a m I K | # s h a n e R W a t s o n 3 3 ❤
B o o m B o o m ❤
S t r i v i n g t o b e a b e t t e r h u m a n !
L o v e h i m w h o l e a s t D e s e r v e s I t , T h a t ' s i t ❤
It is not replacing the pipe (|) symbol. Kindly help.
Try this:
select regexp_replace(description,'\\|',' ') from table;
Since the pipe character is the OR operator in regex, it must be escaped. Hive uses Java-flavored regex, and inside the string literal the backslash itself must be escaped, so two backslashes are needed.
Try this: add \\ before the pipe in your regexp_replace call
insert overwrite table table_name select regexp_replace(id,'\\|',' ') from table_name
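The effect seen in the question can be reproduced with Python's re module, used here only as a stand-in for Hive's Java regex engine (an assumption for illustration; the two engines are not identical). An unescaped | is an alternation of two empty patterns, so it matches the empty string at every position and the replacement is inserted between all characters, while the pipes themselves stay.

```python
import re

text = "PTI|Capricorn|No DM"

# Unescaped: "|" means "empty OR empty", which matches at every position
unescaped = re.sub("|", " ", text)

# Escaped: r"\|" matches a literal pipe character
escaped = re.sub(r"\|", " ", text)

print(repr(unescaped))
print(repr(escaped))
```

Note that Python's raw string needs a single backslash here; the doubled backslash in the Hive query comes from escaping inside the Hive string literal.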
