Read first line from latest file - shell

I’m working on a requirement where I need to read the first line of the latest file in a directory. The directory can contain multiple files, but I only want the first line of the most recent file among those that have PPP in their file name.
I know how to read the first line of a file and write it to another file:
head -n 1 jsonPPPvp.txt > output.txt
But how can I pick the latest file (by timestamp) out of all the files in a directory that have PPP in their name?
Any suggestions, please!
I’ve written a command

Using find with -print0 and xargs -0 in a command substitution
Your best option, though it still requires 4 subshells, protects against the usual caveats in filenames by having find output nul-terminated filenames; xargs -0 turns those into an argument list for ls, which sorts them by time in reverse, tail -n1 selects the last (newest) file, and head -n1 reads the first line of that file.
Using the -maxdepth 1 option with find limits the search to the current directory and prevents recursing into subdirectories (remove it if you want to search the entire directory tree below the current directory), e.g.
head -n1 $(find . -maxdepth 1 -type f -name "*PPP*" -print0 |
xargs -0 ls -rt |
tail -n 1)
In addition to working with nul-terminated filenames, this benefits from letting xargs build the list for ls to sort, rather than looping over the files to find the newest one.
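If the selected filename can contain spaces, quoting the command substitution keeps it as a single argument to head. A minimal variant of the same pipeline (only the quoting is added):
head -n1 "$(find . -maxdepth 1 -type f -name "*PPP*" -print0 |
    xargs -0 ls -rt |
    tail -n 1)"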

It may not be the best solution, but it works (by "latest file" I mean the file with the most recent modification timestamp):
ls -ltra
total 32
drwxr-xr-x 3 allanrobert primarygroup 4096 Feb 15 17:37 ..
drwxr-xr-x 2 allanrobert primarygroup 4096 Feb 15 17:37 .
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 file2PPP2
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 other
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 file3PPP3
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 other2
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 other1
-rw-r--r-- 1 allanrobert primarygroup 6 Feb 15 17:40 file1PPP
file content:
cat file1PPP
a
b
c
Command:
find . -type f -maxdepth 1 -name '*PPP*' -printf '%T+ %p\n' | sort -r | head -1 | cut -d' ' -f2 | xargs head -1
a
Beware of spaces in filenames!
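If filenames may contain spaces, a NUL-delimited variant along these lines should be safer. This is only a sketch, assuming GNU find and recent GNU coreutils (sort, head and cut all need -z support):
# print "epoch-seconds<TAB>path" NUL-terminated, sort newest first, keep one, drop the timestamp
find . -maxdepth 1 -type f -name '*PPP*' -printf '%T@\t%p\0' |
sort -z -rn |
head -z -n 1 |
cut -z -f2- |
xargs -0 head -n 1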

temp=$(ls -Art | head -n 1)
head -1 "$temp"

head -n 1 $(find ./ -name "*PPP*" -type f | xargs ls -rt1 | tail -n 1)
The drawback of the command above is that you must have a *PPP* file in your directory; otherwise the command produces a wrong result.
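One way to guard against the empty case is to capture the filename first and only call head when a match was found. A sketch (the variable name latest is illustrative, and GNU xargs' -r flag is assumed so ls is skipped when nothing matched):
latest=$(find ./ -name "*PPP*" -type f | xargs -r ls -rt1 | tail -n 1)
if [ -n "$latest" ]; then
    head -n 1 "$latest"
else
    echo "no *PPP* file found" >&2
fi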

You can also try this:
ls -tr | grep "PPP" | tail -n 1 | xargs head -n 1

Related

How to get a particular field from ls output

I am trying to get the info of the latest folder created in a particular path.
Here I am using the command below to fetch and filter the results so that I get only folders starting with 11, 12 or 19:
ls_info=$(ls -lrt /orcl/grid/product | grep '11\|12\|19')
The output of ls_info is :
total 12
drwxrwx--- 3 oragrid oinstall 4096 May 21 2014 11.2.0.3
drwxr-xr-x 3 oragrid oinstall 4096 Feb 25 2019 11.2.0.4
How can I fetch "11.2.0.4" from this, which is the most recently created folder?
Please suggest. Thanks.
Do not parse ls. Use find instead. First get the list of directories you want and print them together with their modification timestamps. Then sort the list, keep the newest line and remove the timestamp. With GNU utilities you can:
find /orcl/grid/product -mindepth 1 -maxdepth 1 -type d '(' -name '11*' -o -name '12*' -o -name '19*' ')' -printf "%Ts\t%f\n" | sort -n | cut -f2- | tail -n1
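If you want the result in a variable, as in the original ls_info assignment, the same pipeline can be captured with command substitution (a sketch; the variable name latest_dir is illustrative, paths are as in the question):
latest_dir=$(find /orcl/grid/product -mindepth 1 -maxdepth 1 -type d \
    '(' -name '11*' -o -name '12*' -o -name '19*' ')' -printf "%Ts\t%f\n" |
    sort -n | cut -f2- | tail -n1)
echo "$latest_dir"   # prints 11.2.0.4 with the listing shown above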

Find unmatched list from directory files in unix

I have a file called samples.list with sample IDs. I have files in my directory that I want to pattern-match against samples.list, and I want to output the sample IDs from samples.list that have no matching file.
samples.list
SRR1369385
SRR1352799
SRR1377262
SRR1400622
ls -lh
-rw-rw----+ 1 gen dbgap_6109 2.2G Jul 29 02:44 SRR1369385_1.fastq.gz
-rw-rw----+ 1 gen dbgap_6109 2.2G Jul 29 02:44 SRR1369385_2.fastq.gz
-rw-rw----+ 1 gen dbgap_6109 1.2G Jul 29 03:34 SRR1352799_1.fastq.gz
-rw-rw----+ 1 gen dbgap_6109 1.2G Jul 29 03:34 SRR1352799_2.fastq.gz
-rw-rw----+ 1 gen tnt_pipeli 2.2G Jul 29 01:44 sometxt.txt
The output I want (samples that did not match any file name in the directory):
SRR1377262
SRR1400622
code I tried:
grep -oFf `cat samples.list` ls -lh | grep -vFf - `cat samples.list`
I would really appreciate it if someone could guide me through the solution.
# find all files named in the way you want and print filenames
find . -maxdepth 1 -type f -name '*_*.fastq.gz' -printf "%f\n" |
# Remove everything except the SRR numbers
sed 's/_.*//' |
# Sort the list, remove duplicate elements
sort -u |
# join the list with samples and print only unmatched elements from samples
join -v1 -o 1.1 <(sort samples.list) -
Tested on repl.
Notes:
Do not use backticks `...`. Prefer $(...) instead; backticks are obsolete, deprecated syntax (see the BashHackers wiki).
grep's -f option takes a filename, not the content of the file. You can do grep -f some_file.txt to grep with all the patterns stored in some_file.txt.
ls -lh writes its listing to stdout. Running grep ... ls -lh would make grep try to search a file named ls (and what would -l and -h be for, if you want to search filenames?). You could do ls -1 | grep, but it's better to use find . -maxdepth 1 -mindepth 1 | grep ...
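To illustrate the grep -f note, here is a sketch of how it could be used correctly for this problem: build the list of sample IDs that do have files, then print the samples.list entries that are not in that list (-v inverts the match, -F uses fixed strings, -x requires a whole-line match):
grep -vFxf <(find . -maxdepth 1 -type f -name '*_*.fastq.gz' -printf '%f\n' |
             sed 's/_.*//' | sort -u) samples.list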
Try this:
awk -F_ 'NR==FNR{a[$1]=1;next}!($0 in a)' <(ls) samples.list
The first pass indexes everything up to the first _ in each line of the ls output (NR==FNR is true for those lines); the second pass then prints all unmatched lines in samples.list (»if a line is not indexed, print it«).
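With the directory listing and samples.list shown in the question, both approaches should print:
SRR1377262
SRR1400622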

Bash - Version Numbers in Filenames. How to list latest versions only?

I have a directory of versioned files. The version of each file is indicated within its filename, e.g. "_v1".
Example
List of files shown by ls:
123_FileA_v1.txt
123_FileA_v2.txt
132_FileB_v1.txt
I want to run a command to see only the latest versions:
123_FileA_v2.txt
132_FileB_v1.txt
My first attempt was to list files by mtime using
ls -ltr
But in my case this doesn't lead to sufficient results. I really want to collect versions from the filenames.
What would be the best way to do it?
This will do it:
ls | awk -F '_' '!prefixes[$1]++'
Hope it helps!
Edit :
If you want to see specific info you can do :
ls | awk -F '_' '!prefixes[$1]++' | xargs ls -lh
This will work as long as there are no spaces in your filenames.
Edit :
As requested by @PaulHodges, here is the sample output:
$ ls -lh
total 0
drwxr-xr-x 5 Matias-Barrios Matias-Barrios 160B Feb 27 11:40 .
drwxr-xr-x 106 Matias-Barrios Matias-Barrios 3.3K Feb 27 11:39 ..
-rw-r--r-- 1 Matias-Barrios Matias-Barrios 0B Feb 27 11:40 132_FileB_v1.txt
-rw-r--r-- 1 Matias-Barrios Matias-Barrios 0B Feb 27 11:40 123_FileA_v2.txt
-rw-r--r-- 1 Matias-Barrios Matias-Barrios 0B Feb 27 11:40 123_FileA_v1.txt
$ ls | awk -F '_' '!prefixes[$1]++'
.
..
132_FileB_v1.txt
123_FileA_v2.txt
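Note that !prefixes[$1]++ keeps the first line ls happens to emit for each prefix, so the result depends on ls ordering. If you want to pick the highest version explicitly, a sketch assuming GNU sort (version sort with -V) and, like the answer above, that the numeric prefix identifies the file:
# sort by prefix, then by the version field descending, keep the first entry per prefix
ls | sort -t_ -k1,1 -k3,3Vr | awk -F_ '!seen[$1]++'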
You could do something like
(
    PATTERN="[0-9]{3}_[^_]*"
    for prefix in `find . | egrep -o "$PATTERN" | sort -u`; do
        ls $prefix* | tail -1
    done
)
It will print
123_FileA_v2.txt
132_FileB_v1.txt
What happens here?
The surrounding parentheses ( ... ) are there so the code can be copied and pasted as a single block.
The variable PATTERN matches the common file prefix (three digits, an underscore and the name up to the next _).
The for prefix in `find . | egrep -o "$PATTERN" | sort -u` loop generates the list of unique file prefixes.
The ls $prefix* lists all files with the same prefix in alphanumerical order.
The | tail -1 keeps only the last entry of the former ls $prefix* output.
Edit
I decided to use find . instead of ls *. With that I hope to circumvent the issues with ls *. Please correct me if I'm wrong!

concatenate grep output to an echo statement in UNIX

I am trying to output the number of directories in a given path on a SINGLE line. My desire is to output this:
X-many directories
Currently, with my bash script, I get this:
X-many
directories
Here's my code:
ARGUMENT=$1
ls -l $ARGUMENT | egrep -c '^drwx'; echo -n "directories"
How can I fix my output? Thanks
I suggest
echo "$(ls -l "$ARGUMENT" | egrep -c '^drwx') directories"
This uses the shell's feature of final newline removal for command substitution.
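For example (the directory and the count shown are illustrative only):
$ ARGUMENT=/etc
$ echo "$(ls -l "$ARGUMENT" | egrep -c '^drwx') directories"
116 directories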
Do not pipe ls output and count directories that way, as you can get wrong results if special characters have been used in file/directory names.
To count directories use:
shopt -s nullglob
arr=( "$ARGUMENT"/*/ )
echo "${#arr[#]} directories"
The / at the end of the glob makes sure only directories in the "$ARGUMENT" path are matched.
shopt -s nullglob makes sure the glob expands to nothing (rather than the literal pattern) when it fails to match, i.e. when there is no directory in the given argument.
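A hypothetical wrapper to show how the glob count can be reused; the function name count_dirs is made up for illustration, and the subshell body keeps nullglob local:
count_dirs() (            # subshell body, so shopt does not leak into the caller
    shopt -s nullglob
    dirs=( "$1"/*/ )
    echo "${#dirs[@]} directories"
)
count_dirs "$ARGUMENT"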
As an alternative solution:
$ bc <<< "$(find /etc -maxdepth 1 -type d | wc -l)-1"
116
Another one:
$ count=0; while read curr_line; do count=$((count+1)); done < <(ls -l ~/etc | grep ^d); echo ${count}
116
This would work correctly with spaces in the folder names:
$ ls -la
total 20
drwxrwxr-x 5 alex alex 4096 Jun 30 18:40 .
drwxr-xr-x 11 alex alex 4096 Jun 30 16:41 ..
drwxrwxr-x 2 alex alex 4096 Jun 30 16:43 asdasd
drwxrwxr-x 2 alex alex 4096 Jun 30 16:43 dfgerte
drwxrwxr-x 2 alex alex 4096 Jun 30 16:43 somefoler with_space
$ count=0; while read curr_line; do count=$((count+1)); done < <(ls -l ./ | grep ^d); echo ${count}
3

grep ls output across tabs

If I ls -l in a directory and get:
-rwxr-x--- 1 user1 admin 0 8 Aug 2012 file.txt
-rwxr-x--- 1 user1 admin 1733480 26 Jul 2012 Archive.pax.gz
drwxr-x---# 7 user1 admin 238 31 Jul 2012 Mac Shots
-rwxr-x---# 1 user3 admin 598445 31 Jul 2012 Mac Shots.zip
-rwxr-x---# 1 user1 admin 380 6 Jul 2012 an.sh
-rwxr-x--- 1 user2 admin 14 30 Jun 2012 analystName.txt
-rwxr-x--- 1 user1 admin 36 8 Aug 2012 apple.txt
drwxr-x---# 7 user1 admin 238 31 Jul 2012 iPad Shots
-rwxr-x---# 1 user1 admin 7372367 31 Jul 2012 iPad Shots.zip
-rwxr-x--- 1 user2 admin 109 30 Jun 2012 test.txt
drwxr-x--- 3 user1 admin 102 26 Jul 2012 usr
but want to list only the files owned by "user1" which were modified in "Aug" to get
-rwxr-x--- 1 user1 admin 0 8 Aug 2012 file.txt
-rwxr-x--- 1 user1 admin 36 8 Aug 2012 apple.txt
What is the best method?
Parsing ls output is never a good and reliable solution. ls is a tool for interactively looking at file information. Its output is formatted for humans and will cause bugs in scripts. Use globs or find instead. Understand why: http://mywiki.wooledge.org/ParsingLs
Instead, you can try :
find . -type f -user 'user1' -maxdepth 1
or
find . -type f -printf '%u %f\n' -maxdepth 1 # if you want to show the username
or
stat -c '%U %n' * | cut -d" " -f2-
See
man find
man stat
Or you can be more explicit, since Michael's grep would also find a file owned by user1 named 'August iPad Shots' no matter when it was modified:
ls -l | awk '($3=="user1" && $7=="Aug")'
I think the safest way to do it is like this :
touch --date "2012-08-01" /tmp/start
touch --date "2012-09-01" /tmp/stop
find . -maxdepth 1 -type f -user user1 -newer /tmp/start -not -newer /tmp/stop -print0 | xargs -0 ls -l
rm /tmp/start /tmp/stop
Or as a one liner
touch --date "2012-08-01" /tmp/start; touch --date "2012-09-01" /tmp/stop; find . -maxdepth 1 -type f -user user1 -newer /tmp/start -not -newer /tmp/stop -print0 | xargs -0 ls -l {}; rm /tmp/start /tmp/stop
Advantages:
You don't parse ls
It works for filenames with Aug in them
Disadvantages
It is a bit long
Explanation:
-maxdepth 1: restricts the results to the current directory
-type f: restricts the results to files
-user user1: restricts the results to files that belong to user1
-newer /tmp/start: restricts the results to files newer than /tmp/start, which was created with the desired start date
-not -newer /tmp/stop: restricts the results to files not newer than /tmp/stop, which was created with the desired end date
-print0: so it can handle filenames with newlines in them!
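With GNU find, the same date window can be expressed with -newermt, which avoids the temporary reference files. A sketch under that assumption:
find . -maxdepth 1 -type f -user user1 \
    -newermt "2012-08-01" -not -newermt "2012-09-01" \
    -print0 | xargs -0 ls -l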
How about ls -l | grep user1 | grep Aug?
Or you can combine the regexp: ls -l | grep 'user1.*Aug'
