I'm working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse lines and the first column, but not any other column. Here's my code so far:
cat myfile.csv|while read line
do
    read -d, col1 col2 < <(echo $line)
    echo "I got:$col1|$col2"
done
It's only printing the first column. As an additional test, I tried the following:
read -d, x y < <(echo a,b,)
And $y is empty. So I tried:
read x y < <(echo a b)
And $y is b. Why?

You need to use IFS instead of -d:
while IFS=, read -r col1 col2
do
    echo "I got:$col1|$col2"
done < myfile.csv
To skip a given number of header lines:
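skip_headers=3   # number of header lines to skip (example value)
while IFS=, read -r col1 col2
do
    if ((skip_headers)); then ((skip_headers--)); continue; fi
    echo "I got:$col1|$col2"
done < myfile.csv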
Note that for general-purpose CSV parsing you should use a specialized tool that can handle quoted fields with internal commas, among other issues that Bash can't handle by itself. Examples of such tools are csvtool and csvkit.
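To see why quoted fields are a problem, here is a minimal sketch (the sample line and the variable names c1, c2, c3 are made up for illustration). Splitting on IFS=, knows nothing about quotes, so a quoted field containing a comma is cut in two:

```shell
#!/usr/bin/env bash
# Hypothetical sample row: the second field is quoted and contains a comma.
line='12,"Hello, world",42'

# Plain IFS splitting treats every comma as a separator, quoted or not.
IFS=, read -r c1 c2 c3 <<< "$line"

echo "c1=[$c1]"
echo "c2=[$c2]"   # only the first half of the quoted field
echo "c3=[$c3]"   # the rest of the line, including the stray quote
```

The last variable receives everything that is left over, which is why c3 ends up holding both the tail of the quoted field and the real third column.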

I'm coming late to this question, but Bash now offers new features, the question is specifically about Bash, and none of the answers already posted shows this powerful and compliant way of doing precisely this.
Parsing CSV files under bash, using loadable module
Conforming to RFC 4180, a string like this sample CSV row:
12,22.45,"Hello, ""man"".","A, b.",42
should be split as
1 12
2 22.45
3 Hello, "man".
4 A, b.
5 42
Bash loadable compiled C modules.
Under Bash, you can create, edit, and use loadable compiled C modules. Once loaded, they work like any other builtin! (You may find more information in the source tree.)
The current source tree (Oct 15 2021, bash V5.1-rc3) contains a bunch of samples:
accept listen for and accept a remote network connection on a given port
asort Sort arrays in-place
basename Return non-directory portion of pathname.
cat cat(1) replacement with no options - the way cat was intended.
csv process one line of csv data and populate an indexed array.
dirname Return directory portion of pathname.
fdflags Change the flag associated with one of bash's open file descriptors.
finfo Print file info.
head Copy first part of files.
hello Obligatory "Hello World" / sample loadable.
tee Duplicate standard input.
template Example template for loadable builtin.
truefalse True and false builtins.
tty Return terminal name.
uname Print system information.
unlink Remove a directory entry.
whoami Print out username of current user.
There is a fully working CSV parser ready to use in the examples/loadables directory: csv.c!
On a Debian GNU/Linux based system, you may have to install the bash-builtins package with
apt install bash-builtins
Using loadable bash-builtins:
enable -f /usr/lib/bash/csv csv
From there, you could use csv as a bash builtin.
With my sample: 12,22.45,"Hello, ""man"".","A, b.",42
csv -a myArray '12,22.45,"Hello, ""man"".","A, b.",42'
printf "%s\n" "${myArray[@]}" | cat -n
1 12
2 22.45
3 Hello, "man".
4 A, b.
5 42
Then in a loop, processing a file.
while IFS= read -r line;do
csv -a aVar "$line"
printf "First two columns are: [ '%s' - '%s' ]\n" "${aVar[0]}" "${aVar[1]}"
done <myfile.csv
This approach is clearly quicker and more robust than any other combination of Bash builtins or forking to an external binary.
Unfortunately, depending on your system, if your version of Bash was compiled without loadable builtin support, this may not work...
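Since the loadable may be missing, a script can probe for it before relying on it. This is only a sketch: the path is the Debian one used above and may differ elsewhere, and the names csv_module, csv_available and fields are chosen here for illustration.

```shell
#!/usr/bin/env bash
csv_module=/usr/lib/bash/csv    # Debian path used above; other systems may differ

# enable -f fails if the file is missing or if bash lacks loadable support.
if enable -f "$csv_module" csv 2>/dev/null; then
    csv_available=yes
    csv -a fields '12,22.45,"A, b.",42'
    printf 'third field: %s\n' "${fields[2]}"
else
    csv_available=no
    echo "csv loadable builtin not available; use another CSV tool instead" >&2
fi
```

Probing once at the top of the script keeps the failure explicit instead of producing "command not found" errors in the middle of a loop.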
Complete sample with multiline CSV fields.
Conforming to RFC 4180, a string like this single CSV row:
12,22.45,"Hello ""man"",
This is a good day, today!","A, b.",42
should be split as
1 12
2 22.45
3 Hello "man",
This is a good day, today!
4 A, b.
5 42
Full sample script for parsing CSV containing multilines fields
Here is a small sample file with 1 header line, 4 columns and 3 rows. Because two fields contain a newline, the file is 6 lines long.
Id,Name,Desc,Value
1234,Cpt1023,"Energy counter",34213
2343,Sns2123,"Temperatur sensor
to trigg for alarm",48.4
42,Eye1412,"Solar sensor ""Day /
Night""",12199.21
And a small script able to parse this file correctly:
enable -f /usr/lib/bash/csv csv
file="sample.csv"   # file name assumed for this example
exec {FD}<"$file"
read -ru $FD line
csv -a headline "$line"
printf -v fieldfmt '%-8s: "%%q"\\n' "${headline[@]}"
numcols=${#headline[@]}
while read -ru $FD line;do
    # keep appending physical lines until the row has enough columns
    while csv -a row "$line" ; (( ${#row[@]} < numcols )) ;do
        read -ru $FD sline || break
        line+=$'\n'"$sline"
    done
    printf "$fieldfmt\\n" "${row[@]}"
done
This may render: (I've used printf "%q" to represent non-printable characters such as newlines as $'\n')
Id : "1234"
Name : "Cpt1023"
Desc : "Energy\ counter"
Value : "34213"
Id : "2343"
Name : "Sns2123"
Desc : "$'Temperatur sensor\nto trigg for alarm'"
Value : "48.4"
Id : "42"
Name : "Eye1412"
Desc : "$'Solar sensor "Day /\nNight"'"
Value : "12199.21"
You can find a full working sample there: csvsample.sh.txt
In this sample, I use the header line to determine the row width (number of columns). If your header line could contain newlines (or if your CSV uses more than one header line), you will have to pass the number of columns as an argument to your script (along with the number of header lines).
Of course, parsing CSV this way is not perfect! It works for many simple CSV files, but take care with encoding and security! For example, this module won't be able to handle binary fields.
Read carefully csv.c source code comments and RFC 4180!

From the man page:
-d delim
The first character of delim is used to terminate the input line,
rather than newline.
You are using -d, which will terminate the input line on the comma. It will not read the rest of the line. That's why $y is empty.
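The difference is easy to check side by side. A small sketch, using here-strings as sample input (the names x2 and y2 are added here only to keep the two results apart):

```shell
#!/usr/bin/env bash
# -d, makes the comma the end-of-input marker: read stops after "a",
# then splits that single word on whitespace, so y gets nothing.
read -r -d, x y <<< "a,b,"
echo "with -d, : x='$x' y='$y'"

# IFS=, keeps newline as the terminator and splits the line on commas.
IFS=, read -r x2 y2 <<< "a,b"
echo "with IFS=,: x2='$x2' y2='$y2'"
```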

We can parse CSV files with quoted strings that are delimited by, say, | with the following code:
while read -r line
do
    field1=$(echo "$line" | awk -F'|' '{printf "%s", $1}' | tr -d '"')
    field2=$(echo "$line" | awk -F'|' '{printf "%s", $2}' | tr -d '"')
    echo "$field1 $field2"
done < "$csvFile"
awk parses the string fields to variables and tr removes the quote.
Slightly slower as awk is executed for each field.
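If the per-field awk calls become a bottleneck, the same result can probably be had with builtins only. A sketch under the assumption that the quotes never enclose a | character (the temporary sample file and the out array are made up for illustration):

```shell
#!/usr/bin/env bash
csvFile=$(mktemp)   # hypothetical sample file
printf '%s\n' '"alpha"|"beta"|"gamma"' '"one"|"two"|"three"' > "$csvFile"

out=()
# One read per line splits on |; the trailing _ swallows any extra fields.
while IFS='|' read -r field1 field2 _; do
    field1=${field1//\"/}   # strip every double quote, like tr -d '"'
    field2=${field2//\"/}
    out+=("$field1 $field2")
done < "$csvFile"

printf '%s\n' "${out[@]}"
rm -f "$csvFile"
```

No external process is started inside the loop, which is where most of the time goes in the awk and tr version.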

In addition to the answer from @Dennis Williamson, it may be helpful to skip the first line when it contains the header of the CSV:
{
    read -r    # consume and discard the header line
    while IFS=, read -r col1 col2; do
        echo "I got:$col1|$col2"
    done
} < myfile.csv

If you want to read a CSV file while skipping its first line, this is a solution:
i=1
while IFS=, read -ra line; do
    test $i -eq 1 && ((i=i+1)) && continue
    for col_val in "${line[@]}"; do
        echo -n "$col_val|"
    done
    echo
done < "$csvFile"


I have a .csv file which has dates and the answer about enjoyable or not:
What I want to do is print the day of the week in a third column, separated by ',', like this:
I tried:
dates=$(awk '{FS=","}{print $1,$2}' weather_stat.csv)
for vars in $dates[first_row]
do
    echo $(date -j -f '%Y-%m-%d' $vars "+%w")
done
The first part of the code works without any problem, but in the second part I am confused about how to get the data in the first row (I write dates[first_row] to mean the first row of the dates variable) out of the variable "dates" so that the date command can be applied to it.
For the third part, I want to merge these two tables together. I found the join command, but it seems to work on two files rather than two variables (I don't want to create any new files during the process).
Could anyone tell me how to get the rows from a variable instead of a file in shell, and how to merge two table-like variables?
As you're learning shell scripting, here's some code to study:
to read your csv file, and get the weekday number for each date in the file:
while IFS=, read -r date rest; do echo "$date,$(date -d "$date" +%w)"; done < file.csv
to join the output of that command with your file:
weekdays=$(while IFS=, read -r date rest; do echo "$date,$(date -d "$date" +%w)"; done < file.csv)
join -t, file.csv <(echo "$weekdays")
or, without needing to store the result in an intermediate variable
join -t, file.csv <(
    while IFS=, read -r date rest; do echo "$date,$(date -d "$date" +%w)"; done < file.csv
)
The newlines within the <() are not necessary, but useful for maintainable code.
However, you can see that this is less efficient because you have to process the file twice. With awk you only have to read through the file once.
With GNU awk:
awk 'BEGIN{FS=OFS=","}
{ split($1,a,"-")
t=sprintf("%0.4d %0.2d %0.2d 00 00 00",a[1],a[2],a[3]);
print $0,strftime("%w",mktime(t))
}' file.csv
With only your Bourne shell, so less efficient than awk if you have a lot of lines in your CSV file:
while IFS=, read date enjoy; do
date -d "$date" +"$date,$enjoy,%w"
done < your.csv

Input file, fruits.txt:
Expected output file:
For getting the above output, below code is used:
declare -A m_arr
cat fruits.txt > /tmp/ID.part
while read line
do
    Month=$(echo $line | cut -d, -f1)
    Fruits=$(echo $line | cut -d, -f2)
    m_arr[$Month]=$Fruits
done < /tmp/ID.part
for i in ${!m_arr[@]}
do
    echo "$i,${m_arr[$i]}"
done
This works fine for a small amount of data in the input file. I have 200 000 entries and observed that the cut command is very slow. I tried awk as well and did not get a better result. My requirement is to read the file from row 1, with column 1 as the key; I need the updated entry for each key.
I think this can be done pretty easily with Awk, you just need to hash the values of $1 in $2 once you delimit the file with a , separator
awk -v FS=, -v OFS=, '{key[$1]=$2; next}END{for (i in key) print i,key[i]}' file
Also, if you want to speed things up while processing a million-line file, you can change the locale settings for the parsing step by passing LC_ALL=C locally to the command. See Stéphane Chazelas's answer on what "LC_ALL=C" does.
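Put together, the "last value wins" hashing and the locale override might look like the sketch below. The three sample rows and the temporary file are made up for illustration; the final sort is only there because awk's for-in order is unspecified.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)   # hypothetical stand-in for fruits.txt
printf '%s\n' 'JAN,APPLE' 'FEB,APPLE' 'JAN,ORANGE' > "$tmp"

# LC_ALL=C applies to this one command only; each later row overwrites
# the value stored for its key, so the last entry per key survives.
result=$(LC_ALL=C awk -F, -v OFS=, '
    { key[$1] = $2 }
    END { for (i in key) print i, key[i] }
' "$tmp" | LC_ALL=C sort)

echo "$result"
rm -f "$tmp"
```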
In bash version 4, you can declare an associative array and populate it with the result of read, splitting your lines with a custom IFS:
$ declare -A a
$ while IFS=, read key value; do a["$key"]="$value"; done < fruits.txt
$ declare -p a
declare -A a=([MAR]="APPLE" [FEB]="APPLE" [JAN]="ORANGE" )
If you want to generate that specific output from the array, you'll also require a loop:
$ for key in "${!a[@]}"; do printf '%s,%s\n' "$key" "${a[$key]}"; done
The shortest one using GNU datamash:
datamash -st, -g1 last 2 <file
g1 - group by the 1st column
last 2 - keep the last value of the group
The output:

I need to split a TSV file by date using whatever standard CLI tools come with OS X 10.10; e.g. sed, awk, etc. FYI the shell is Bash
The input file has a header row and follows a tab-separated format (the date and time are in the first column). I'm adding "\t" below to show the tabs, and "…" to indicate that the rows have many more columns:
Transaction Date\t Account Number\t…
9/16/2004 12:00:00 AM\t ABC00147223\t…
9/17/2004 12:00:00 AM\t ABC00147223\t…
10/05/2004 12:00:00 AM\t ABC00147223\t…
The output should be:
A separate file for each unique year AND month (based on the example above I would get 2 output files: 9/2004 and 10/2004)
Maintain the first/header row of the original file
Filename in the form YYYYMM.txt
Thank you for your help.
If you want to do it purely in the Bash shell, do as below...
datafile=inputdatafile.dat   # input file name (assumed; adjust as needed)
ctr=0
while IFS= read -r line
do
    # counter to keep track of line number
    ctr=$((ctr + 1))
    # skip header line for processing
    if [[ $ctr -gt 1 ]]
    then
        # create filename using date field present in record
        vdate=${line%% *}                        # e.g. 9/16/2004
        vday1=${vdate%%/*}                       # first component (the month)
        vday=$(printf "%02d" "$((10#$vday1))")   # month with padding 0
        vyear=${vdate##*/}                       # year
        vfilename="${vyear}${vday}.txt"          # filename in YYYYMM.txt format
        # check if file exists or not then put header record in it
        if [ ! -f "$vfilename" ]; then
            head -1 "$datafile" > "$vfilename"
        fi
        # put the record in that file
        echo "$line" >> "$vfilename"
    fi
done < "$datafile"
I'm not sure how big your data files are, but it's never a good idea to parse large files with shell scripting; use other utilities such as awk, sed, or grep instead.
For big files, a nawk / gawk one-liner like the one below will do all you need.
# use nawk or gawk if you don't get the expected results using awk
$ nawk '{if(NR==1)h=$0;} {if(NR>1){ split($1,a,"/"); fn=sprintf("%04d%02d.txt",a[3],a[1]); if(system( "[ ! -f " fn " ] ")==0)print h >> fn; print >> fn;} }' inputdatafile.dat
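A variation on the one-liner that avoids calling system() for every row is to remember, inside awk, which output files have already received the header. This is a sketch, not a drop-in replacement: the working directory, the file name input.tsv and the array name seen are assumptions, and it relies on the number of distinct months staying below awk's open-file limit.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)   # hypothetical scratch directory for the demo
printf 'Transaction Date\tAccount Number\n9/16/2004 12:00:00 AM\tABC00147223\n9/17/2004 12:00:00 AM\tABC00147223\n10/05/2004 12:00:00 AM\tABC00147223\n' > "$workdir/input.tsv"

(
  cd "$workdir" &&
  awk -F'\t' '
    NR == 1 { header = $0; next }              # remember the header row
    {
      split($1, d, "[/ ]")                     # d[1]=month, d[2]=day, d[3]=year
      fn = sprintf("%04d%02d.txt", d[3], d[1]) # YYYYMM.txt
      if (!(fn in seen)) { print header > fn; seen[fn] = 1 }
      print > fn                               # same stream as the header
    }' input.tsv
)

ls "$workdir"
```

Within one awk run, `>` truncates a file only the first time it is opened, so the header and all later rows for that month end up in the same file.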

I'm relatively new to shell scripting and am writing a script to organize my music library. I'm using awk to parse the id3 tag info and am generating a newline separated list like so:
Kanye West
College Dropout
All Falls Down
I want to store each field in a separate variable so I can easily compose some mkdir and mv commands. I've tried piping the output to IFS=$'\n' read artist album title but each variable remains empty. I'm open to producing a different output from awk, but I still want to know how to parse a newline separated list using bash.
It turns out that piping directly to read like this:
id3info "$filename" | awk "$awkscript" | { read artist; read album; read title; }
WILL NOT WORK. The reads run in a subshell, so the variables exist in a different scope and are gone once the pipeline ends. I found that using a herestring works best:
{ read artist; read album; read title; } <<< "$(id3info "$filename" | awk "$awkscript")"
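The scope problem can be reproduced without id3info. In the sketch below printf stands in for the real pipeline, and process substitution is shown as an alternative to the herestring (the variable piped is added only to record the result; bash's default settings, with lastpipe off, are assumed):

```shell
#!/usr/bin/env bash
# Piping into read: the braces run in a subshell, so artist is lost afterwards.
unset artist
printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down' | { read -r artist; }
piped="${artist:-unset}"

# Process substitution keeps the reads in the current shell.
{ read -r artist; read -r album; read -r title; } \
    < <(printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down')

echo "after the pipe: $piped"
echo "artist='$artist' album='$album' title='$title'"
```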
read normally reads one line at a time. So, if your id3 info is in the file testfile.txt, you can read it in as follows:
{ read artist ; read album ; read song ; } <testfile.txt
echo "artist='$artist' album='$album' song='$song'"
# insert your mkdir and mv commands....
When run on your test file, the above outputs:
artist='Kanye West' album='College Dropout' song='All Falls Down'
You can just read the file into a bash array and loop through the array like so:
IFS=$'\r\n' content=($(cat "${filepath}"))
for ((idx = 0; idx < ${#content[@]}; idx+=3)); do
    echo "artist=${content[idx]} album=${content[idx+1]} title=${content[idx+2]}"
done
Or read three lines in a loop.
yourscript |
while read artist; do # read first line of input
    read album # read second line of input
    read song # read third line of input
    : self-destruct if the genre is rap
done
This loop will consume input lines in groups of three. If there is not an even multiple of three lines of input, the reads after that inside the loop will simply fail and the variables will be empty.
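If an incomplete trailing group should be dropped rather than processed with empty variables, the three reads can be chained with &&. A minimal sketch with made-up input (the records array is only there to collect the results):

```shell
#!/usr/bin/env bash
records=()
# The loop body runs only when all three reads succeed.
while read -r artist && read -r album && read -r song; do
    records+=("$artist / $album / $song")
done < <(printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down' 'Orphan line')

printf '%s\n' "${records[@]}"
```

The fourth input line starts a second group, but the read for album hits end of input, so that group never reaches the loop body.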
You can read the output from awk into an array. E.g.
readarray -t array <<< "$(printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down')"
for ((i=0; i<${#array[@]}; i++ )) ; do
    echo "array[$i]=${array[$i]}"
done
array[0]=Kanye West
array[1]=College Dropout
array[2]=All Falls Down

If I have a csv file, is there a quick bash way to print out the contents of only any single column? It is safe to assume that each row has the same number of columns, but each column's content would have different length.
You could use awk for this. Change '$2' to the nth column you want.
awk -F "\"*,\"*" '{print $2}' textfile.csv
yes. cat mycsv.csv | cut -d ',' -f3 will print 3rd column.
The simplest way I was able to get this done was to just use csvtool. I had other use cases as well to use csvtool and it can handle the quotes or delimiters appropriately if they appear within the column data itself.
csvtool format '%(2)\n' input.csv
Replacing 2 with the column number will effectively extract the column data you are looking for.
Landed here looking to extract from a tab separated file. Thought I would add.
cat textfile.tsv | cut -f2 -s
Here -f2 extracts field 2 (fields are numbered from 1, not 0), i.e. the second column, and -s suppresses lines that contain no delimiter.
Here is a csv file example with 2 columns
To get the first column, use:
cut -d, -f1 myTooth.csv
f stands for Field and d stands for delimiter
Running the above command will produce the following output.
To get the 2nd column only:
cut -d, -f2 myTooth.csv
And here is the output
Another use case:
Your CSV input file contains 10 columns and you want columns 2 through 5 and column 8, using comma as the separator.
cut uses -f (meaning "fields") to specify columns and -d (meaning "delimiter") to specify the separator. You need to specify the latter because some files may use spaces, tabs, or colons to separate columns.
cut -f 2-5,8 -d , myvalues.csv
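The same selection can be tried on a single made-up row without needing a file (the row and the variable names are illustrative only):

```shell
#!/usr/bin/env bash
row='c1,c2,c3,c4,c5,c6,c7,c8,c9,c10'

# Fields 2 through 5 plus field 8, comma-delimited.
selected=$(printf '%s\n' "$row" | cut -d , -f 2-5,8)
echo "$selected"
```

Note that cut always emits the selected fields in their original order, so listing them as 8,2-5 would give the same result.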
cut is a command-line utility; here is its usage synopsis:
cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-d delim] [-s] [file ...]
I think the easiest is using csvkit:
Gets the 2nd column:
csvcut -c 2 file.csv
However, there's also csvtool, and probably a number of other csv bash tools out there:
sudo apt-get install csvtool (for Debian-based systems)
This would return a column with the first row having 'ID' in it.
csvtool namedcol ID csv_file.csv
This would return the fourth column:
csvtool col 4 csv_file.csv
If you want to drop the header row:
csvtool col 4 csv_file.csv | sed '1d'
First we'll create a basic CSV
[dumb@one pts]$ cat > file
Then we get the 1st column
[dumb@one pts]$ awk -F , '{print $1}' file
Many answers to this question are great, and some have even looked into the corner cases.
I would like to add a simple answer for daily use, where you mostly don't run into those corner cases (like escaped commas or commas inside quotes).
FS (Field Separator) is the variable whose value defaults to a space, so by default awk splits any line at whitespace.
So using BEGIN (Execute before taking input) we can set this field to anything we want...
awk 'BEGIN {FS = ","}; {print $3}'
The above code will print the 3rd column in a csv file.
The other answers work well, but since you asked for a solution using just the bash shell, you can do this:
AirBoxOmega:~ d$ cat > file #First we'll create a basic CSV
And then you can pull out columns (the first in this example) like so:
AirBoxOmega:~ d$ while IFS=, read -a csv_line;do echo "${csv_line[0]}";done < file
So there's a couple of things going on here:
while IFS=, - this is saying to use a comma as the IFS (Internal Field Separator), which is what the shell uses to know what separates fields (blocks of text). So saying IFS=, is like saying "a,b" is the same as "a b" would be if the IFS=" " (which is what it is by default.)
read -a csv_line; - this is saying read in each line, one at a time, split it into fields, store them in an array called "csv_line", and send that to the "do" section of our while loop
do echo "${csv_line[0]}";done < file - now we're in the "do" phase, and we're saying echo the 0th element of the array "csv_line". This action is repeated on every line of the file. The < file part is just telling the while loop where to read from. NOTE: remember, in bash, arrays are 0 indexed, so the first column is the 0th element.
So there you have it, pulling out a column from a CSV in the shell. The other solutions are probably more practical, but this one is pure bash.
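The loop can be wrapped in a small function so the column number becomes a parameter. This is a sketch only: csv_col is a name chosen here, columns are counted from 1, and, like the loop above, it assumes no quoted fields containing commas.

```shell
#!/usr/bin/env bash
# csv_col N FILE: print the Nth (1-based) column of an unquoted CSV file.
csv_col() {
    local n=$1 file=$2 fields
    # The || test keeps a final line that lacks a trailing newline.
    while IFS=, read -r -a fields || [ "${#fields[@]}" -gt 0 ]; do
        printf '%s\n' "${fields[n-1]}"   # arrays are 0-indexed
    done < "$file"
}

tmp=$(mktemp)   # hypothetical sample file
printf 'a,b,c\n1,2,3\n' > "$tmp"
second=$(csv_col 2 "$tmp")
echo "$second"
rm -f "$tmp"
```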
You could use GNU Awk, see this article of the user guide.
As an improvement to the solution presented in the article (in June 2015), the following gawk command allows double quotes inside double quoted fields; a double quote is marked by two consecutive double quotes ("") there. Furthermore, this allows empty fields, but even this can not handle multiline fields. The following example prints the 3rd column (via c=3) of textfile.csv:
gawk -- '
BEGIN{
    FPAT="([^,\"]*)|(\"((\"\")*[^\"]*)*\")"
}
{
    if (substr($c, 1, 1) == "\"") {
        $c = substr($c, 2, length($c) - 2) # Get the text within the two quotes
        gsub("\"\"", "\"", $c) # Normalize double quotes
    }
    print $c
}
' c=3 < <(dos2unix <textfile.csv)
Note the use of dos2unix to convert possible DOS style line breaks (CRLF i.e. "\r\n") and UTF-16 encoding (with byte order mark) to "\n" and UTF-8 (without byte order mark), respectively. Standard CSV files use CRLF as line break, see Wikipedia.
If the input may contain multiline fields, you can use the following script. Note the use of special string for separating records in output (since the default separator newline could occur within a record). Again, the following example prints the 3rd column (via c=3) of textfile.csv:
gawk -- '
BEGIN{
    RS="\0" # Read the whole input file as one record;
    # assume there is no null character in input.
    FS="" # Suppose this setting eases internal splitting work.
    ORS="\n####\n" # Use a special output separator to show borders of a record.
}
{
    nof=patsplit($0, a, /([^,"\n]*)|("(("")*[^"]*)*")/, seps)
    field=0;
    for (i=1; i<=nof; i++){
        field++
        if (field==c) {
            if (substr(a[i], 1, 1) == "\"") {
                a[i] = substr(a[i], 2, length(a[i]) - 2) # Get the text within
                # the two quotes.
                gsub(/""/, "\"", a[i]) # Normalize double quotes.
            }
            print a[i]
        }
        if (seps[i]!=",") field=0
    }
}
' c=3 < <(dos2unix <textfile.csv)
There is another approach to the problem. csvquote rewrites the contents of a CSV file so that special characters inside fields are transformed, which lets the usual Unix text processing tools select a given column. For example, the following code outputs the third column:
csvquote textfile.csv | cut -d ',' -f 3 | csvquote -u
csvquote can be used to process arbitrarily large files.
I needed proper CSV parsing, not cut / awk and prayer. I'm trying this on a mac without csvtool, but macs do come with ruby, so you can do:
echo "require 'csv'; CSV.read('new.csv').each {|data| puts data[34]}" | ruby
I wonder why none of the answers so far have mentioned csvkit.
csvkit is a suite of command-line tools for converting to and working
with CSV
csvkit documentation
I use it exclusively for CSV data management and so far I have not found a problem that I could not solve using csvkit.
To extract one or more columns from a CSV file you can use the csvcut utility that is part of the toolbox. To extract the second column use this command:
csvcut -c 2 filename_in.csv > filename_out.csv
csvcut reference page
If the strings in the csv are quoted, add the quote character with the q option:
csvcut -q '"' -c 2 filename_in.csv > filename_out.csv
Install with pip install csvkit or sudo apt install csvkit.
Simple solution using awk. Instead of "colNum" put the number of column you need to print.
cat fileName.csv | awk -F ";" '{ print $colNum }'
csvtool col 2 file.csv
where 2 is the column you are interested in
you can also do
csvtool col 1,2 file.csv
to do multiple columns
You can't do it without a full CSV parser.
If you know your data will not be quoted, then any solution that splits on , will work well (I tend to reach for cut -d, -f1 | sed 1d), as will any of the CSV manipulation tools.
If you want to produce another CSV file, then xsv, csvkit, csvtool, or other CSV manipulation tools are appropriate.
If you want to extract the contents of one single column of a CSV file, unquoting them so that they can be processed by subsequent commands, this Python 1-liner does the trick for CSV files with headers:
python -c 'import csv,sys'$'\n''for row in csv.DictReader(sys.stdin): print(row["message"])'
The "message" inside of the print function selects the column.
If the CSV file doesn't have headers:
python -c 'import csv,sys'$'\n''for row in csv.reader(sys.stdin): print(row[1])'
Python's CSV library supports all kinds of CSV dialects, so if your CSV file uses different conventions, it's possible to support them with relatively little change to the code.
Been using this code for a while, it is not "quick" unless you count "cutting and pasting from stackoverflow".
It uses ${##} and ${%%} operators in a loop instead of IFS. It calls 'err' and 'die', and supports only comma, dash, and pipe as SEP chars (that's all I needed).
err() { echo "${0##*/}: Error:" "$@" >&2; }
die() { err "$@"; exit 1; }
# Return Nth field in a csv string, fields numbered starting with 1
csv_fldN() { fldN , "$1" "$2"; }
# Return Nth field in string of fields separated
# by SEP, fields numbered starting with 1
fldN() {
    local me="fldN: "
    local sep="$1"
    local fldnum="$2"
    local vals="$3"
    case "$sep" in
        -|,|\|) ;;
        *) die "$me: arg1 sep: unsupported separator '$sep'" ;;
    esac
    case "$fldnum" in
        [0-9]*) [ "$fldnum" -gt 0 ] || { err "$me: arg2 fldnum=$fldnum must be number greater than 0."; return 1; } ;;
        *) { err "$me: arg2 fldnum=$fldnum must be number"; return 1;} ;;
    esac
    [ -z "$vals" ] && err "$me: missing arg3 vals: list of '$sep' separated values" && return 1
    fldnum=$(($fldnum - 1))
    # drop one leading field per iteration until the wanted one is first
    while [ $fldnum -gt 0 ] ; do
        vals="${vals#*$sep}"
        fldnum=$(($fldnum - 1))
    done
    echo ${vals%%$sep*}
}
$ CSVLINE="example,fields with whitespace,field3"
$ for fno in $(seq 3); do echo field$fno: $(csv_fldN $fno "$CSVLINE"); done
field1: example
field2: fields with whitespace
field3: field3
You can also use while loop
while read name val; do
echo "............................"
echo Name: "$name"
