1

I have the following CSV file called test.csv:

First Line, 100
Second Line, 200
Third Line, 300
Fourth Line, 400

I want to split each line at the comma and store the first part into an array in a bash script using awk. I have done the following:

#!/bin/bash
declare -a MY_ARRAY

MY_ARRAY=$(awk -F ',' '{print $1}' test.csv)
echo 'Start'
for ENTRY in ${MY_ARRAY}
do
echo ${ENTRY}
done
echo 'Stop'

And the output is as follows:

Start
First
Line
Second
Line
Third
Line
Fourth
Line
Stop

How can I get the array to hold the following?

First Line
Second Line
Third Line
Fourth Line
Harry Boy
  • Why do you want to use `awk` to populate a bash array? In every case it would be easier to use bash directly. Because of the spaces inside the fields you need more bash code than `array=( $(awk ...) )` anyway (note the `( )` around `$( )`; you forgot them). – Socowi Mar 27 '21 at 00:01
  • A few issues with the current code: **a)** assigning values to an array requires wrapping the right side of the assignment in parentheses, e.g., `MY_ARRAY=( $(awk ... ) )`; **b)** since the `awk` output lines contain white space, you need to set the field separator to `\n` so that each line becomes one array element, e.g., `IFS=$'\n' MY_ARRAY=( $(awk ... ) )` (note that with no command on the line the `IFS` change persists in the current shell, so restore it afterwards); **c)** to loop over the individual elements of the array, expand it with the `@` index (wrapped in `[]`) inside double quotes, e.g., `for ENTRY in "${MY_ARRAY[@]}"` – markp-fuso Mar 27 '21 at 15:18
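The three points from the comment above can be sketched as a corrected version of the posted script. This is only an illustration: the temporary file stands in for test.csv so the snippet runs on its own, and `csv` and `old_ifs` are names made up for the sketch. It relies on unquoted word splitting, so it assumes the fields contain no glob characters such as `*`.

```shell
#!/bin/bash
# Recreate the sample data in a temporary file (stand-in for test.csv).
csv=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$csv"

# (b) Split the awk output on newlines only; save IFS so it can be restored.
old_ifs=$IFS
IFS=$'\n'
# (a) The outer parentheses make this an array assignment.
MY_ARRAY=( $(awk -F ',' '{print $1}' "$csv") )
IFS=$old_ifs

echo 'Start'
# (c) "${MY_ARRAY[@]}" expands to one word per element, spaces preserved.
for ENTRY in "${MY_ARRAY[@]}"; do
    echo "${ENTRY}"
done
echo 'Stop'

rm -f "$csv"
```

Saving and restoring `IFS` explicitly keeps later commands in the script from inheriting the newline-only separator.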

2 Answers

2

If you want to populate a bash array with the first column of a CSV file, you do not have to use awk. Please try this instead:

readarray -t my_array < <(cut -d, -f1 test.csv)
for e in "${my_array[@]}"; do
    echo "$e"
done

Output:

First Line
Second Line
Third Line
Fourth Line
  • The command cut -d, -f1 file splits each line on the comma and prints the first field.
  • The expression <(command) is a process substitution; it makes the output of the command available as a file that another command (readarray in this case) can read through a redirection.
  • The command readarray -t my_array reads lines from the standard input and stores each line, without its trailing newline, as one element of my_array.
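Since the question asked for awk specifically, the same readarray pattern also works with the original awk command in place of cut. A minimal sketch, assuming bash 4 or later (where readarray is available); the temporary file and the variable name `csv` are stand-ins for test.csv added for illustration:

```shell
#!/bin/bash
# Recreate the sample data in a temporary file (stand-in for test.csv).
csv=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$csv"

# awk prints the first comma-separated field of each line;
# readarray -t stores each output line as one array element.
readarray -t my_array < <(awk -F ',' '{print $1}' "$csv")

echo "${#my_array[@]}"            # prints 4
printf '[%s]\n' "${my_array[@]}"  # prints [First Line] ... [Fourth Line], one per line

rm -f "$csv"
```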

Going back to your posted script, the variable MY_ARRAY is not assigned as an array because the right-hand side is not wrapped in parentheses. It just holds a single string that contains spaces and newlines. When it is expanded unquoted, as in for ENTRY in ${MY_ARRAY}, bash's word splitting breaks the string on both the spaces and the newlines, which is why "First" and "Line" come out as separate entries.
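The difference can be seen by counting what each form yields. The following is a small sketch with two made-up lines of data; the names `scalar`, `split_count`, and `lines` are illustrative only:

```shell
#!/bin/bash
# A single string holding two lines, as the posted script's MY_ARRAY did.
scalar=$(printf '%s\n' 'First Line' 'Second Line')

# Unquoted expansion: word splitting breaks on spaces and newlines alike.
split_count=0
for word in $scalar; do
    split_count=$((split_count + 1))
done

# readarray -t splits on newlines only, giving one element per line.
readarray -t lines <<< "$scalar"

echo "$split_count"    # prints 4
echo "${#lines[@]}"    # prints 2
```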

tshiono
1
$ cat input 
First Line, 100
Second Line, 200
Third Line, 300
Fourth Line, 400

$ cat so.sh 
#!/bin/bash

# Split each input line on commas; element 0 of the array is the first field.
while IFS=',' read -r -a array; do
    echo "${array[0]}"
done

$ cat input | ./so.sh 
First Line
Second Line
Third Line
Fourth Line
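The loop above prints the first field of each line but does not keep the values. One way to collect them into an array, as the question asks, is sketched below. The temporary file stands in for the input file, and `csv`, `first`, and `first_fields` are names chosen for the sketch. The input is redirected into the loop rather than piped, so the loop runs in the current shell and the array is still populated afterwards.

```shell
#!/bin/bash
# Recreate the sample data in a temporary file (stand-in for the input file).
csv=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$csv"

first_fields=()
# IFS=',' applies only to read: the first field goes to "first",
# the remainder of the line is discarded into "_".
while IFS=',' read -r first _; do
    first_fields+=( "$first" )
done < "$csv"

printf '%s\n' "${first_fields[@]}"

rm -f "$csv"
```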


dgan