Split the even and the odd lines of one column into two separate columns
Let’s start with the requirement in the title. You have this:
# file 1
1
A
2
B
3
C
And you want this:
1 A
2 B
3 C
In other words, you want odd rows (#1, #3, #5, etc.) in column 1 and even rows (#2, #4, #6, etc.) in column 2.
You can use the paste command like this:
$ paste --serial --delimiters '\t\n' <(echo '1
A
2
B
3
C')
1 A
2 B
3 C
That seemed like a hack to me, so let’s break down the command above step by step:
- <( command_to_run ): This is called process substitution. It lets the output of a command be treated as a file-like object. Here, the output of the echo command is used as the input “file” for the paste command.
- paste: This command merges lines from multiple files. Here it merges lines from the file-like object provided by the process substitution, so we have only 1 file.
- --delimiters '\t\n': This option specifies the delimiters to use between merged lines. Here it alternates between a tab character (\t) and a newline character (\n).
- --serial: I left this option for last because it’s a bit tricky. It tells paste to paste one file at a time instead of in parallel. Let’s see some examples first.
# file 1
1
2
3
# file 2
a
b
c
Default paste:
$ paste <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1 a
2 b
3 c
With serial option:
$ paste --serial <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1 2 3
a b c
# or with even more rows for file 1:
$ paste --serial <( echo -e '1\n2\n3\n4\n5' ) <( echo -e 'a\nb\nc' )
1 2 3 4 5
a b c
Also, an example if the files had 2 columns:
# file 1
1 Name1
2 Name2
3 Name3
# file 2
a 10
b 20
c 39
$ paste <( echo -e '1 Name1\n2 Name2\n3 Name3' ) <( echo -e 'a 10\nb 20\nc 39' )
1 Name1 a 10
2 Name2 b 20
3 Name3 c 39
And with 2 columns and --serial:
$ paste --serial <( echo -e '1 Name1\n2 Name2\n3 Name3' ) <( echo -e 'a\nb\nc' )
1 Name1 2 Name2 3 Name3
a b c
So, when you run paste, by default:
- It will go to the 1st line of the resulting file, and it will print the whole 1st line from file 1.
- In the same line, it will add a delimiter and it will print the whole 1st line from file 2.
- In the same line, it will add a delimiter and it will print the whole 1st line from file 3, and so on, for each file.
- It will go to the 2nd line of the resulting file, and it will print the whole 2nd line from file 1.
- In the same line, it will add a delimiter and it will print the whole 2nd line from file 2.
- In the same line, it will add a delimiter and it will print the whole 2nd line from file 3, and so on, for each file.
- And so on, for each line.
Hence the “merges lines from multiple files” description you saw before.
If you run paste with the --serial option, it will do the following:
- It will go to the 1st line of the resulting file, and it will print the whole 1st line from file 1.
- In the same line, it will add a delimiter and it will print the whole 2nd line from file 1.
- In the same line, it will add a delimiter and it will print the whole 3rd line from file 1, and so on, for each line of file 1.
- It will go to the 2nd line of the resulting file, and it will print the whole 1st line from file 2.
- In the same line, it will add a delimiter and it will print the whole 2nd line from file 2.
- In the same line, it will add a delimiter and it will print the whole 3rd line from file 2, and so on, for each line of file 2.
- And so on, for each file.
In other words, with the --serial option, paste prints all the lines of each file on a single output line, separated by the delimiter, and each file gets its own output line.
Without the --serial option, paste prints the Nth line of every file on a single output line, separated by the delimiter, and each input line number gets its own output line.
Let’s see some examples with the --delimiters option to understand what’s going on:
$ paste --delimiters '^' <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1^a
2^b
3^c
OK, it replaces the default tab delimiter with the ^ character, which makes sense.
What if we add another character to --delimiters? After all, man paste says: -d, --delimiters=LIST: reuse characters from LIST instead of TABs
$ paste --delimiters '^#' <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1^a
2^b
3^c
So, nothing changed here. That is because there was no need for a second delimiter: with only two files, each output line has a single gap to fill.
But if you use the --serial option, there is room for 2 delimiters because each file has 3 lines:
$ paste --delimiters '^#' --serial <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1^2#3
a^b#c
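As a small sketch of the word “reuse” in the man page: when there are more gaps than delimiters, paste starts again from the beginning of the list. The examples below read from standard input (the - operand) and use the short options -s and -d in place of --serial and --delimiters.

```shell
# Serial mode, five lines, two delimiters: the list wraps around after '#'.
printf '1\n2\n3\n4\n5\n' | paste -s -d '^#' -
# 1^2#3^4#5

# Parallel mode with three inputs: now the second delimiter is used too.
printf '1\n2\n3\n4\n5\n6\n' | paste -d '^#' - - -
# 1^2#3
# 4^5#6
```

In both modes the delimiter list restarts from its first character on every new output line.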
And now let’s use the tab and the newline as delimiters again, the same as in the original solution:
$ paste --delimiters '\t\n' --serial <( echo -e '1\n2\n3' ) <( echo -e 'a\nb\nc' )
1 2
3
a b
c
Not the result we wanted to achieve in the title, but, in our defense, the input is different! And I also hope that the original solution makes more sense now.
If you use --serial and limit the input to 1 file with the odd and even rows interleaved, you get the result from the title:
$ paste --delimiters '\t\n' --serial <( echo -e '1\na\n2\nb\n3\nc' )
1 a
2 b
3 c
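The examples so far feed paste through process substitution, which is a bash feature. As a sketch of how the same idea might look with a real file or a pipe, the snippet below wraps the command in a helper. The name pair_lines and the temporary sample file are made up for this example; -s and -d are the short forms of --serial and --delimiters.

```shell
# pair_lines is a hypothetical helper name, not part of paste.
pair_lines() {
  paste -s -d '\t\n' "$@"
}

# Hypothetical sample file standing in for your real data.
sample=$(mktemp)
printf '1\nA\n2\nB\n3\nC\n' > "$sample"

pair_lines "$sample"       # read from a file on disk
pair_lines - < "$sample"   # read from standard input
paste - - < "$sample"      # alternative: two stdin "columns", no --serial

rm -f "$sample"
```

All three commands print the same three tab-separated rows. The last form works because each - takes the next line from standard input, so paste fills column 1 and column 2 in turn.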
Quiz
You have the following file:
1
a
A
2
b
B
3
c
C
How do you transform it to:
1 a A
2 b B
3 c C
Solution:
# notice the 2 spaces in the delimiters: one per gap between the 3 columns
$ paste --delimiters '  \n' --serial \
<( echo -e '1\na\nA\n2\nb\nB\n3\nc\nC' )
1 a A
2 b B
3 c C
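The quiz generalizes to any number of columns: N columns need N-1 separators followed by a newline. Here is a rough sketch of a helper that builds that delimiter list. The name to_columns and its single argument are assumptions made for this example, not something paste provides.

```shell
# to_columns N: hypothetical helper that reads stdin and emits N
# space-separated columns by building the delimiter list for paste.
to_columns() {
  n=$1
  delims=''
  i=1
  while [ "$i" -lt "$n" ]; do
    delims="$delims "   # one space per gap between columns
    i=$((i + 1))
  done
  # "$delims\n" keeps the backslash, so paste sees e.g. '  \n' for N=3.
  paste -s -d "$delims\n" -
}

printf '1\na\nA\n2\nb\nB\n3\nc\nC\n' | to_columns 3
# 1 a A
# 2 b B
# 3 c C
```

For a fixed column count, paste -d ' ' - - - gives the same output for this input without --serial, because the single space is reused for both gaps.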
Use awk if you want more flexibility
You can achieve the same result with awk and the NR built-in variable (it stores the current row number):
$ awk '{ if (NR % 2) { printf "%s", $1 } else { printf "\t%s\n", $1}}' \
<(echo "1
A
2
B
3
C")
1 A
2 B
3 C
The above says: if the row number is odd, print the first column; otherwise (the row number is even), print a tab character, the first column, and a newline character. This repeats for every line in the file.
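As one illustration of that flexibility, the same NR trick can be stretched to any number of columns. This is only a sketch: the helper name awk_columns and the awk variable n are invented for this example, and it prints the whole line ($0) instead of the first column ($1).

```shell
# awk_columns N: hypothetical helper that reads stdin and emits N
# tab-separated columns.
awk_columns() {
  awk -v n="$1" '
    {
      # Every value except the first of a group gets a leading tab.
      printf "%s%s", ((NR - 1) % n ? "\t" : ""), $0
      # The n-th value of a group ends the output line.
      if (NR % n == 0) print ""
    }
    # Terminate the last line when the input ends mid-group.
    END { if (NR % n) print "" }
  '
}

printf '1\na\nA\n2\nb\nB\n3\nc\nC\n' | awk_columns 3
# 1 a A
# 2 b B
# 3 c C
```

Unlike the two-column one-liner, the END block makes sure the output still ends with a newline when the number of input lines is not a multiple of n.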