Get specific columns from multiple files and paste (combine) them into a new file
Assume that the first file shows the disk usage of each file in column 1 and the file path in column 2:
# file 1
$ du -sh ~/bin/*
0 /c/Users/asdf/bin/abswin
0 /c/Users/asdf/bin/abwin
37M /c/Users/asdf/bin/BrowserStackLocal.exe
0 /c/Users/asdf/bin/cloc
0 /c/Users/asdf/bin/curltime
960K /c/Users/asdf/bin/cvdump.exe
And file 2 has the file size in column 5:
# "file" 2
$ ls -lha ~/bin/*
lrwxrwxrwx 1 asdf 197121 27 Nov 21 2022 /c/Users/asdf/bin/abswin -> /c/abs-from-laragon/abs.exe
lrwxrwxrwx 1 asdf 197121 56 Nov 20 2022 /c/Users/asdf/bin/abwin -> /c/laragon/bin/apache/httpd-2.4.47-win64-VS16/bin/ab.exe*
-rwxr-xr-x 1 asdf 197121 37M Dec 16 2022 /c/Users/asdf/bin/BrowserStackLocal.exe*
lrwxrwxrwx 1 asdf 197121 66 Oct 28 2022 /c/Users/asdf/bin/cloc -> /c/Users/asdf/Desktop/dev/personal-projects/shell-scripts/clock.sh*
lrwxrwxrwx 1 asdf 197121 69 Nov 2 2022 /c/Users/asdf/bin/curltime -> /c/Users/asdf/Desktop/dev/personal-projects/shell-scripts/curltime.sh*
-rwxr-xr-x 1 asdf 197121 957K May 3 14:40 /c/Users/asdf/bin/cvdump.exe*
If you don't have a ~/bin directory, you can grab some files from the /bin directory. For example, replace du -sh ~/bin/* with du -sh /bin/* | head -n 5. The bin folder of your home directory (the ~ character means home directory) is in the PATH variable, so you can refer to and execute its files from any folder.
Let's assume that I want the following output: first, column 2 of file 1 (name); second, column 5 of file 2 (size); and finally, column 1 of file 1 (disk usage):
/c/Users/asdf/bin/abswin 27 0
/c/Users/asdf/bin/abwin 56 0
/c/Users/asdf/bin/BrowserStackLocal.exe 37M 37M
/c/Users/asdf/bin/cloc 66 0
/c/Users/asdf/bin/curltime 69 0
/c/Users/asdf/bin/cvdump.exe 957K 960K
Use the cut --fields LIST or awk '{print $5}' command to print specific columns from a file, and use the paste utility to combine them into a new file. See the following command for example:
$ paste \
<(du -sh ~/bin/* | cut -f 2) \
<(ls -lha ~/bin/* | awk '{print $5}') \
<(du -sh ~/bin/* | cut -f 1)
/c/Users/asdf/bin/abswin 27 0
/c/Users/asdf/bin/abwin 56 0
/c/Users/asdf/bin/BrowserStackLocal.exe 37M 37M
/c/Users/asdf/bin/cloc 66 0
/c/Users/asdf/bin/curltime 69 0
/c/Users/asdf/bin/cvdump.exe 957K 960K
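If the two inputs are saved as real files instead of being produced on the fly, the same idea could look roughly like the sketch below. The file names (file1.txt, file2.txt, and so on) and the sample rows are hypothetical, made up only for illustration. It assumes file 1 is tab-separated, as du output is, and that file 2 is whitespace-separated, as ls -l output is.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for the saved output of du and ls.
tmpdir=$(mktemp -d)

# file1.txt: size <TAB> path (du separates its two columns with a tab)
printf '0\t/home/me/bin/cloc\n960K\t/home/me/bin/cvdump.exe\n' > "$tmpdir/file1.txt"

# file2.txt: ls -l style lines, with the size in the fifth column
printf '%s\n' \
  'lrwxrwxrwx 1 me me 66 Oct 28 2022 /home/me/bin/cloc -> /home/me/clock.sh' \
  '-rwxr-xr-x 1 me me 957K May 3 14:40 /home/me/bin/cvdump.exe' > "$tmpdir/file2.txt"

# Extract each wanted column into its own file.
cut -f 2 "$tmpdir/file1.txt" > "$tmpdir/names.txt"
awk '{print $5}' "$tmpdir/file2.txt" > "$tmpdir/ls_sizes.txt"
cut -f 1 "$tmpdir/file1.txt" > "$tmpdir/du_sizes.txt"

# paste joins line N of every input with a tab.
combined=$(paste "$tmpdir/names.txt" "$tmpdir/ls_sizes.txt" "$tmpdir/du_sizes.txt")
printf '%s\n' "$combined"

rm -r "$tmpdir"
```

This version leaves intermediate files behind until you delete them, which is the main thing process substitution avoids.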
In the command above, I use process substitution <(...) to feed the output of the du, cut, and awk commands to the paste utility as input. Think of process substitution as something that creates a temporary file that doesn't get saved anywhere. The du command reports the disk usage of files, the cut command extracts specific fields, and the awk command prints the fifth field.
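To see process substitution in isolation, here is a minimal sketch that uses printf instead of du and ls so that it runs anywhere. It assumes bash (zsh and ksh also support the syntax; plain POSIX sh does not), and the variable name is only illustrative.

```shell
#!/usr/bin/env bash
# Each <(...) expands to a path that paste reads like a regular file.
letters_and_numbers=$(paste <(printf 'a\nb\n') <(printf '1\n2\n'))
printf '%s\n' "$letters_and_numbers"

# Printing the expansion shows the path the shell hands to the command,
# typically something like /dev/fd/63 (the exact value varies by system).
echo <(true)
```

The first command prints two tab-separated lines, pairing a with 1 and b with 2.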
I also use Linux pipes (|) to pass the output of one command to the next. Single-letter options can be combined, e.g. -sh = -s + -h. The -h option in all the previous commands means human-readable, the -s option of du means summarize (one total per argument), and the -f option of cut means field (or column). If you don't understand a command, copy and paste it into the Explain shell web application.