Linux: how to find files in a directory and compress them
One of the most common operations as a sysadmin / devops engineer is to find files in a directory and compress them. Let's see some handy examples.
Finding files and compressing them individually
Assume that we have the following files and we want to compress each of them using gzip:
❯ ls -ltrh
total 32
-rw-r--r-- 1 kpatronas staff 5B May 31 22:44 1.txt
-rw-r--r-- 1 kpatronas staff 5B May 31 22:44 2.txt
-rw-r--r-- 1 kpatronas staff 5B May 31 22:44 3.txt
-rw-r--r-- 1 kpatronas staff 5B May 31 22:44 4.txt
To do this, enter the following in the terminal. In this example find selects only regular files whose names end in .txt, and the -exec parameter instructs find to run gzip on each of them. The listing afterwards shows that every file has been compressed:
❯ find . -name '*.txt' -type f -exec gzip "{}" \;
❯ ls -ltrh
total 32
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 1.txt.gz
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 2.txt.gz
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 3.txt.gz
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 4.txt.gz
Adding a timestamp to compressed files
Sometimes you need a timestamp appended to the names of the compressed files. This is simple to do with find and gzip:
❯ find . -type f -name "*.txt" -exec gzip --suffix "$(date +"_%Y%d%m.gz")" {} \;
❯ ls -ltrh
total 32
-rw-r--r-- 1 kpatronas staff 57B May 31 22:44 1.txt_20243105.gz
-rw-r--r-- 1 kpatronas staff 57B May 31 22:44 2.txt_20243105.gz
-rw-r--r-- 1 kpatronas staff 57B May 31 22:44 3.txt_20243105.gz
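-rw-r--r-- 1 kpatronas staff 57B May 31 22:44 4.txt_20243105.gz
In this case we used three commands: find, gzip, and date. date accepts a format string, which in our case is %Y%d%m (year, day, month), so files compressed on 31 May 2024 get the suffix _20243105.gz. The shell expands $(date ...) once, before find starts, so every file receives the same suffix.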
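If you want to try the timestamp suffix without touching real files, the following is a minimal sketch under a few assumptions: the names workdir and suffix and the two sample files are made up for illustration, -S is used as the short form of --suffix, and the find expression ends in + instead of \; so that several files are handed to a single gzip call.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small sample files (hypothetical content).
printf 'test\n' > 1.txt
printf 'test\n' > 2.txt

# Build the suffix once. The format matches the article: year, day, month.
# "%Y%m%d" (year, month, day) would make the names sort chronologically.
suffix="_$(date +%Y%d%m).gz"

# -S sets the suffix gzip appends to each compressed file.
# "+" lets find pass many file names to one gzip invocation.
find . -type f -name '*.txt' -exec gzip -S "$suffix" {} +

# Show the resulting file names.
ls
```

Keeping the suffix in a variable also makes it easy to reuse the same value later, for example when decompressing with gzip -d -S "$suffix".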
Compressing files to a single file
What if we want to compress the files we found into a single archive? That is easy. In your terminal, enter:
❯ find . -type f -name "*.txt" -print0 | tar -czvf all.tar.gz --null -T -
a ./4.txt
a ./3.txt
a ./2.txt
a ./1.txt
❯ ls -ltrh
total 40
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 1.txt
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 2.txt
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 3.txt
-rw-r--r-- 1 kpatronas staff 31B May 31 22:44 4.txt
-rw-r--r-- 1 kpatronas staff 223B May 31 23:09 all.tar.gz
We can see that a file named all.tar.gz was created, which contains all the *.txt files in this directory. The original files are left in place.
- -print0 makes find write the file names to stdout, separated by a null character
- the tar command reads the file names from stdin and packs them into a gzip-compressed archive named all.tar.gz
- --null tells tar that the file names are separated by a null character
- -T - tells tar to read the list of file names from stdin
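To check that the options above behave as described, here is a small sketch that can be run safely in a scratch directory. The directory, the sample files, and the file name containing a space are assumptions added for illustration. The -v flag is dropped to keep the output short, and tar -tzf is used to list the archive afterwards.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files (hypothetical); one name contains a space.
printf 'one\n' > 1.txt
printf 'two\n' > 'my notes.txt'

# find writes NUL-terminated names; tar reads the list from stdin (-T -)
# and --null tells it the names are NUL-separated. NUL separation keeps
# names with spaces or even newlines unambiguous.
find . -type f -name '*.txt' -print0 | tar -czf all.tar.gz --null -T -

# List the archive contents to confirm both files were stored.
tar -tzf all.tar.gz
```

Note that --null is placed before -T, as in the command above, so that it applies to the file list that follows.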
Conclusion
In this article, we saw several handy tricks using find, gzip, and tar. I hope you found it useful and that it makes your sysadmin / devops life easier!