Linux: Troubleshoot programs with strace like a pro

Join Medium with my referral link - Konstantinos Patronas
Read every story from Konstantinos Patronas (and thousands of other writers on Medium). Your membership fee directly…

What can you do when a program crashes without any error or log indicating that something went wrong, and you don't have the source code? Use strace!

What does strace do?

strace is a useful diagnostic, instructional, and debugging tool. System administrators, diagnosticians and trouble-shooters will find it invaluable for solving problems with programs for which the source is not readily available since they do not need to be recompiled in order to trace them. Students, hackers and the overly-curious will find that a great deal can be learned about a system and its system calls by tracing even ordinary programs. And programmers will find that since system calls and signals are events that happen at the user/kernel interface, a close examination of this boundary is very useful for bug isolation, sanity checking and attempting to capture race conditions.

A simple example

Type in the following program and save it as readfile.cpp. It is a very simple C++ program that tries to open the file test.txt: on success it prints a message, on failure it exits without printing anything. As you can imagine, without the source code or any relevant documentation we cannot tell what is happening, because the program exits without leaving a clue.

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    const string fileName = "test.txt";

    // Try to open the file for reading.
    ifstream istr;
    istr.open(fileName.c_str());
    if (istr.fail())
    {
        // Fail silently: no message, only a non-zero exit status.
        exit(1);
    }
    cout << "File test.txt can be opened" << endl;
    return 0;
}

To compile the program, enter the following command. If the compilation succeeds, a file named readfile will appear in the same directory; this is our compiled program.

$ g++ readfile.cpp -o readfile

Now, if we run the program and test.txt exists in the same directory, we get the following output.

$ ./readfile
File test.txt can be opened

If test.txt does not exist, the program exits without any message.

$ ./readfile
kpatronas@prometheus:~$

Using strace

Delete test.txt and run the same program again under strace. With the -c option, strace prints a summary table that counts the calls and errors for each system call; here it recorded 3 system calls that returned an error.

$ strace -c ./readfile
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
37.42    0.001249          56        22           mmap
18.48    0.000617          88         7           mprotect
14.26    0.000476          79         6           pread64
 6.17    0.000206          41         5           close
 5.87    0.000196          65         3           brk
 5.36    0.000179          35         5           fstat
 3.45    0.000115          57         2         1 arch_prctl
 3.39    0.000113          18         6         1 openat
 2.91    0.000097          97         1           munmap
 2.70    0.000090          22         4           read
 0.00    0.000000           0         1         1 access
 0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.003338                    63         3 total

Let's examine the errors. strace accepts the -e option, which restricts the trace to specific system calls or classes of calls, such as file-related or network-related ones (for example -e trace=file or -e trace=network). Since we don't know which call failed, let's try a more generic approach and trace everything.

$ strace ./readfile
.
.
.
mprotect(0x558b0f738000, 4096, PROT_READ) = 0
mprotect(0x7f98a0a89000, 4096, PROT_READ) = 0
munmap(0x7f98a0a49000, 77360)           = 0
brk(NULL)                               = 0x558b10e2e000
brk(0x558b10e4f000)                     = 0x558b10e4f000
openat(AT_FDCWD, "test.txt", O_RDONLY)  = -1 ENOENT (No such file or directory)
exit_group(1)                           = ?
+++ exited with 1 +++

We can identify two things:

  • The program exited with a return code of 1, which indicates that it did not execute successfully.
  • The last failing system call before the exit was openat(AT_FDCWD, "test.txt", O_RDONLY) = -1 ENOENT (No such file or directory)

From this output we can understand that the program exited because test.txt could not be found. Create test.txt and run the program again.

$ strace ./readfile
.
.
.
openat(AT_FDCWD, "test.txt", O_RDONLY)  = 3
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x3), ...}) = 0
write(1, "File test.txt can be opened\n", 28File test.txt can be opened
) = 28
close(3)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

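Now we can see that the program exited normally and printed the message that we know indicates a successful run. (The message appears in the middle of the write() line because the program's output and strace's own output share the same terminal.)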

strace is a very useful tool with many options, but I believe a simple example is what you need to get started!
