Linux: Troubleshoot programs with strace like a pro
What can you do when a program crashes without any error or log indicating that something went wrong, and you don't have the source code? Use strace!
What does strace do?
strace is a useful diagnostic, instructional, and debugging tool. System administrators, diagnosticians and trouble-shooters will find it invaluable for solving problems with programs for which the source is not readily available since they do not need to be recompiled in order to trace them. Students, hackers and the overly-curious will find that a great deal can be learned about a system and its system calls by tracing even ordinary programs. And programmers will find that since system calls and signals are events that happen at the user/kernel interface, a close examination of this boundary is very useful for bug isolation, sanity checking and attempting to capture race conditions.
A simple example
Type the following program and save it as readfile.cpp. It is a very simple C++ program that tries to open the file test.txt. On success it prints a message; on failure it exits without printing anything. Without the source code or any relevant documentation, we would have no way of telling what went wrong, because the program exits without leaving a clue.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
    const string fileName = "test.txt";
    ifstream istr;
    // Try to open test.txt in the current directory
    istr.open(fileName.c_str());
    if (istr.fail())
    {
        // Fail silently: no message, only a non-zero exit status
        exit(1);
    }
    cout << "File test.txt can be opened" << endl;
    return 0;
}
To compile the program, enter the following command. If the compilation is successful, a file named readfile will appear in the same directory; this is our compiled program.
$ g++ readfile.cpp -o readfile
Now, if we run the program and test.txt exists in the same directory, we get the following output.
$ ./readfile
File test.txt can be opened
If test.txt does not exist, the program exits without any message.
$ ./readfile
kpatronas@prometheus:~$
Using strace
Delete test.txt and run the same program again, this time under strace. The -c option prints a summary table instead of the full trace, and its errors column shows that three system calls returned an error.
$ strace -c ./readfile
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 37.42    0.001249          56        22           mmap
 18.48    0.000617          88         7           mprotect
 14.26    0.000476          79         6           pread64
  6.17    0.000206          41         5           close
  5.87    0.000196          65         3           brk
  5.36    0.000179          35         5           fstat
  3.45    0.000115          57         2         1 arch_prctl
  3.39    0.000113          18         6         1 openat
  2.91    0.000097          97         1           munmap
  2.70    0.000090          22         4           read
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.003338                    63         3 total
Let's examine the errors. strace accepts the -e option, which can restrict the trace to a particular class of system calls, such as file-related or network-related ones. Since we don't yet know what kind of error we are looking for, let's start with a more generic approach and trace everything.
$ strace ./readfile
.
.
.
mprotect(0x558b0f738000, 4096, PROT_READ) = 0
mprotect(0x7f98a0a89000, 4096, PROT_READ) = 0
munmap(0x7f98a0a49000, 77360) = 0
brk(NULL) = 0x558b10e2e000
brk(0x558b10e4f000) = 0x558b10e4f000
openat(AT_FDCWD, "test.txt", O_RDONLY) = -1 ENOENT (No such file or directory)
exit_group(1) = ?
+++ exited with 1 +++
We can identify two things:
- The program exited with a return code of 1, which indicates that it did not execute successfully.
- The last error before the exit was openat(AT_FDCWD, "test.txt", O_RDONLY) = -1 ENOENT (No such file or directory)
From the strace output we can see that the program exited because test.txt could not be found. Create test.txt and run the program again.
$ strace ./readfile
.
.
.
openat(AT_FDCWD, "test.txt", O_RDONLY) = 3
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x3), ...}) = 0
write(1, "File test.txt can be opened\n", 28File test.txt can be opened
) = 28
close(3) = 0
exit_group(0) = ?
+++ exited with 0 +++
Now we can see that the program exited normally, with exit code 0, and printed the message that tells us it ran correctly.
strace is a very useful tool with many options, but I believe a simple example is what you need to get started!