Below are some useful Linux commands for diagnosing issues.
To see processes, their parent/child hierarchy and their resource usage, we can use pstree, ps, top -H etc.
Example:
pstree
init─┬─Xvnc
     ├─crond
     ├─firefox───9*[{firefox}]
     ├─gnome-terminal─┬─bash
     │                ├─gnome-pty-helpe
     │                └─{gnome-terminal}
     ├─gnome-terminal─┬─bash───su───bash───startWebLogic.s───java───103*[{java}]
ps -e f
  132 ?        Sl     0:01 gnome-terminal
  137 ?        S      0:00  \_ gnome-pty-helper
  138 pts/1    Ss     0:00  \_ bash
  353 pts/1    S      0:00  |   \_ su
  357 pts/1    S+     0:00  |       \_ bash
  460 pts/1    S      0:00  |           \_ /bin/sh ./startWebLogic.sh
  510 pts/1    Sl    26:44  |               \_ /...../jdk/bin/java -server -Xms256m -Xmx1024m -Dwe
  993 pts/0    Ss+    0:00  \_ bash
ps -ef, on the other hand, gives the full command line of each process.
It also shows the parent process ID (PPID) and CPU utilization (C). For memory columns such as RSS, use ps aux or ps -eo pid,ppid,pcpu,rss,args.
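Where pstree is not installed, the same parent chain can be read from /proc. The sketch below walks from a process up to PID 1 using the Name and PPid fields of /proc/<pid>/status; ancestors is a hypothetical helper name, and the loop assumes a Linux /proc filesystem.

```shell
# Hypothetical helper: print "PID NAME" for a process and each of its
# ancestors, starting at the given PID (default: the current shell).
ancestors() {
    pid=${1:-$$}
    while [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        name=$(awk '/^Name:/ { print $2 }' "/proc/$pid/status")
        echo "$pid $name"
        # Move up to the parent; PPid is 0 once the top of the tree is reached.
        pid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")
        pid=${pid:-0}
    done
}

ancestors "$$"
```

For the example above, ancestors 510 would be expected to list java, the startup script, bash, su, bash and gnome-terminal in that order.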
To list the files that a process currently has open, lsof is the command:
/usr/sbin/lsof -p 510
The WebLogic server, which is a JVM, has 2K+ files open. An example line is below:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 510 root cwd DIR 202,2 4096 5657417 .../DefaultDomain
FD is the file descriptor number, or one of the special values below:
cwd current working directory;
mem memory-mapped file;
mmap memory-mapped device;
pd parent directory;
rtd root directory;
tr kernel trace file
TYPE is the file type, like REG (regular file) or DIR (directory).
NODE is the inode number.
DEVICE is the device number pair (major,minor) of the device that holds the file.
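If lsof is not available, the numbered descriptors of a process can also be read from /proc/<pid>/fd. This is only a rough sketch: the helper names are hypothetical, and the count will be lower than the lsof line count because lsof also lists entries such as cwd, rtd and mem that are not numbered descriptors.

```shell
# Hypothetical helper: count the numbered file descriptors a process holds.
count_fds() {
    ls "/proc/${1:-$$}/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: show what each descriptor points to.
fd_targets() {
    for fd in "/proc/${1:-$$}/fd"/*; do
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

count_fds "$$"
```

Reading another user's process this way needs the same privileges lsof would need, typically root.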
There are a couple of other useful commands for debugging issues:
top -H -p <pid> gives thread-level CPU utilization, which is useful when debugging high CPU consumption issues in a JVM.
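When the busy thread belongs to a JVM, top -H shows its thread ID in decimal, while Java thread dumps (for example from jstack) on many JDK versions print the same native ID in hex as nid=0x.... A minimal sketch of the conversion follows; tid_to_nid is a hypothetical helper name, and 510 is simply the PID from the example above.

```shell
# Hypothetical helper: convert a decimal thread ID from "top -H" into the
# hex form used for nid=0x... in a Java thread dump.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# One batch-mode snapshot of per-thread CPU usage (510 is a placeholder PID):
#   top -H -b -n 1 -p 510 | head -n 20
tid_to_nid 510   # prints 0x1fe
```

Searching the thread dump for the resulting value then points at the Java stack of the thread that is consuming CPU.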
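strace is another useful command to see what a process is doing at the OS level, i.e. its system calls such as socket connections, reads etc. For example, strace -f -p <pid> attaches to a running process and all of its threads.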
netstat with options like -nap lists each connection with its state, its owning process and its Recv-Q/Send-Q sizes. Queues that stay non-zero can indicate a slow network or a program that is slow or blocked.
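The netstat output is easier to read once it is summarized. The sketch below assumes the usual Linux net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the helper names are hypothetical and both read netstat output on stdin.

```shell
# Hypothetical helper: count TCP connections per state, most common first.
count_states() {
    awk '$1 ~ /^tcp/ { c[$6]++ } END { for (s in c) print c[s], s }' | sort -rn
}

# Hypothetical helper: keep only TCP sockets with data waiting in
# Recv-Q (column 2) or Send-Q (column 3).
queued() {
    awk '$1 ~ /^tcp/ && ($2 > 0 || $3 > 0)'
}

# Usage:
#   netstat -nat | count_states
#   netstat -nap | queued
```

A large number of connections stuck in one state, or the same sockets showing up in queued run after run, is usually the place to start looking.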
Disk space check commands like df -h and du -sh . are useful to verify sizes and free space on local disks and network shares.
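To pick out only the filesystems that are nearly full, the df output can be filtered. This is a small sketch assuming the POSIX layout printed by df -P (Capacity in column 5, mount point in column 6); fs_over is a hypothetical helper name, and mount points containing spaces would be cut short.

```shell
# Hypothetical helper: read "df -P" output on stdin and print the mount
# points whose usage is at or above the given percentage (default 90).
fs_over() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6, $5
    }'
}

# Usage:
#   df -P | fs_over 80
#   du -sh ./* | sort -h | tail -n 5    # largest entries in the current directory
```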
free -g is another useful command to know how much memory has been consumed in physical, swap and cache/buffer areas.
sar is another great collection of metrics, ranging from processes, memory, swap activity, CPU and load average to disk, network etc.
To flush out the cache and buffers, write to /proc/sys/vm/drop_caches. An example is shown below:
free -m
total used free shared buffers cached
Mem: 15500 15330 169 0 249 11778
-/+ buffers/cache: 3302 12197
Swap: 10047 1704 8342
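echo 1 > /proc/sys/vm/drop_caches --- this will flush out the cache and buffers (the page cache). It needs root, and only clean pages are dropped, so run sync first if dirty pages should go too.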
free -m
total used free shared buffers cached
Mem: 15500 3367 12132 0 0 635
-/+ buffers/cache: 2731 12768
Swap: 10047 1704 8342
The above is useful if you want something to be loaded into memory again after some runtime changes, and also when testing something like disk speed, so that reads really come from disk and not from the cache.
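The "-/+ buffers/cache" row in the free output above is plain arithmetic on the "Mem:" row, which shows how much memory applications really hold. A small sketch using the MB figures from the example follows; app_used and app_free are hypothetical helper names.

```shell
# Hypothetical helpers mirroring the "-/+ buffers/cache" row of free.
# Arguments are MB values taken from the "Mem:" row.
app_used() { echo $(( $1 - $2 - $3 )); }   # used - buffers - cached
app_free() { echo $(( $1 + $2 + $3 )); }   # free + buffers + cached

app_used 15330 249 11778   # before drop_caches: prints 3303
app_free 169 249 11778     # before drop_caches: prints 12196
app_used 3367 0 635        # after drop_caches:  prints 2732
```

free itself shows 3302, 12197 and 2731 for these cases because it rounds from kB values, so a difference of 1 MB is expected.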
To check disk speed, dd is a very useful Linux command.
Example copy test (read and write together):
date; time dd if=ip/test.txt of=op/test.txt bs=1024k count=1000
Sun Apr 23 06:46:17 PDT 2017
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 3.15715 seconds, 332 MB/s
real 0m3.164s
user 0m0.000s
sys 0m1.104s
-- date prints when the test was run and time measures the command. dd copies the file from the ip dir to the op dir with a block size of 1024k (1 MiB), 1000 times, i.e. about 1 GB in total. 'real' is the elapsed time, 3.16s here; system-mode CPU time is 1.1s and almost nothing is spent in user mode. So the speed works out to 332 MB/s, as dd itself reports.
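The 332 MB/s figure can be checked by hand: it is the byte count divided by the elapsed seconds, with 1 MB taken as 1,000,000 bytes. A minimal sketch, where throughput_mb is a hypothetical helper name:

```shell
# Hypothetical helper: bytes and seconds in, whole MB/s out (1 MB = 10^6 bytes).
throughput_mb() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", bytes / secs / 1000000 }'
}

echo $(( 1024 * 1024 * 1000 ))       # bytes moved by bs=1024k count=1000: prints 1048576000
throughput_mb 1048576000 3.15715     # prints 332
```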
But to test read and write speeds in isolation, one can use /dev/zero, a special file that returns as many null characters as are read from it and discards anything written to it (/dev/null is the more common sink for the read test).
date; time dd of=/dev/zero if=test.txt bs=1024k count=1000 --- read speed
date; time dd if=/dev/zero of=test.txt bs=1024k count=1000 --- write speed
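One caveat with these tests is the page cache: a write can finish before the data reaches the disk, and a read can be served from memory. The sketch below is one way to reduce that effect, assuming a dd that supports conv=fsync (GNU coreutils does); the helper names and the 8 MiB default size are illustrative only.

```shell
# Hypothetical helper: write MB mebibytes of zeros to FILE and fsync before
# dd reports its timing, so the figure includes the flush to disk.
dd_write_test() {
    file=$1
    mb=${2:-8}
    dd if=/dev/zero of="$file" bs=1024k count="$mb" conv=fsync
}

# Hypothetical helper: read FILE back and discard the data.
# Drop the caches first (as root) if the read should come from disk.
dd_read_test() {
    dd if="$1" of=/dev/null bs=1024k
}

# Usage:
#   time dd_write_test /path/to/test.bin 1000
#   time dd_read_test /path/to/test.bin
```

dd prints its statistics on stderr, so redirect with 2>&1 when the numbers need to be captured.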
Some other commands like ping and traceroute (tracert on Windows) can be used to test network latency and network paths.
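For a quick latency number, the summary line of ping can be parsed. This sketch assumes the Linux iputils format "rtt min/avg/max/mdev = ..."; ping_avg is a hypothetical helper name and example.com is a placeholder host.

```shell
# Hypothetical helper: print the average round-trip time in ms from the
# "rtt min/avg/max/mdev = a/b/c/d ms" summary line of iputils ping.
ping_avg() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Usage:
#   ping -c 5 example.com | ping_avg
#   traceroute -n example.com     # show the hops on the path, without DNS lookups
```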