- Troubleshooting the JVM - Moazam Raja
- general jvm info
- the jvm is like a self-contained computer
- 32 bit vs 64 bit
- maintenance releases have *tons* of fixes
- collecting data
- Collecting data at the OS level
- sometimes you need native debugging info
- jvm might have threading probs
- jni or native code
- maybe it's actually an OS problem
- OS level tools
- pstack
- prstat -L
- shows process and LWP status in detail
- pldd
- pmap
- mem and lib use in detail
- kill -3
- gets java-level stacktrace dump
- Doesn't work if the -Xrs flag has been passed to the jvm
- gcore
- creates a core of a hung or running proc.
- gdb can also do this
- pkg_app
- Solaris tool to pack up an executable for debugging.
- java level
- helpful if debugging app code or class libs
- if debugging GC or mem tuning probs
- find if a class is leaking
- if you suspect the prob is in the JVM or java class libs
- java level tools
- -verbose:gc|jni|class
- produce verbose debugging info
- -Xloggc:[file]
- logs GC to file, includes timestamps
- 1.5 only, maybe backported to later 1.4
- -XX:+PrintClassHistogram
- outputs heap population whenever a SIGQUIT is sent to the proc
- -Xcheck:jni
- additional JNI debugging info
- -XX:+TraceClassResolution
- prints insanely verbose class resolution (with line nums) as classes are loaded
- GC info
- -XX:+PrintGCDetails
- -XX:+PrintTenuringDistribution
- -XX:+PrintHeapAtGC
- very detailed heap output
- -XX:+PrintCompilation
- prints trace as methods are compiled
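
The GC flags above write their output to the JVM's log. As a complementary, in-process view, here is a minimal sketch that reads cumulative collection counts and times through the standard `java.lang.management` API (Java 5 and later). The class name `GcCounters` and the output format are illustrative assumptions, not something from the talk.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounters {

    // Builds one line per collector: its name, how many collections it has
    // run so far, and the approximate accumulated time in milliseconds.
    public static String summarize() {
        StringBuilder out = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(gc.getName())
               .append(": collections=").append(gc.getCollectionCount())  // -1 if undefined
               .append(" timeMs=").append(gc.getCollectionTime())         // -1 if undefined
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

Collector names depend on the JVM and the GC algorithm selected, so treat the names in the output as informational rather than stable identifiers.
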
- Crashes
- hs_err_pidNNNN.log (NNNN is the pid of the crashed process)
- Contains crash signal & HotSpot ErrorID
- -XX:OnError="command %p"
- Runs user spec'd script on JVM abort
- -XX:+ShowMessageBoxOnError
- analyzing data
- where did the crash happen?
- look at the last stack frame of the pstack data
- look at java stack trace & exceptions thrown
- crash in VM GC adaptive resizing code?
- try running with -Xmx and -Xms set to equal values
- Is the crash in compiled code?
- Run with -XX:+PrintCompilation
- which method to exclude from compilation?
- usually the last method compiled before the crash
- How to exclude?
- create a .hotspot_compiler file in working dir
- Put ??? in that file...
- GC
- Hangs
- Is it really hung, or just slow?
- is cpu load high, or has it gone down?
- socket issues?
- truss or strace the process - brute force...
- Thread hangs
- get stack trace
- get snapshot of the LWPs
- get native stacktrace (pstack)
- kill -32 (solaris specific) tells OS to allocate a new LWP
- note that the solaris 8 thread model sucks. solaris 9 model has been backported
- force the process to dump core
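
When the JVM is still responsive enough to run code, a Java-level thread snapshot similar to the `kill -3` output can also be taken from inside the process. The following is a rough sketch using the standard `ThreadMXBean` (`dumpAllThreads` needs Java 6 or later). The class name `ThreadSnapshot` and the text layout are assumptions made for illustration. It does not replace pstack or a core dump when the VM itself is wedged.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadSnapshot {

    // Returns a text dump of all live threads with their Java stack frames.
    public static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip monitor and synchronizer details, which not
        // every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: thread may have exited
            }
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    // Number of threads deadlocked on object monitors; 0 when none are found.
    public static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

Taking two or three snapshots a few seconds apart and comparing them helps tell a thread that is truly stuck from one that is merely slow.
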
- Out of memory errors
- oom error can be a very concise pointer to real prob
- check native stack size (-Xss) and OS swap availability
- does the app just need more RAM?
- check the Permanent Generation size
- MXBeans help monitor & understand your app
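
As one way to act on the last two points, here is a small sketch that lists every memory pool the JVM exposes through the standard `MemoryPoolMXBean` interface, including the permanent generation on JVMs that have one (Java 8 and later report a Metaspace pool instead). The class name `MemoryPoolReport` and the line format are hypothetical, chosen for illustration only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolReport {

    // Builds one line per memory pool: name, type (heap or non-heap),
    // bytes currently used, and the configured maximum if one is defined.
    public static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            lines.add(String.format("%s (%s): used=%d max=%s",
                    pool.getName(), pool.getType(), usage.getUsed(),
                    max < 0 ? "undefined" : Long.toString(max)));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

A pool whose used value sits close to its max shortly before an OutOfMemoryError is a reasonable first suspect. Pool names vary by JVM vendor and collector.
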
- moazam@unixville.com