We run a mission-critical 24x7 Oracle database (10.2.0.5) on a standalone HP Itanium server running HP-UX. One morning, when everything else was running smoothly, we were alerted to connection requests timing out on this database.
We were already facing a lot of network/firewall issues, so we thought it might be the network. As a formality, I tried to log in to the server to check the incoming session traffic in the listener logs. To my surprise, my connection request took far too long and then timed out.
I got in touch with the system admin, who was able to get a session on the server. We saw that the whole server was hung or extremely slow!
On checking the top resource consumers, we found that the Oracle processes were using 100% CPU, which was causing the hang.
Luckily, I already had an OEM session open on that database, and it showed that the top wait event was "cursor: pin S".
I later found that more than 700 sessions were waiting for a mutex to be released. As it became almost impossible to execute any more commands, it was decided to bounce the server to get the services back online.
Until then, we were not sure whether the problem was in the hardware, the OS or the database. The only clues were:
1. 100% CPU utilization on the server
2. An AWR report, which "luckily" completed just before the server reboot!
I checked Metalink and found a bug with exactly the same symptoms:
Bug 6904068 - High CPU usage when there are "cursor: pin S" waits [ID 6904068.8]
We applied the patch at midnight and haven't had the issue since.
Basically, a session tries to get the mutex (a kind of latch) in S mode but is unable to, and immediately yields the CPU so that another session can come in and request it. With many sessions retrying like this, CPU usage becomes excessive, causing extreme performance degradation or hangs. This is a very common bug when the "cursor: pin S" wait is seen.
Applying this patch allows you to set _first_spare_parameter so that a session waits for a fixed time instead of yielding when it tries to obtain the mutex in S mode and there is no X holder.
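To see why yielding burns so much more CPU than waiting for a fixed time, here is a toy model in Python. It is only an illustration and not Oracle's actual mutex code: the function name, the tick counts and the retry intervals are all made up for the example, and each attempt simply stands in for a bit of CPU work.

```python
def attempts_until_acquired(hold_ticks: int, retry_interval: int) -> int:
    """Count how many times one waiter tries the mutex before it succeeds.

    Toy model (hypothetical, not Oracle internals): the mutex is busy for
    ticks 0 .. hold_ticks - 1 and free from tick hold_ticks onwards.
    The waiter tries at tick 0 and then every retry_interval ticks.
    """
    attempts = 0
    tick = 0
    while True:
        attempts += 1              # every attempt costs some CPU
        if tick >= hold_ticks:     # mutex is free now, the waiter gets it
            return attempts
        tick += retry_interval     # yield (1 tick) or sleep (many ticks)


HOLD_TICKS = 1000   # hypothetical: how long the mutex stays busy
WAITERS = 700       # the incident had more than 700 waiting sessions

# Yielding: the waiter comes straight back and retries on the next tick.
yielding = WAITERS * attempts_until_acquired(HOLD_TICKS, retry_interval=1)

# Fixed wait: the waiter sleeps for 100 ticks between attempts.
sleeping = WAITERS * attempts_until_acquired(HOLD_TICKS, retry_interval=100)

print("attempts when yielding:", yielding)
print("attempts when sleeping:", sleeping)
```

With these made-up numbers, the yielding waiters make 700,700 attempts in total, while the sleeping waiters make only 7,700. The real numbers on a live system will be different, but the pattern is the same: the more often each waiter retries, the more CPU is spent achieving nothing.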