
Monday, January 23, 2017

OEM 13c: The Next-Gen OPATCH version 13.9 and issues on Solaris 10 - "opatch: test: unknown operator =="

To those running OEM 13c - you may have noticed the bug "ORA-06553: PLS-306: wrong number or types of arguments in call to 'REPORT_SQL_MONITOR_LIST'" causing the SQL Monitoring page to fail.

Rejoice - the fix is now available: "Oracle® Management Service 13c Release 2 (13.2.0.0.0) System Patch for Bug # 25197714 - Enterprise Manager for OMS Plug-ins 13.2.0.0.0"

Readme --> HERE

This post is mostly to do with "Pre-requisite #5":
-----
Ensure that you have the latest version of OPatch 13.9.0.0.0 and OMSPatcher 13.8.0.0.1 on all OMS instance platform homes.
If you are not sure about the OPatch version you have, or to download the latest version, follow the instructions outlined in the My Oracle Support note 224346.1 available at:
https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=224346.1

If you are not sure about the OMSPatcher version you have, or to download the latest version, follow the instructions outlined in the My Oracle Support note 2203007.1 available at:
https://support.oracle.com/epmos/faces/DocumentDisplay?id=2203007.1
----

The OMSPatcher upgrade is simple: download --> unzip --> replace the folder [the usual (but now historical) procedure used to upgrade OPatch].
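
For reference, here's a rough sketch of that unzip-and-replace step (the zip file name below is just a placeholder - use whatever you downloaded per note 2203007.1):

# back up the existing OMSPatcher, then drop in the new one (zip name is a placeholder)
cd $OMS_HOME
mv OMSPatcher OMSPatcher.bak
unzip -q /u01/software/omspatcher_generic.zip -d $OMS_HOME
$OMS_HOME/OMSPatcher/omspatcher version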

Hang on, what do you mean by "historical" procedure? What's new with OPatch upgrade? 

OPatch now comes bundled as a 'jar' file that has to be installed using at least JRE 1.7. Luckily, the OMS 13c installation ships with this version of Java.

The command is as below:
[oracle->ogcoms]/u01/software/6880880]$OMS_HOME/oracle_common/jdk/bin/java -jar -d64 opatch_generic.jar -silent oracle_home=$OMS_HOME

(the -d64 parameter is required if you are installing OPatch on a 64-bit setup)

This should work like a charm on any other OS, except Solaris (as of OPatch version 13.9).

Issue:
The upgrade completed without any errors, but the "opatch" commands were failing:

[oracle->ogcoms]/u01/app/oracle/middleware13c/OPatch]./opatch 
./opatch: test: unknown operator == 
[oracle->ogcoms]/u01/app/oracle/middleware13c/OPatch]./opatch version 
./opatch: test: unknown operator == 

This is a bug! (yes, yet another bug!)

Bug 16393828 - OPATCH.SH FAILS ON SOLARIS
Bug 16393904 - OPATCH-NEXTGEN: OPATCH NOT WORKING IN SOLARIS SPARC 10

opatch is a plain-text shell script, so grepping for "==" in the file gives you the faulty line:

[oracle->ogcoms]/home/oracle/OPatch]grep == opatch
#    ==> yes: ready to 'callOPatch'
#    ==> no: go to #2
#    ==> yes: ready to 'callOpatch'
#    ==> no: go to #3
#    ==> yes: ready to 'callOpatch'
#    ==> no: Error and Exit
   if [ "$PLATFORM" == "Linux" -a "$ARCH" == "x86_64" ];then  
# JDK 8:  MaxPermSize <=== replaced by ===> MaxMetaspaceSize:  default = 128M
   # if JDK8 and has flag 'MaxPermSize', replace MaxPermSize <==> MaxMetaspaceSize

Faulty line is:
   if [ "$PLATFORM" == "Linux" -a "$ARCH" == "x86_64" ];then

Solution:
The issue on the Solaris server is that the default shell's test does not understand the "==" operator (the POSIX operator is "="), so I changed "==" to "=" in the opatch and opatch.pl files:

   if [ "$PLATFORM" = "Linux" -a "$ARCH" = "x86_64" ];then

I tried the opatch command again and voilà - we have a perfectly working OPatch!

[oracle->ogcoms]/u01/app/oracle/middleware13c/OPatch]./opatch version 
OPatch Version: 13.9.1.0.0 

OPatch succeeded. 

I miss the "non-buggy", simple unzip and replace OPatch directory method!

ORA-600 [ktu_format_nr no disk align] - Oracle 12c Database Migration stuck at Convert Datafile of UNDO tablespace

Bugs Bugs Bugs... 3rd bug this month!

So, I was performing a simple cross-platform Database Migration from "Solaris[tm] OE (64-bit)" to "HP-UX IA (64-bit)". As both the source and the target OS run on the big-endian format, migration using "RMAN transportable database" is the simplest approach - it requires converting only the datafiles/tablespaces containing some kind of UNDO segment.

Steps for this migration can be found at:
Cross-Platform Database Migration (across same endian) using RMAN Transportable Database (Doc ID 1401921.1)

Now, I had 1 file from the SYSTEM tablespace and a few from UNDOTBS_01 to convert.
Most of them converted without a hiccup, but one file, "undotbs_01_01.dbf", would get stuck during conversion.

RMAN> CONVERT 
DATAFILE '/ofdmdb/STAGE_DIR/undotbs_01_01.dbf' 
FORMAT '/ofdmdb/undotbs_01_01.dbf' 
FROM PLATFORM 'Solaris[tm] OE (64-bit)' ; 

Starting conversion at target at 18-JAN-17 
using channel ORA_DISK_1 
channel ORA_DISK_1: starting datafile conversion 
input file name=/ofdmdb/STAGE_DIR/undotbs_01_01.dbf 

--- -- - Stuck here for 2 hours and counting... 

I checked the alert.log and found the below:

Thu Jan 19 15:06:46 2017 
Errors in file /u01/app/oracle/diag/rdbms/ofdm/ofdm/trace/ofdm_ora_16742.trc (incident=65019): 
----->>> ORA-00600: internal error code, arguments: [ktu_format_nr no disk align], [104], [33281], [45056], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ofdm/ofdm/incident/incdir_65019/ofdm_ora_16742_i65019.trc 
Thu Jan 19 15:06:49 2017 
Dumping diagnostic data in directory=[cdmp_20170119150649], requested by (instance=1, osid=9513668), summary=[incident=65019]. 
Use ADRCI or Support Workbench to package the incident. 
See Note 411.1 at My Oracle Support for error and packaging details. 
Thu Jan 19 15:07:08 2017 
Sweep [inc][65019]: completed 
Sweep [inc2][65019]: completed 


On searching for related documents, I found that this was a bug and that there was no ready-made patch available yet:

Bug 24332831 - RMAN: ORA-600 [ktu_format_nr no disk align] while convert from HP-UX (Doc ID 24332831.8) 
or a more accessible post:
http://remidian.com/2012/07/transporting-an-oracle-database-to-another-os-platform-the-fastest-way

I then had to raise a ticket with Oracle Support and they prepared a bugfix patch (Patch 24332831 for HP-UX) for us based on our version and patch level.
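
For the record, applying a one-off like this is just the standard local "opatch apply" (a minimal sketch, assuming the database is shut down and you are inside the unzipped patch directory shown in the session below):

# apply the one-off patch locally, with the database down
cd /u01/software/bugfix/24332831
$ORACLE_HOME/OPatch/opatch apply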

I then applied the patch using the usual "opatch apply", and then tried to convert the file again:


[oracle->sysdev]/u01/software/bugfix/24332831>rman target /
Recovery Manager: Release 12.1.0.2.0 - Production on Mon Jan 23 09:06:45 2017
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
connected to target database: OFDM (not mounted)

RMAN> CONVERT DATAFILE '/ofdmdb/STAGE_DIR/undotbs_01_01.dbf'
FORMAT '/ofdmdb/undotbs_01_01.dbf'
FROM PLATFORM 'Solaris[tm] OE (64-bit)' ;2> 3>

Starting conversion at target at 23-JAN-17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=122 device type=DISK
channel ORA_DISK_1: starting datafile conversion
input file name=/ofdmdb/STAGE_DIR/undotbs_01_01.dbf
converted datafile=/ofdmdb/undotbs_01_01.dbf
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:15
Finished conversion at target at 23-JAN-17

Finally, I proceeded with the next steps in migration without another hiccup.

Monday, November 21, 2016

Oracle 12c RAC: A step by step manual PSU patching (October 2016) without using opatchauto

Why would you want to 'not' use OPATCHAUTO? Maybe because there's no bash shell on your server for some reason, or you want to avoid any complications caused by opatchauto, or you simply like more granular control over the patching process.

I primarily work on Oracle RAC setups on HP-UX servers - where the bash shell is not installed (and can't be, due to policy). You will agree, life is much easier on Linux. Therefore I must make use of the old and dependable 'opatch' utility!

About the October 2016 patch:
Patch File:   p24412235_121020_HPUX-IA64.zip
Contents:    DB PSU patch #24006101 and OCW PSU patch #23854735
The unzipped patch file path: /u01/software/psuoct2016/12cGI/24412235

About the environment:
- In the below example, I am applying the patch on a 2-node RAC in a rolling mode. I will be sharing only the tested steps and not the outputs.

- I usually have 4 putty sessions open on each node:
  1. GRID_HOME environment variables set
  2. RDBMS_HOME environment variables set
  3. Monitoring the activity in the alert.log of the crs
  4. Monitoring the activity in the alert.log of the database
- The GRID and RDBMS home owner is 'oracle' in this example.

Prerequisites: 
Make sure you have the latest OPatch in both the RDBMS_HOME and the GRID_HOME. 

$ opatch version
OPatch Version: 12.2.0.1.8
OPatch succeeded.
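
If either home is behind, the classic unzip-and-replace still works for database and grid homes (a sketch, assuming the OPatch zip - patch 6880880 for your platform and release - is staged under /u01/software; repeat on every node):

cd /u01/software
unzip -o p6880880_121010_HPUX-IA64.zip -d $GRID_HOME
unzip -o p6880880_121010_HPUX-IA64.zip -d $RDBMS_HOME
$GRID_HOME/OPatch/opatch version
$RDBMS_HOME/OPatch/opatch version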

Let's begin...

Node 1:


1. As the RDBMS owner (oracle), shut down the instance on node 1 and stop all services running from the RDBMS_HOME:
$ srvctl stop instance -d testdb -i testdb1

# NOTE: Make sure the database shutdown was clean by verifying the message "Instance shutdown complete" in the alert log (see the check after this step).

$ $ORACLE_HOME/bin/srvctl stop home -o $RDBMS_HOME -s /tmp/status -n node1
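
A quick way to run that alert log check (the diag path below is an assumption based on a default diagnostic destination for this database):

tail -n 100 /u01/app/oracle/diag/rdbms/testdb/testdb1/trace/alert_testdb1.log | grep "Instance shutdown complete"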

2. Unlock the CRS as ‘root’ user:
export GRID_HOME=/u01/app/12.1.0/grid 
$GRID_HOME/crs/install/rootcrs.pl -prepatch

3. As Grid Infrastructure Owner (oracle), apply the OCW patch on the Grid Home:
$GRID_HOME/OPatch/opatch napply -oh $GRID_HOME -local /u01/software/psuoct2016/12cGI/24412235/23854735

4. As Grid Infrastructure Owner (oracle), apply the DB PSU patch on the Grid Home:
$GRID_HOME/OPatch/opatch napply -oh $GRID_HOME -local /u01/software/psuoct2016/12cGI/24412235/24006101

5. As the RDBMS Owner (oracle), apply the OCW PSU patch on the RDBMS_HOME:
$ cd /u01/software/psuoct2016/12cGI/24412235/23854735/custom/scripts
$ ./prepatch.sh
$ $ORACLE_HOME/OPatch/opatch napply -oh $RDBMS_HOME -local /u01/software/psuoct2016/12cGI/24412235/23854735

6. As RDBMS Owner (oracle), apply the DB PSU patch on the RDBMS_HOME and run ‘postpatch’:
$ORACLE_HOME/OPatch/opatch napply -oh $RDBMS_HOME -local /u01/software/psuoct2016/12cGI/24412235/24006101
cd /u01/software/psuoct2016/12cGI/24412235/23854735/custom/scripts 
./postpatch.sh

7. As root:
/u01/app/12.1.0/grid/rdbms/install/rootadd_rdbms.sh
$GRID_HOME/crs/install/rootcrs.pl -postpatch

Executing the final command "rootcrs.pl -postpatch" will lock the CRS files and start the cluster services.

Wait for 2-3 minutes, until all CRS components are started and the DB is in the OPEN state:
/u01/app/12.1.0/grid/bin/crsctl stat res -t
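
You can also confirm the database resource directly (database name taken from step 1):

srvctl status database -d testdb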

8. As ‘oracle’:  
$ORACLE_HOME/bin/srvctl start home -o $RDBMS_HOME -s /tmp/status -n node1
  (No Output)

9. As ‘oracle’:
ps -ef | grep pmon 

10. Verify the Patches on both GI and RDBMS HOMES:
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME | grep 24006101
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME | grep 23854735
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME | grep 24006101
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME | grep 23854735

11. Check the status of CRS components:
$ crs_stat -t 



Node 2:


1. As the RDBMS owner (oracle), shut down the instance on node 2 and stop all services running from the RDBMS_HOME:
$ srvctl stop instance -d testdb -i testdb2

# NOTE: Make sure the database shutdown was clean by verifying the message "Instance shutdown complete" in the alert log.

$ $ORACLE_HOME/bin/srvctl stop home -o $RDBMS_HOME -s /tmp/status -n node2

Follow steps 2-7 from the Node 1 section

8. As ‘oracle’:  
$ORACLE_HOME/bin/srvctl start home -o $RDBMS_HOME -s /tmp/status -n node2
  (No Output)

9. As ‘oracle’:
ps -ef | grep pmon 

10. Verify the Patches on both GI and RDBMS HOMES:
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME | grep 24006101
$ORACLE_HOME/OPatch/opatch lsinventory -oh $RDBMS_HOME | grep 23854735
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME | grep 24006101
$ORACLE_HOME/OPatch/opatch lsinventory -oh $GRID_HOME | grep 23854735

11. Check the status of CRS components:
$ crs_stat -t 

Post-Patch Instructions:

On Node 1:
cd $RDBMS_HOME/OPatch
./datapatch -verbose
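
Once datapatch finishes, a quick check that the SQL-level changes were registered (a minimal sketch, run as the RDBMS owner on node 1):

sqlplus -s / as sysdba <<EOF
set lines 200 pages 100
col description format a60
select patch_id, status, description from dba_registry_sqlpatch;
EOF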


Hope this has helped you in some way...
If yes, please don't forget to add a comment!

Sunday, November 20, 2016

Oracle RAC 12c - Simple steps to manually rollback a PSU Patch from a node

I will use the latest October 2016 PSU patch as a test case for this post...

Environment:
Server/OS - HP UX Itanium (11.31)
2 Node RAC - no ACFS, no shared homes, ASM is used

Method of applying patches -
Fully Manual
Non-Rolling
Why?
 - better control
 - non-availability of bash on our HP-UX servers, which is required to run opatchauto

What should you know about the October PSU patch for RAC?
OCW PSU patch number  -- 23854735
DB PSU patch number      -- 24006101

Below are very simple steps to roll back the patches in your environment:

DB HOME:
opatch nrollback -local -id 24006101 -oh $ORACLE_HOME
cd /u01/software/psuoct2016/12cGI/24412235/23854735/custom/scripts
./prepatch.sh
opatch nrollback -local -id 23854735 -oh $ORACLE_HOME
./postpatch.sh

GRID HOME:
$GRID_HOME/crs/install/rootcrs.pl -prepatch [-nonrolling]
opatch nrollback -local -id 24006101 -oh $GRID_HOME
opatch nrollback -local -id 23854735 -oh $GRID_HOME
$GRID_HOME/crs/install/rootcrs.pl -postpatch  [-nonrolling]

Use -nonrolling only if you have patched using the nonrolling mode.
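
After the rollback and the postpatch step, a quick sanity check that both patches are really gone from both homes - no output from the greps below means the rollback worked:

$ORACLE_HOME/OPatch/opatch lsinventory -oh $ORACLE_HOME | grep -E "24006101|23854735"
$GRID_HOME/OPatch/opatch lsinventory -oh $GRID_HOME | grep -E "24006101|23854735"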

Hope this helps!
Cheers

Wednesday, November 16, 2016

EM 13c: ORA-06553: PLS-306: wrong number or types of arguments in call to 'REPORT_SQL_MONITOR_LIST'


[UPDATE: Bugfix Bundle Patch is out: OMS Patch 25197714]



Enterprise Manager 13c Installation... Check!
Agents pushed to target servers... Check!
Cluster and Database targets discovered and configured... Check!

Aaah! EM 13c looks and feels brilliant... 

Starting to carry out the usual daily health checks on the core-banking database...
Performance page looks great...
No Locking/Blocking either...
Heading on to the SQL Monitoring page.... WHAAAAT ??? 
ORA-06553: PLS-306: wrong number or types of arguments in call to 'REPORT_SQL_MONITOR_LIST'


After a successful deployment of Oracle Enterprise Manager 13.2, I've encountered a serious BUG in the 1st hour of my EM 13c experience! 

DBAs usually depend on the "SQL Monitoring" page of the OEM to communicate long-running active queries to the App Support teams and take necessary actions. And the graphs always make a better impact...

So naturally, on searching for a solution on Oracle Support, I came across the below document:

EM13c: Accessing SQL Monitoring Raises ORA-06553: PLS-306: wrong number or types of arguments in call to 'REPORT_SQL_MONITOR_LIST' (Doc ID 2199723.1)

Apparently, if you are accessing a database with a version lower than 12c, you are most likely to hit this bug!
I then raised a ticket with Oracle Support, and the assigned engineer informed me that the fix was expected in the next bundle patch, releasing anytime between the last week of Nov and the 1st week of Dec 2016. 

Stay tuned for more...


Tuesday, November 15, 2016

EM 13c: What happens when you don't run root.sh after the Agent installation?

Admit it, there are times when you skip execution of the little "root.sh" script, especially when you don't have direct or indirect root access. Also, you may skip it when it was already executed for a previous installation on the same server.

Obviously, root.sh is "the most" important part of a RAC setup as it configures the CRS services and brings them up - but why does it matter in OEM 13c? Hmmm...

As a DBA, it's always best practice to check what's different or what's new in the root.sh files when installing a newer version of any Oracle-based software. That's what I did when I implemented OEM 13c's Oracle Management Service (OMS) and deployed agents across numerous Unix, Solaris and Windows servers that were hosting Oracle Databases.

We all know that the Enterprise Manager is no longer restricted to just monitoring databases. Especially from 12c, the Enterprise Manager has been Cloudified (oooo). It can monitor and manage almost any infrastructure hardware or software (if configured). It's a complete ENTERPRISE Cloud Management Solution! Why do I say this now? Wait for it...

So, after pushing agents to the numerous servers, I skipped executing the root.sh, as I wanted to find out whether there was any impact on any of the subsequent steps.

Everything was smooth: I discovered cluster and database targets and added them to the EM. I was able to log in to these targets and perform tasks as normal. I left for home, watched The Walking Dead, and when I returned in the morning, I noticed something interesting:

The Database Targets were stuck in "PENDING STATE"!

Initially, I thought maybe a network issue was causing the upload to pause. I then did the usual things to try and fix that problem:

/u01/app/oracle/agent13c/agent_inst/bin/emctl stop agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl clearstate agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl start agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl upload agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl status agent

But it didn't work. I put on my Oracle Support socks and Google hat, and started searching for an answer. I looked at similar problems and solutions provided by my favorite Oracle blogs - Pythian and DBAKevlar.

Nada! That didn't solve my problem.

Last resort - I ran root.sh on one of the servers, and voilà! In a few minutes the database targets of that server started showing an Up status!

Ohhh! I always wanted to know what happens if I don't run the root.sh!




Later, I checked whether I could get the same information from somewhere else. So I headed to one of the Agent pages, and guess what I found?




ERROR: NMO not setuid-root (Unix-only). What does that mean? And why does it require root.sh to be executed?

NMO is an executable file in the sbin folder of the Agent Home. Some of the executables in this folder need to be owned by root (and setuid root) in order for the "Enterprise" manager to be able to work for the whole "Enterprise". Clearly, plain "oracle" user ownership and permissions are not enough.

Before root.sh:

oracle@xxxxx /u01/app/oracle/agent13c/agent_13.2.0.0.0/sbin> ls -l
total 68594
-rwx--x--x   1 oracle   dba        22840 Sep 30 23:29 nmb.0
-rwx--x--x   1 oracle   dba       114768 Sep 30 23:29 nmgsshe.0
-rwx--x--x   1 oracle   dba       100528 Sep 30 23:31 nmhs.0
-rwx--x--x   1 oracle   dba      8695880 Sep 30 23:29 nmo.0
-rwx--x--x   1 oracle   dba      8608224 Sep 30 23:29 nmoconf
-rwx--x--x   1 oracle   dba      8617024 Sep 30 23:31 nmopdpx.0
-rwx--x--x   1 oracle   dba      8617024 Sep 30 23:31 nmosudo.0
-rwx------   1 oracle   dba        87776 Sep 30 23:31 nmr.0
-rw-r-----   1 oracle   dba         9615 Aug  1 15:37 nmr_macro_list
-rwx------   1 oracle   dba        14976 Sep 30 23:31 nmrconf

After root.sh:

oracle@xxxxx /u01/app/oracle/agent13c/agent_13.2.0.0.0/sbin> ls -l
total 203890
-rwsr-x---   1 root       dba          87224 Nov 14 10:30 nmb
-rwx--x--x   1 oracle     dba          87224 Oct  1 06:53 nmb.0
-rwxr-xr-x   1 root       dba          78656 Nov 14 10:30 nmgsshe
-rwx--x--x   1 oracle     dba          78656 Oct  1 06:53 nmgsshe.0
-rwsr-x---   1 root       dba          99680 Nov 14 10:30 nmhs
-rwx--x--x   1 oracle     dba          99680 Oct  1 06:59 nmhs.0
-rwsr-x---   1 root       dba        13002232 Nov 14 10:30 nmo
-rwx--x--x   1 oracle     dba        13002232 Oct  1 06:53 nmo.0
-rwx------   1 root       sys        13002232 Nov 14 10:30 nmo.new.bak
-rw-r-----   1 root       dba            188 Nov 14 10:30 nmo_public_key.txt
-rwx--x--x   1 oracle     dba        12857336 Oct  1 06:53 nmoconf
-rwxr-xr-x   1 root       dba        12862728 Nov 14 10:30 nmopdpx
-rwx--x--x   1 oracle     dba        12862728 Oct  1 06:59 nmopdpx.0
-rwxr-xr-x   1 root       dba        12862728 Nov 14 10:30 nmosudo
-rwx--x--x   1 oracle     dba        12862728 Oct  1 06:59 nmosudo.0
-rwsr-x---   1 root       dba         148312 Nov 14 10:30 nmr
-rwx------   1 oracle     dba         148312 Oct  1 06:59 nmr.0
-rwx------   1 root       sys         148312 Nov 14 10:30 nmr.new.bak
-rw-r-----   1 root       dba           9615 Aug  1 22:37 nmr_macro_list
-rwx------   1 oracle     dba          80648 Oct  1 06:59 nmrconf


I hope you can see the difference in permissions and ownership of the files - note the root ownership and the setuid bit (the 's' in rwsr-x---) on nmb, nmhs, nmo and nmr after root.sh has run.

Once I applied the root.sh on the remaining servers, all the cluster and database targets were up and running.
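
In case you're wondering where the script lives: it sits directly under the agent home shown in the listings above and just has to be run as root (path assumed from this environment):

# run as the root user (or via sudo) on each agent host
/u01/app/oracle/agent13c/agent_13.2.0.0.0/root.sh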

Let me know if you have any questions or a conflict of thoughts :) !

Cheers,

Thursday, May 12, 2016

Getting started in Oracle Cloud - Database as a Service

Cloud - the recent talk of tech-town. We all know Oracle has fired up its sales teams all over the world, across all spheres of business, with a single goal - convince the customer to move to the Cloud now!

Well, as technical beasts, we are more interested in how we can transition into this wave of Cloud computing, especially as Oracle DBAs.

Oracle is currently giving you a 1-month free trial of its Cloud services - so please make hay while the sun shines (that's lame :P)... 

To begin with, register HERE
Under Platform --> Data --> Database --> Try it --> Database as a Service --> Start Trial

P.S. - I had some difficulty registering, as the request code sent through SMS didn't arrive at all. If you run into any difficulty, please click the "Chat" icon to get immediate support from Oracle. They are pretty helpful.

You will receive an email with the subject "Oracle Cloud Access Details" which will give you direct access links to the dashboard.

A great place to start is by following the instructions here - DBaaS Quick Start Guide

This is a simple guide that walks you through creating your 1st Database Instance as a Cloud Service.

This exercise will also give you an insight into how easy it is to provision either a standalone database or a 2-node RAC database.

Happy Clouding!