Tuesday, November 15, 2016

EM 13c: What happens when you don't run root.sh after the Agent installation?

Admit it, there are times when you skip execution of the little "root.sh" script, especially when you don't have direct or indirect root access. Also, you may skip it when it was already executed for a previous installation on the same server.

Obviously, root.sh is "the most" important part of a RAC setup as it configures the CRS services and brings them up, but why in OEM 13c? Hmmm...

As a DBA, it's always a best practice to check what's different or new in the root.sh files when installing a newer version of any Oracle-based software. That's what I did when I implemented OEM 13c's Oracle Management Service (OMS) and deployed agents across numerous Unix, Solaris and Windows servers that were hosting Oracle Databases.

We all know that the Enterprise Manager is no longer restricted to just monitoring databases. Especially from 12c, the Enterprise Manager has been Cloudified (oooo). It can monitor and manage almost any infrastructure hardware or software (if configured). It's a complete ENTERPRISE Cloud Management Solution! Why do I say this now? Wait for it...

So, after pushing Agents to the numerous servers, I skipped executing root.sh, as I wanted to find out whether there was any impact on any of the subsequent steps.

Everything was smooth: I discovered cluster and database targets and added them to the EM. I was able to log in to these targets and perform tasks as normal. I left for home, watched the Walking Dead, and when I returned in the morning, I noticed something interesting:

The Database Targets were stuck in "PENDING STATE"!

Initially, I thought maybe a network issue was causing the Upload to pause. I then did the usual things to try to fix that problem:

/u01/app/oracle/agent13c/agent_inst/bin/emctl stop agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl clearstate agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl start agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl upload agent
/u01/app/oracle/agent13c/agent_inst/bin/emctl status agent

But it didn't work. I put on my Oracle Support socks and Google hat, and started searching for an answer. I looked at similar problems and solutions provided by my favorite Oracle blogs, Pythian and DBAKevlar.

Nada! That didn't solve my problem.

Last resort: I ran root.sh on one of the servers, and voilà! Within a few minutes the database targets on that server started showing Up status!

Ohhh ! I always wanted to know what happens if I don't run the root.sh!




Later, I checked whether I could get the same information from somewhere else. So I headed to one of the Agent pages, and guess what I found?




ERROR: NMO not setuid-root (Unix-only). What does that mean? And why does it require root.sh to be executed?

NMO is an executable file in the sbin folder of the Agent's Home. Some of the executables in this folder need to be owned by root, with the setuid bit set, in order for the "Enterprise" manager to be able to work for the whole "Enterprise". Clearly, the "oracle" user ownership and permissions are not nearly enough.
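If you want to check this from the command line rather than from the Agent page, here is a minimal sketch. The function name `is_setuid_root` is my own invention, and the default `SBIN_DIR` is simply the path from my servers; adjust both for your environment. It only tests the setuid bit and the owner's UID, nothing more.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Agent): reports whether a file is
# owned by root (uid 0) AND has the setuid bit set.
# Returns 0 if yes, 1 if not, 2 if the file does not exist.
is_setuid_root() {
    f=$1
    [ -f "$f" ] || { echo "missing: $f"; return 2; }
    # ls -ln prints the numeric owner uid in the third column
    owner_uid=$(ls -ln "$f" | awk '{print $3}')
    if [ -u "$f" ] && [ "$owner_uid" = "0" ]; then
        echo "ok: $f"
        return 0
    fi
    echo "NOT setuid-root: $f"
    return 1
}

# Assumed location; override SBIN_DIR for your own Agent home.
SBIN_DIR=${SBIN_DIR:-/u01/app/oracle/agent13c/agent_13.2.0.0.0/sbin}
problems=0
for name in nmo nmb nmhs nmr; do
    is_setuid_root "$SBIN_DIR/$name" || problems=1
done
echo "problems found: $problems"
```

The four names in the loop are the ones that show up as setuid-root in my listing after root.sh; your Agent version may differ.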

Before root.sh:

oracle@xxxxx /u01/app/oracle/agent13c/agent_13.2.0.0.0/sbin> ls -l
total 68594
-rwx--x--x   1 oracle   dba        22840 Sep 30 23:29 nmb.0
-rwx--x--x   1 oracle   dba       114768 Sep 30 23:29 nmgsshe.0
-rwx--x--x   1 oracle   dba       100528 Sep 30 23:31 nmhs.0
-rwx--x--x   1 oracle   dba      8695880 Sep 30 23:29 nmo.0
-rwx--x--x   1 oracle   dba      8608224 Sep 30 23:29 nmoconf
-rwx--x--x   1 oracle   dba      8617024 Sep 30 23:31 nmopdpx.0
-rwx--x--x   1 oracle   dba      8617024 Sep 30 23:31 nmosudo.0
-rwx------   1 oracle   dba        87776 Sep 30 23:31 nmr.0
-rw-r-----   1 oracle   dba         9615 Aug  1 15:37 nmr_macro_list
-rwx------   1 oracle   dba        14976 Sep 30 23:31 nmrconf

After root.sh:

oracle@xxxxx /u01/app/oracle/agent13c/agent_13.2.0.0.0/sbin> ls -l
total 203890
-rwsr-x---   1 root       dba          87224 Nov 14 10:30 nmb
-rwx--x--x   1 oracle     dba          87224 Oct  1 06:53 nmb.0
-rwxr-xr-x   1 root       dba          78656 Nov 14 10:30 nmgsshe
-rwx--x--x   1 oracle     dba          78656 Oct  1 06:53 nmgsshe.0
-rwsr-x---   1 root       dba          99680 Nov 14 10:30 nmhs
-rwx--x--x   1 oracle     dba          99680 Oct  1 06:59 nmhs.0
-rwsr-x---   1 root       dba        13002232 Nov 14 10:30 nmo
-rwx--x--x   1 oracle     dba        13002232 Oct  1 06:53 nmo.0
-rwx------   1 root       sys        13002232 Nov 14 10:30 nmo.new.bak
-rw-r-----   1 root       dba            188 Nov 14 10:30 nmo_public_key.txt
-rwx--x--x   1 oracle     dba        12857336 Oct  1 06:53 nmoconf
-rwxr-xr-x   1 root       dba        12862728 Nov 14 10:30 nmopdpx
-rwx--x--x   1 oracle     dba        12862728 Oct  1 06:59 nmopdpx.0
-rwxr-xr-x   1 root       dba        12862728 Nov 14 10:30 nmosudo
-rwx--x--x   1 oracle     dba        12862728 Oct  1 06:59 nmosudo.0
-rwsr-x---   1 root       dba         148312 Nov 14 10:30 nmr
-rwx------   1 oracle     dba         148312 Oct  1 06:59 nmr.0
-rwx------   1 root       sys         148312 Nov 14 10:30 nmr.new.bak
-rw-r-----   1 root       dba           9615 Aug  1 22:37 nmr_macro_list
-rwx------   1 oracle     dba          80648 Oct  1 06:59 nmrconf


I hope you see the difference in permissions and ownership of the files.
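If the difference doesn't jump out, a small filter over the pasted `ls -l` text makes it explicit. This is just a sketch: `list_setuid_root` is a name I made up, and the sample input is a handful of lines copied from the "After root.sh" listing above. It prints the names of entries that are owned by root and carry an `s` in the owner-execute position.

```shell
#!/bin/sh
# Hypothetical filter: reads `ls -l` output on stdin and prints the file
# names that are owned by root and have the setuid bit (s or S) set.
list_setuid_root() {
    awk '$1 ~ /^-..[sS]/ && $3 == "root" { print $NF }'
}

# Sample input: lines taken from the "After root.sh" listing.
list_setuid_root <<'EOF'
total 203890
-rwsr-x---   1 root       dba          87224 Nov 14 10:30 nmb
-rwx--x--x   1 oracle     dba          87224 Oct  1 06:53 nmb.0
-rwxr-xr-x   1 root       dba          78656 Nov 14 10:30 nmgsshe
-rwsr-x---   1 root       dba          99680 Nov 14 10:30 nmhs
-rwsr-x---   1 root       dba        13002232 Nov 14 10:30 nmo
-rwx------   1 root       sys        13002232 Nov 14 10:30 nmo.new.bak
-rwxr-xr-x   1 root       dba        12862728 Nov 14 10:30 nmosudo
-rwsr-x---   1 root       dba         148312 Nov 14 10:30 nmr
EOF
```

On that sample it prints nmb, nmhs, nmo and nmr, one per line. Feeding it the "Before root.sh" listing prints nothing, which is exactly the problem the Agent was complaining about.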

Once I ran root.sh on the remaining servers, all the cluster and database targets were up and running.

Let me know if you have any questions or a conflict of thoughts :) !

Cheers,

4 comments:

  1. Let me speak from a different perspective... I work as a sysadmin in a secure environment, and DBAs wish to run a script as root. However, they cannot tell me what it does, and examination of the code shows that it creates a number of SUID binaries whose function I cannot review, nor is there any documentation stating what those now-SUID binaries are doing. Then, just to make me feel extra uncomfortable about the whole thing, another binary that is deployed is called nmosudo. Can the now-SUID nmo and nmosudo operate together to function as sudo? I cannot say, and as such I cannot allow root.sh to be run.

    Discuss.

    1. Running root.sh will allow arbitrary OS command execution against the host. MOS Note 271598.1 describes some of the functionality that will be lost due to the missing setuid privilege:

      Running a Job against this host : this can be OS Command, SQLPlus Script etc

      Setting / Testing Preferred Credentials for this host

      Configuring the database backup settings for a database on this host

      Trying to Clone an ORACLE_HOME, running the Patching Wizard / Deployment procedure against targets in this host

      Executing a User-Defined Metric against target in this host

    2. from MOS note: EM 12c, EM 13c: Fix the Cloud Control Agent error 'ERROR: NMO Not Setuid-root (Unix-only)' (Doc ID 1465278.1)
      Note:

      A. The /root.sh script modifies the permission and ownership of the below executables:

      1. nmo - this is a root setuid executable used by the Agent for remote execution purposes like executing a job submitted from the EM console. This is similar to "su", which takes credentials and executes a given command. nmo has the setuid bit enabled, similar to "su", and this is used to switch the effective user after confirming that the credentials provided are valid. This is used to authenticate OS user credentials, impersonate the user (i.e. to change ownership of the process to an OS user whose credentials are passed to the nmo process) and execute the command (shell, perl, etc.) that would represent the job.

      2. nmb - this is a root setuid executable used by the Agent to calculate memory usage details of OS processes. This is used to determine physical memory (private + shared) usage by a process.

      3. nmhs - this executable is used by the Agent to collect host storage metrics (disk related information); some of these commands require root privileges. As these are read-only operations, the setuid executable is used to query that information, which is displayed as part of the Storage report for this host target in the Console.

      4. nmosudo - Enterprise Manager uses a trust-based model that permits specification of responsibilities with a high degree of granularity. Administrators can set up sudo or pbrun configuration entries to assign specific Enterprise Manager functional privileges to their OS users. The Management Agent executable nmosudo allows administrators to configure sudo/pbrun such that a less privileged user can run nmosudo as a more privileged user. This is specifically designed to be invoked only by the agent. Agent generates a cryptographically strong signature and provides that as one of the input arguments to nmosudo which validates the signature and only on successful confirmation, does it parse the rest of the options. This is meant to simplify the sudo configuration required for performing operation.

      5. nmgsshe is used when Host ssh credentials are used / configured for this host from the Console.
