SUM error in phase MAIN/SHDCRE/SUBMOD_SHDDBCLONE/DBCLONE => ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT

During a support package stack upgrade, in Preprocessing phase of SUM, more explicit MAIN/SHDCRE/SUBMOD_SHDDBCLONE/DBCLONE the following error appeared:

 

 

 

 

 

 

 

 

 

 

If the upgrade fails during DBCLONE phase, there are not much details of the error that can be found in the logs of SUM.
During this stage, the job DBCLONE (that is actually divided in several DBCLONEX jobs running in parallel) is running, so if this job gets canceled it is important to check the job log, syslog and short dumps.

 

 

 

 

In my case the job was getting canceled with the short dump: DBSQL_SQL_ERROR CX_SY_OPEN_SQL_DB SQL error “SQL code: 39” occurred while accessing table “FDT_FNCT_0110S”. Database error text: “SQL message: ORA-00039: error during periodic action #ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT”:

 

 

 

 

 

This error can occur under Oracle database 12c, where a new parameter related to PGA, in addition to the already existing PGA_AGGREGATE_TARGET has been introduced => PGA_AGGREGATE_LIMIT.
Until Oracle 12c, the sessions that used memory from the PGA for activities like sorting, group-by, hash-joins etc… could have grown without restrictions, even more than PGA_AGGREGATE_TARGET value, this was only set a a soft limit. It could have happened that this impacted the overall memory usage of the database. That is why PGA_AGGREGATE_LIMIT has to be seen as a hard limit, when it is reached, Oracle will kill the sessions that are using the most untunable memory.
In our case DBCLONE jobs were terminated when the value of PGA_AGGREGATE_LIMIT was reached. In order to solve this you can increase the value of the parameter PGA_AGGREGATE_LIMIT, that I would say it is the recommended method, but you have also the option to deactivate the hard limit, parameter PGA_AGGREGATE_LIMIT, by setting it to 0. In my case setting the parameter PGA_AGGREGATE_LIMIT to 4Gb from the default 2Gb (alter system set PGA_AGGREGATE_LIMIT=4096M scope=both;), was sufficient in order for the job to complete with success. The good part is that the parameter is dynamic, so no restart of the database is necessary to activate the parameter and you also have the option to test the correct value for you, more easily.

SUM error Unable to locate the requested resource

A common problem with SUM, is when the tool is attempted to be started and in browser the following error is being displayed:

SUM error

 

 

 

 

“Not Found
Unable to locate the requested resource”

Even if after multiple attempts and maybe after a hostagent update, the same error is being displayed.
In this case there is a problem with one of host agent processes that remained hanged and if you check the processes at OS level you will notice 4 processes associated to the host agent:

ps -ef | grep host
root 7183 6053 0 17:29:32 ? 0:08 /usr/sap/hostctrl/exe/saphostexec pf=/usr/sap/hostctrl/exe/host_profile
root 24899 6053 0 15:50:32 ? 0:03 /usr/sap/hostctrl/exe/saposcol -l -w60 pf=/usr/sap/hostctrl/exe/host_profile
sapadm 24829 6053 0 15:50:31 ? 0:02 /usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D
root 24826 6053 0 15:50:30 ? 0:00 /usr/sap/hostctrl/exe/saphostexec pf=/usr/sap/hostctrl/exe/host_profile

The first process, that you can see with a different timestamp, it is the hanged one, kill this process and retry opening SUM tool, it should start successfully now.
Normally for hostagent, you should only see 3 process all the time, something similar to this:

sapadm 15877 28579 0 Jun 24 ? 206:27 /usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D
root 16692 28579 0 Jun 24 ? 559:45 /usr/sap/hostctrl/exe/saposcol -l -w60 pf=/usr/sap/hostctrl/exe/host_profile
root 15793 28579 0 Jun 24 ? 14:12 /usr/sap/hostctrl/exe/saphostexec pf=/usr/sap/hostctrl/exe/host_profile