Thursday, November 21, 2013

ORA-4030: out of process memory when trying to allocate 64544 bytes (sort subheap/sort key)

An oracle ORA-4030 occurred when we were testing a failover to a Data Guard physical standby.

We checked the trace files but they didn't give us much other than the statement concerned. What was really odd about this was that the query had worked fine on the primary database and the oracle system parameters where all the same (apart from those specifically relating to the Data Guard setup).

We found that by reducing the sga_target and pga_aggregate_target that the query seems to work! But as soon as more users came on, the problem reoccurred.

The following parameters all seemed to be concerned with this problem.
SGA_MAX_SIZE, SGA_TARGET, PGA_AGGREGATE_TARGET, SORT_AREA_SIZE, HASH_AREA_SIZE. We hadn't set MEMORY_MAX_TARGET or MEMORY_TARGET having upgraded from 10g and not got round to it yet. We also considered the hidden parameter _PGA_MAX_SIZE.

Details:  the SGA was set to 7Gb and the PGA to 1Gb. We assumed this meant that approximately 8Gb of the total 25Gb was being used by oracle. On inspection, pretty much all of the memory was being used by oracle.

Reading about pga_aggregate_target, you'll find that it represents the maximum amount of memory used by ALL of the user processes. But this isn't true. It's only a target. We had assumed that if you add up the size of the sga and the pga you get the total memory consumption. However this isn't the case. Oracle seems happy to take as much memory as it can.

Solution: the oracle user (AIX) had insufficient ulimit settings. At 11.2 the installation guide recommends unlimited setting for FILE, CPU, DATA, STACK and Real Memory. This means the following (as root)...

tail /etc/security/limits
oracle:
            fsize = -1
            core = -1
            cpu = -1
            data = -1
            rss = -1
            stack = -1
            nofiles = -1

These parameters are dynamic so there is no need to restart the database or the server, however, this should have been dealt with in the installation.

*** UPDATE: the parameters for ulimit might not be dynamic. We found that old logins to the server didn't assume the new values when we connected to the database. Also the web servers weren't showing any improvement so they may need to restart their connections.


Some other bits......

1) ulimit -Ha shows the maximum default setting but ulimit -a shows the actual settings.

2) ipcs -bm shows the memory in use by the SGA. Look for the SEGSZ (bytes) owner by the oracle user.

3) the memory used by the PGA can be assessed thus. It's return the memory in Mb.

     ps -elf | egrep " oracle${ORACLE_SID} | ora_.*_${ORACLE_SID} " | grep -v grep |
      awk '{sum += $10} END {print sum/1024}'

(You might need to change the quotes if you cut and paste this).

    This is an estimate and might by overly large. Remember the memory sizes are constantly changing.

4) svmon -G shows the server memory usage. Unfortunately the memory sizes given are in pages on AIX, so they must be multiplied by 4096 before being divided by 1024*1024 to get Mb.

5) topas on AIX is very useful. It's like 'top' on Linux. Use 'c' to switch between CPU views and 'M' to display the memory usage.

HappyJohn.



No comments:

Post a Comment