Performance Issues? Don’t Panic… Yet.
As a database administrator, ensuring optimal database performance is one of your main duties, and one that should not be taken lightly (unless, of course, you like getting phoned in the middle of the night). Throughout your career you are likely to face performance problems from time to time. The paragraphs below explain, at a high level, some steps you can take to troubleshoot and fix performance problems. The information presented is gathered from my own experience solving such problems and from material I have read over time.
Let’s Clear the Air Before the Tuning Begins
Before I proceed, I would like to state that there are a myriad of reasons why a database or application could be performing slowly (it may not even be a database problem; it could be the OS, the network, storage, etc.), and there exists an even greater number of techniques that could be employed to fix them. Covering performance tuning in its entirety would require an entire bookshelf, and I could not hope to touch on all the techniques, not even all of those I know of, in this blog post.
I would also like to state upfront that “performance” as an issue has to do with user expectations of how fast or slow an application or SQL statement should run. Strictly speaking, the database itself is never fast or slow; rather, the sessions that connect to it and run SQL against it can slow down when they hit a bump, and DML statements are the ones most prone to such roadblocks. Your main goal in performance tuning is to reduce response time and improve throughput (that is, to optimize and reduce the cost of system resources).
Who Broke the Database? Let’s Investigate… and also: THINK!
The first step in performance tuning is to identify the problem SQL. This can be done with a query like the following:
SELECT
sess.username,
sess.sid,
sess.serial#,
sess.sql_id,
sqlst.executions,
ROUND (sqlst.elapsed_time/1000000, 2) total_time,
ROUND (sqlst.cpu_time/1000000, 2) cpu_seconds
FROM V$SESSION sess
INNER JOIN V$SQLSTATS sqlst ON sess.sql_id = sqlst.sql_id
ORDER BY sqlst.elapsed_time DESC
FETCH FIRST 10 ROWS ONLY
/
-- The above query ranks statements by total elapsed time.
-- You can also rank the statements according to CPU seconds used.
-- Once you have the value for the SQL_ID column for the problematic SQL,
-- it's a simple matter to find out the execution plan for this statement
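For example, with DBMS_XPLAN (the &sql_id below is a SQL*Plus substitution variable; supply the SQL ID you found):
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'TYPICAL'));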
After identifying the problematic session or SQL statement by querying v$session and v$sqlstats, the next step to fixing any performance issue is to step back from the SQL and focus on the business requirement. Attempt to answer the question:
What is this statement trying to achieve, and is the means employed to achieve it the best?
Is it possible that the results could easily be obtained from another table? Perhaps this data is a subset of an already existing report? Or perhaps it is not needed at all? Would it be better to re-write this SQL from scratch? Generally, consider what the statement is trying to accomplish and ask “could this be done another way?” first, rather than jumping in to fix the means that has been employed to accomplish it.
Assuming the means employed does fit the requirement, then start looking at the problem SQL and its execution plan, which shows you the steps the optimizer took to resolve the particular statement. One of the first things to look at in the execution plan is the cardinality.
When Cardinality lies, the optimizer cries
Cardinality, in the context of an execution plan, refers to the number of rows the optimizer estimates each step of the plan will return. For every SQL statement the optimizer produces such estimates, and if they differ significantly from the actual number of rows returned, the optimizer cannot produce a good plan. A large mismatch is an indication that the optimizer does not have good statistics: the stats may be stale, or the query may need more than regular stats, perhaps extended statistics or a histogram, as the issue could be related to column correlation and/or skew depending on the query being run. First attempt to fix the issue by making sure the optimizer has the necessary stats, then run the query again. Stats can be gathered using the DBMS_STATS package.
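For instance, stats on a single table might be refreshed as sketched below; the APP schema and ORDERS table are placeholders for your own objects, as are the COUNTRY and CURRENCY columns in the extended-stats example:
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP',
    tabname    => 'ORDERS',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO',  -- let Oracle decide where histograms help
    cascade    => TRUE);                        -- gather index statistics too
END;
/
-- For correlated columns, a column group (extended stats) can be created before gathering:
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('APP', 'ORDERS', '(COUNTRY, CURRENCY)') FROM dual;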
To see the cardinality, run the query with the /*+ GATHER_PLAN_STATISTICS */ hint and compare these two columns in the execution plan output:
E-Rows vs A-Rows
E-Rows: This shows the estimated rows; the number of records the optimizer “thought”, or expected, a particular step in the plan would return
A-Rows: The actual number of rows that were returned by that step
The closer these numbers are to each other, the greater the chance the optimizer will pick a good plan.
Consider the output below:
SQL> SELECT * FROM dbms_xplan.display_cursor('5z04arnxmfmrm', format => 'ALLSTATS LAST');
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 5z04arnxmfmrm, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ firstname || ' ' || lastname as
"customer_name" , order#, '$'|| paideach || ' CAD' as "total_paid",
model || ' ' || storage as "Item(S) bought" from customers natural join
orders natural join orderitems natural join inventory order by 3 DESC
Plan hash value: 553642985
-------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 35 |00:00:00.01 | 8 | | | |
| 1 | SORT ORDER BY | | 1 | 33 | 35 |00:00:00.01 | 8 | 4096 | 4096 | 4096 (0)|
|* 2 | HASH JOIN | | 1 | 33 | 35 |00:00:00.01 | 8 | 1162K| 1162K| 1342K (0)|
|* 3 | HASH JOIN | | 1 | 35 | 35 |00:00:00.01 | 6 | 1265K| 1265K| 1350K (0)|
| 4 | MERGE JOIN | | 1 | 35 | 35 |00:00:00.01 | 4 | | | |
| 5 | TABLE ACCESS BY INDEX ROWID| INVENTORY | 1 | 35 | 35 |00:00:00.01 | 2 | | | |
| 6 | INDEX FULL SCAN | INVENTORY_PK | 1 | 35 | 35 |00:00:00.01 | 1 | | | |
|* 7 | SORT JOIN | | 35 | 35 | 35 |00:00:00.01 | 2 | 2048 | 2048 | 2048 (0)|
| 8 | TABLE ACCESS FULL | ORDERITEMS | 1 | 35 | 35 |00:00:00.01 | 2 | | | |
| 9 | TABLE ACCESS FULL | ORDERS | 1 | 34 | 34 |00:00:00.01 | 2 | | | |
| 10 | TABLE ACCESS FULL | CUSTOMERS | 1 | 31 | 33 |00:00:00.01 | 2 | | | |
-------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CUSTOMERS"."CUSTOMER#"="ORDERS"."CUSTOMER#")
3 - access("ORDERS"."ORDER#"="ORDERITEMS"."ORDER#")
7 - access("ORDERITEMS"."PRODUCT_ID"="INVENTORY"."PRODUCT_ID")
filter("ORDERITEMS"."PRODUCT_ID"="INVENTORY"."PRODUCT_ID")
Note
-----
- this is an adaptive plan
In the above example, E-Rows and A-Rows are close, so there is a good chance the optimizer picked a correct plan.
Smart SQL, Happy Database
If stats are fine, then look at the SQL itself; there are a multitude of things about the way a SQL statement is written that can greatly affect its performance. With a quick scan through the SQL, you can easily spot these, and fixing them can provide instant performance benefits. Some things to look out for include:
- Using functions on predicate columns, e.g.
SELECT employee_id, name, salary
FROM emp
WHERE name = upper('john');
-- is better than
SELECT employee_id, name, salary
FROM emp
WHERE upper(name) = 'JOHN';
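If the requirement truly is a case-insensitive match, a function-based index lets the second form stay efficient as well; a one-line sketch (the index name is made up):
CREATE INDEX emp_upper_name_idx ON emp (upper(name));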
- Use of string literals instead of bind variables, causing excessive hard parsing (see the sketch after this list)
- Row-by-row fetches instead of bulk collects (also covered in that sketch)
- Using UNION where UNION ALL would suffice
- Nested sorts in a subquery
For example, consider the following query and its plan:
SELECT * FROM
(SELECT
upper(model)
,manufacturer
,round(cost,2) total_cost
,round(retail,2) retail_price
FROM
( SELECT *
FROM inventory
ORDER BY cost DESC
)
)
WHERE rownum <= 10
/
...
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.11 | | | |
| 1 | SORT ORDER BY | | 1 | 10 | 10 |00:00:00.11 | 2048 | 2048 | 2048 (0)|
|* 2 | COUNT STOPKEY | | 1 | | 10 |00:00:00.10 | | | |
| 3 | VIEW | | 1 | 2283 | 10 |00:00:00.10 | | | |
|* 4 | SORT ORDER BY STOPKEY| | 1 | 2283 | 10 |00:00:00.10 | 2048 | 2048 | 2048 (0)|
|* 5 | FIXED TABLE FULL | X$KKSSQLSTAT | 1 | 2283 | 971 |00:00:00.10 | | | |
--------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=10)
4 - filter(ROWNUM<=10)
5 - filter((INTERNAL_FUNCTION("CON_ID") AND "INST_ID"=USERENV('INSTANCE')))
Compare it to the query below:
SELECT
upper(model)
,manufacturer
,round(cost,2) total_cost
,round(retail,2) retail_price
FROM
inventory
ORDER BY total_cost DESC
FETCH FIRST 10 ROWS ONLY;
...
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 2 | | | |
| 1 | SORT ORDER BY | | 1 | 10 | 10 |00:00:00.01 | 2 | 2048 | 2048 | 2048 (0)|
|* 2 | VIEW | | 1 | 10 | 10 |00:00:00.01 | 2 | | | |
|* 3 | WINDOW SORT PUSHED RANK| | 1 | 35 | 10 |00:00:00.01 | 2 | 2048 | 2048 | 2048 (0)|
| 4 | TABLE ACCESS FULL | INVENTORY | 1 | 35 | 35 |00:00:00.01 | 2 | | | |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
3 - filter(ROW_NUMBER() OVER ( ORDER BY ROUND("COST",2) DESC )<=10)
Both queries return the same result; however, the first is more expensive than the second because of the extra sort inside the subquery, which needs to be done only once at the end.
- Multiple commits inside the same transaction (each commit forces a log file sync)
- Missing or poorly defined indexes, e.g. filtering on a column that is in an index but is not the leading column, a B-tree index where a bitmap index fits better, or a regular index where a reverse key index could help (see: selecting an indexing strategy)
- Not taking advantage of SQL macros, which improve code reusability and reduce context switching between the PL/SQL and SQL engines
- Hints left in production code (avoid them!)
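To make the bind variable and bulk collect points concrete, here is a minimal PL/SQL sketch; it reuses the orders table from the earlier example, but the status column and its value are made up:
DECLARE
  TYPE order_tab IS TABLE OF orders%ROWTYPE;
  l_orders order_tab;
  l_status VARCHAR2(20) := 'SHIPPED';  -- hypothetical status value
BEGIN
  -- l_status is bound into the statement, so it is hard parsed once,
  -- not once for every distinct literal value.
  SELECT *
    BULK COLLECT INTO l_orders   -- one fetch call instead of row-by-row processing
    FROM orders
   WHERE status = l_status;
  DBMS_OUTPUT.PUT_LINE(l_orders.COUNT || ' orders fetched');
END;
/
(For very large result sets, BULK COLLECT with a LIMIT clause inside a loop keeps memory usage bounded.)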
There are many more such conditions that can be identified. Implementing fixes for them, along with a thorough examination of the execution plan for anything else that could be going wrong in the SQL, can go a long way toward fixing performance issues and pointing you in the right direction.
When You Need Oracle’s Magic Tricks
If none of the above helps fix the query, or you find yourself unable to pinpoint the root cause of the problem (or maybe your brain is just tired), good news! Oracle provides facilities inside the database to help you tune your queries. These include:
- The SQL Tuning Advisor, which can be passed a SQL ID and will provide recommendations for improving a sub-optimally performing statement, such as a SQL profile, index recommendations, or a SQL plan baseline (a minimal usage sketch follows this list).
- Dynamic sampling, which enables the database server to estimate missing or outdated statistics at parse time to improve the accuracy of query predicate selectivities.
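As a rough illustration of the first option, here is how a tuning task might be created and run for the SQL ID from the earlier example; the task name is arbitrary:
DECLARE
  l_task VARCHAR2(128);
BEGIN
  -- Create and execute a tuning task for the given SQL ID
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_id     => '5z04arnxmfmrm',
              time_limit => 300,                -- seconds the advisor may spend
              task_name  => 'tune_cust_orders');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task);
END;
/
-- Then read the advisor's findings and recommendations
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('tune_cust_orders') FROM dual;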
The World your query lives in, Matters
As mentioned earlier, many things can cause a performance issue. Sometimes the problem lies at the operating system or network level, sometimes with the application code, and sometimes inside the database itself, e.g. poorly set system parameters or an application design that causes lots of locking. And sometimes the requirement simply cannot be met, or the cost of meeting it is not worth the performance hit it would cause to the database; explaining this to the user, developer, or management could be the solution.
No Amount of Tuning Can Fix That Schema
Last but not least, I believe performance tuning does not begin when you are required to fix an issue, but rather while designing your application. With a poorly designed database, with no constraints (or worse, constraints defined inside the application code), columns far larger than required (making everything varchar2(4000) 🙃), no referential integrity, one large application table, unnormalized data, bit buckets, or a generic model, there is no amount of tuning that can fix your query. Proper database design, as well as adherence to normalization standards, could save you hours of “performance tuning” or, even worse, a dreaded midnight phone call!
Cheers🎊


