OptimalDBA
Tuesday, June 23, 2009
It's Still All About The Data
A colleague was looking at an execution plan displayed in Oracle Grid Control. Most of the columns (operation, rows, bytes, cost) are either self explanatory or well known. The one that really bothered my colleague was cpu cost for the sort order by operation. The index scan and table look up were measured in the 10s of 1000s, but the sort order by was being measured in 1,000,000s (actually close to 6,000,000). Cost and io cost were less than 20, so the large number really stood out.
What is the unit of measure? According to the documentation for v$sql_plan, the column cpu_cost means "CPU cost of the operation as estimated by the optimizer's cost-based approach". No mention of a unit of measure. And a cpu cost of almost 6,000,000 to sort an expected 12 rows adds 1 to the statement cost.
A number without a unit of measure is just a number.
The second frustration with Grid Control was on the statistics page for a sql statement. I was looking at the top resource consuming user statement and found the information related to executions, cpu time, logical and physical reads, etc. Quite interesting data, but ultimately useless information. The interface fails to provide any temporal context to the numbers.
8,000,000,000 logical reads is certainly impressive, as are 600,000,000 executions. But what is the sampling time frame? An hour, a day, since the last hard parse, since instance startup? Without the relevant time information, the impact of optimization cannot be known.
Temporal context is critical to time relevant numbers.
Sunday, June 21, 2009
It's all about the Data!
Think about a program that warns you about viruses every time you start your computer. The first few times you run your antivirus program and find nothing. After a few reboots, you begin to ignore the warning...only to have it actually report a virus one day. A virus that takes out all your personal data.
If data displayed is inaccurate or irrelevant, people become conditioned to ignore it. If the data is critical, only display it when the situation requires. If the signs go up only when the path is under water or a box displayed only when a virus is actually found, people will pay attention (or at least are more likely to pay attention).
I recently purchased a car after doing research online. Vehix.com is a pretty good site with lots of cars listed and over 30 filtering options (year, number of doors, trim/style). It did not take long for me to figure out the car I wanted, but searching for a 2005 - 2007 Honda Accord Sedan 4 cylinder automatic transmission EX trim level gray fabric interior was not easy. Although each of these conditions could be specified, it was pretty pointless.
The quality of search on Vehix depends on the quality of data input. It is not unusual to find that Unspecified is the most common subcategory. What is the point of a search category if 90% of the data input is useless for filtering?
In looking for a place to stay for my ride in the Courage Classic Cycling Tour, I used VRBO.com. One of the "features" is a calendar indicating if the unit is available or booked. Unfortunately, this information is often incorrect. Not one of the units I inquired about was available for the time I needed, despite the calendar showing it available. In fact, several of the responses indicated that the unit had been booked for several months!
Interfaces are fluff and noise if the underlying data is not accurate and usable.
Wednesday, April 29, 2009
Advert - Online SQL Tuning Workshop
Frustrated with poorly performing SQL? Unsure where to begin the tuning process? Then join Daniel Fink, Oracle performance specialist, for an Online SQL Tuning Workshop. This hands-on course will provide you with the skills necessary to create scalable, high performance SQL-based Oracle applications (proactive SQL tuning). You will also learn how to identify existing high-load, poor performing SQL statements and tune them (reactive SQL tuning). Particular attention is paid to making the best use of Oracle architecture, bench-marking various SQL formulations and identifying the best schema object (e.g. heap table, IOT, etc.) for a given situation. A full description can be found at http://www.skillbuilders.com/instructor-led-training/Course_outlines/oracle-10g-sql-tuning.cfm
This class runs during the afternoons (1:00 PM through 5:00 PM EDT) of Monday May 4th - Friday May 8th. All you need is an internet and audio (phone or VOIP) connection. No travel, reduced time away from work. The class is limited to 6 students. To register, please call Gary at +1-888-803-5607 or email gary@skillbuilders.com. Mention OptimalDBA for a 15% discount for the class.
Wednesday, April 15, 2009
Damnit Oracle!
In checking the logon triggers, I came across a little situation.
select trigger_name, trigger_body
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON'
no rows selected
SELECT distinct triggering_event
from dba_triggers
where owner = 'DEMO'
TRIGGERING_EVENT
----------------------------------------
INSERT
LOGON
INSERT OR UPDATE OR DELETE
UPDATE OR DELETE
UPDATE
INSERT OR UPDATE
6 rows selected.
So...LOGON is certainly a valid event, so I decide to retype the query and rerun it...
select trigger_name, trigger_body
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON'
no rows selected
After trying the query several times and expecting different results (yes...I know...the definition of insanity), I checked the actual data being returned using the DUMP function.
select distinct triggering_event, dump(triggering_event)
from dba_triggers
TRIGGERING_EVENT
----------------------------------------------------------------------------------------------------
DUMP(TRIGGERING_EVENT)
----------------------------------------------------------------------------------------------------
UPDATE
Typ=1 Len=6: 85,80,68,65,84,69
UPDATE OR DELETE
Typ=1 Len=16: 85,80,68,65,84,69,32,79,82,32,68,69,76,69,84,69
INSERT OR UPDATE OR DELETE
Typ=1 Len=26: 73,78,83,69,82,84,32,79,82,32,85,80,68,65,84,69,32,79,82,32,68,69,76,69,84,69
DELETE
Typ=1 Len=6: 68,69,76,69,84,69
DROP
Typ=1 Len=5: 68,82,79,80,32
LOGON
Typ=1 Len=6: 76,79,71,79,78,32
INSERT
Typ=1 Len=6: 73,78,83,69,82,84
INSERT OR UPDATE
Typ=1 Len=16: 73,78,83,69,82,84,32,79,82,32,85,80,68,65,84,69
It seems that the view includes a single space (ascii 32) to the end of some of the events. In looking at the view text, I see this when they concatenate triggering events together. Way to code a view!
So if you are using the TRIGGERING_EVENT as part of a predicate, add a trailing space or wrap it in the TRIM() function or you may not get the results you want!
select trigger_name
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON '
TRIGGER_NAME
------------------------------
KATHY_START_TRACE
TIM_START_TRACE
START_TRACE
DFINK_START_TRACE
CLOSE_START_TRACE
5 rows selected.
select trigger_name
from dba_triggers
where owner = 'DEMO'
and TRIM(triggering_event) = 'LOGON'
TRIGGER_NAME
------------------------------
KATHY_START_TRACE
TIM_START_TRACE
START_TRACE
DFINK_START_TRACE
CLOSE_START_TRACE
5 rows selected.
Thursday, February 19, 2009
When is a sql statement too long?
There are certainly some red flags
- Multiple accesses of the same table
- UNIONs where the tables and columns are the same, but predicates are different
- 2+ levels of nested subqueries
- Queries that take several pages to display
One metric I came up with (at lunch...so it's totally untested or thought out...should make for good debate fodder) is to compare the number of items in the column list with the number of lines.
- Each column is on a separate line
- Expressions (CASE, DECODE) are considered a single column
- Each table in the FROM clause is on a separate line
- Each predicate clause is on a separate line
- Views are expanded into the statement
- ORDER BY and GROUP BY clauses are ignored for the purposes of calculation
If there are more than 3 lines in the FROM and WHERE/HAVING clauses in a sql statement for every column/expression in the column list, the statement is a prime candidate for returning to development with a request to "start over".
The one question that is unanswered and may be unanswerable is "Is a long sql statement an indication of a bad data model?"
Friday, January 23, 2009
Multiple Databases?
- Security - the data/access must be totally separated from other data/access. While the usage of schemas, privileges, roles, etc. can be used to restrict access within an application, it does not restrict access to those with broad privileges (such as the dba team). If there is a legal or other requirement that no one be able to access both sets of data, then multiple databases makes sense.
- Recovery - different schemas may have different recovery requirements. If the database is several hundred terabytes, but only a few hundred gigabytes is business critical, then it makes sense to separate the business critical component. In case of database recovery, the business critical data can be restored promptly while the remainder is recovered less urgently.
- Performance - although much of the performance setting has been made 'automatic', there are still times where different applications need different settings/configuration for optimal performance. While some of these settings can be handled via login triggers or sql hints, it may be easier to separate them out into separate databases. Each release seems to reduce the need for different parameters, so this issue is decreasing in importance.
- Features/Versioning - this is probably the main reason for separate databases. When a single database is used for multiple applications supplied by different vendors, you can find yourself in a situation where the vendor requirements are in conflict. VendorA needs FeatureA, but that feature actually breaks the code for VendorB. Not only do you need separate databases, but you often need separate Oracle versions/homes.
User Groups
At a time when training budgets are being cut, regional user groups offer tremendous value for the money. Memberships are usually low cost and meetings are often free for members. The larger training events, like RMOUG Training Days, usually cost a few hundred dollars, but pay for themselves in the quality and convenience. If you are willing to volunteer, the cost is even less...and the experience more valuable.
Over the years, I have volunteered as Program Chair and Newsletter editor for RMOUG. One of the constant challenges was getting presentations and articles. Presenting and/or writing is a great way to share your knowledge and learn something new. One of my first articles for RMOUG's newsletter was an exploration of rollback segments...a topic I knew little about when I decided to write the article.
If you are part of a regional Oracle User Group and are looking for articles and presentations, please contact me at daniel.fink@optimaldba.com
Tuesday, January 13, 2009
KCOUG - Advanced SQL
This presentation covers Analytical Functions and Regular Expressions, two topics that I find very useful, but not well understood. The content is very heavy on actual implementation (about 1/3 of the time is demo). I have been using analytical functions to report on performance metrics for the past few years and have found them to be very useful in minimizing table access and maximizing performance of sometimes complicated queries. I recently used regular expressions to remove multiple white spaces from sql text with a single query instead of a pl/sql loop.
Advanced registration is not required. The exact time has not been set, but it will be in the evening. Check the www.kcoug.org site for updates.