OptimalDBA

Tuesday, June 23, 2009

It's Still All About The Data

I was recently confronted with another situation where the interface rendered data useless.

A colleague was looking at an execution plan displayed in Oracle Grid Control. Most of the columns (operation, rows, bytes, cost) are either self explanatory or well known. The one that really bothered my colleague was cpu cost for the sort order by operation. The index scan and table look up were measured in the 10s of 1000s, but the sort order by was being measured in 1,000,000s (actually close to 6,000,000). Cost and io cost were less than 20, so the large number really stood out.

What is the unit of measure? According to the documentation for v$sql_plan, the column cpu_cost means "CPU cost of the operation as estimated by the optimizer's cost-based approach". No mention of a unit of measure. And a cpu cost of almost 6,000,000 to sort an expected 12 rows adds 1 to the statement cost.

A number without a unit of measure is just a number.

The second frustration with Grid Control was on the statistics page for a sql statement. I was looking at the top resource consuming user statement and found the information related to executions, cpu time, logical and physical reads, etc. Quite interesting data, but ultimately useless information. The interface fails to provide any temporal context to the numbers.

8,000,000,000 logical reads is certainly impressive, as are 600,000,000 executions. But what is the sampling time frame? An hour, a day, since the last hard parse, since instance startup? Without the relevant time information, the impact of optimization cannot be known.

Temporal context is critical to time relevant numbers.

Sunday, June 21, 2009

It's all about the Data!

One of my riding routes is the path along the South Platte River. Because the river has a tendency to run high during the spring melt, signs have been placed at the underpasses to indicate that the path might be underwater. Unfortunately, these signs stay up all year and you quickly learn to ignore them as 99.99% of the time you ride the path is dry and clear. Cyclists become conditioned to ignore these signs, with potentially damaging results when the river rises.

Think about a program that warns you about viruses every time you start your computer. The first few times you run your antivirus program and find nothing. After a few reboots, you begin to ignore the warning...only to have it actually report a virus one day. A virus that takes out all your personal data.

If data displayed is inaccurate or irrelevant, people become conditioned to ignore it. If the data is critical, only display it when the situation requires. If the signs go up only when the path is under water or a box displayed only when a virus is actually found, people will pay attention (or at least are more likely to pay attention).

I recently purchased a car after doing research online. Vehix.com is a pretty good site with lots of cars listed and over 30 filtering options (year, number of doors, trim/style). It did not take long for me to figure out the car I wanted, but searching for a 2005 - 2007 Honda Accord Sedan 4 cylinder automatic transmission EX trim level gray fabric interior was not easy. Although each of these conditions could be specified, it was pretty pointless.

The quality of search on Vehix depends on the quality of data input. It is not unusual to find that Unspecified is the most common subcategory. What is the point of a search category if 90% of the data input is useless for filtering?

In looking for a place to stay for my ride in the Courage Classic Cycling Tour, I used VRBO.com. One of the "features" is a calendar indicating if the unit is available or booked. Unfortunately, this information is often incorrect. Not one of the units I inquired about was available for the time I needed, despite the calendar showing it available. In fact, several of the responses indicated that the unit had been booked for several months!

Interfaces are fluff and noise if the underlying data is not accurate and usable.

Wednesday, April 29, 2009

Advert - Online SQL Tuning Workshop

Frustrated with poorly performing SQL? Unsure where to begin the tuning process? Then join Daniel Fink, Oracle performance specialist, for an Online SQL Tuning Workshop. This hands-on course will provide you with the skills necessary to create scalable, high performance SQL-based Oracle applications (proactive SQL tuning). You will also learn how to identify existing high-load, poor performing SQL statements and tune them (reactive SQL tuning). Particular attention is paid to making the best use of Oracle architecture, bench-marking various SQL formulations and identifying the best schema object (e.g. heap table, IOT, etc.) for a given situation. A full description can be found at http://www.skillbuilders.com/instructor-led-training/Course_outlines/oracle-10g-sql-tuning.cfm

This class runs during the afternoons (1:00 PM through 5:00 PM EDT) of Monday May 4th - Friday May 8th. All you need is an internet and audio (phone or VOIP) connection. No travel, reduced time away from work. The class is limited to 6 students. To register, please call Gary at +1-888-803-5607 or email gary@skillbuilders.com. Mention OptimalDBA for a 15% discount for the class.

Wednesday, April 15, 2009

Damnit Oracle!

/* This is on Oracle 10gR2 on AIX. I have not checked it on other releases or platforms*/

In checking the logon triggers, I came across a little situation.

select trigger_name, trigger_body
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON'

no rows selected

SELECT distinct triggering_event
from dba_triggers
where owner = 'DEMO'

TRIGGERING_EVENT
----------------------------------------
INSERT
LOGON
INSERT OR UPDATE OR DELETE
UPDATE OR DELETE
UPDATE
INSERT OR UPDATE

6 rows selected.

So...LOGON is certainly a valid event, so I decide to retype the query and rerun it...

select trigger_name, trigger_body
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON'

no rows selected

After trying the query several times and expecting different results (yes...I know...the definition of insanity), I checked the actual data being returned using the DUMP function.


select distinct triggering_event, dump(triggering_event)
from dba_triggers

TRIGGERING_EVENT
----------------------------------------------------------------------------------------------------
DUMP(TRIGGERING_EVENT)
----------------------------------------------------------------------------------------------------
UPDATE
Typ=1 Len=6: 85,80,68,65,84,69

UPDATE OR DELETE
Typ=1 Len=16: 85,80,68,65,84,69,32,79,82,32,68,69,76,69,84,69

INSERT OR UPDATE OR DELETE
Typ=1 Len=26: 73,78,83,69,82,84,32,79,82,32,85,80,68,65,84,69,32,79,82,32,68,69,76,69,84,69

DELETE
Typ=1 Len=6: 68,69,76,69,84,69

DROP
Typ=1 Len=5: 68,82,79,80,32

LOGON
Typ=1 Len=6: 76,79,71,79,78,32

INSERT
Typ=1 Len=6: 73,78,83,69,82,84

INSERT OR UPDATE
Typ=1 Len=16: 73,78,83,69,82,84,32,79,82,32,85,80,68,65,84,69

It seems that the view includes a single space (ascii 32) to the end of some of the events. In looking at the view text, I see this when they concatenate triggering events together. Way to code a view!

So if you are using the TRIGGERING_EVENT as part of a predicate, add a trailing space or wrap it in the TRIM() function or you may not get the results you want!

select trigger_name
from dba_triggers
where owner = 'DEMO'
and triggering_event = 'LOGON '

TRIGGER_NAME
------------------------------
KATHY_START_TRACE
TIM_START_TRACE
START_TRACE
DFINK_START_TRACE
CLOSE_START_TRACE

5 rows selected.

select trigger_name
from dba_triggers
where owner = 'DEMO'
and TRIM(triggering_event) = 'LOGON'

TRIGGER_NAME
------------------------------
KATHY_START_TRACE
TIM_START_TRACE
START_TRACE
DFINK_START_TRACE
CLOSE_START_TRACE

5 rows selected.

Thursday, February 19, 2009

When is a sql statement too long?

One of my tasks is to diagnose and optimize sql statements. One of the challenges I face is when do you not try to optimize the statement and just push it back to development with the request to "start over"? Perhaps even as far as looking at their data model?

There are certainly some red flags
  • Multiple accesses of the same table
  • UNIONs where the tables and columns are the same, but predicates are different
  • 2+ levels of nested subqueries
  • Queries that take several pages to display
It is the last red flag that is of interest today. How long is too long for a sql statement? I have seen some monsters (especially from Oracle's own applications/code). One that I am currently looking at is over 400 lines (with the column lists concatenated into single lines). The problem with these long statements is where do you begin the diagnosis process? Answer - you begin by going back to the developer and ask 'What are you trying to achieve with this query?'.

One metric I came up with (at lunch...so it's totally untested or thought out...should make for good debate fodder) is to compare the number of items in the column list with the number of lines.
  • Each column is on a separate line
  • Expressions (CASE, DECODE) are considered a single column
  • Each table in the FROM clause is on a separate line
  • Each predicate clause is on a separate line
  • Views are expanded into the statement
  • ORDER BY and GROUP BY clauses are ignored for the purposes of calculation
Let's say that each column comes from a different table and has a single filter condition. Add in the join condition and you have 4 lines per column per sql statement. Of course, many statements pull multiple columns from a single table and have compound filters or joins. This leaves me still at 4 lines per column per statement. This means that 75% of the statement are in the FROM and WHERE/HAVING clauses.

If there are more than 3 lines in the FROM and WHERE/HAVING clauses in a sql statement for every column/expression in the column list, the statement is a prime candidate for returning to development with a request to "start over".

The one question that is unanswered and may be unanswerable is "Is a long sql statement an indication of a bad data model?"

Friday, January 23, 2009

Multiple Databases?

A colleague recently asked me about using separate databases for a single business function/project. With a little thought (he needed feedback quickly), I came up with four reasons.

  • Security - the data/access must be totally separated from other data/access. While the usage of schemas, privileges, roles, etc. can be used to restrict access within an application, it does not restrict access to those with broad privileges (such as the dba team). If there is a legal or other requirement that no one be able to access both sets of data, then multiple databases makes sense.
  • Recovery - different schemas may have different recovery requirements. If the database is several hundred terabytes, but only a few hundred gigabytes is business critical, then it makes sense to separate the business critical component. In case of database recovery, the business critical data can be restored promptly while the remainder is recovered less urgently.
  • Performance - although much of the performance setting has been made 'automatic', there are still times where different applications need different settings/configuration for optimal performance. While some of these settings can be handled via login triggers or sql hints, it may be easier to separate them out into separate databases. Each release seems to reduce the need for different parameters, so this issue is decreasing in importance.
  • Features/Versioning - this is probably the main reason for separate databases. When a single database is used for multiple applications supplied by different vendors, you can find yourself in a situation where the vendor requirements are in conflict. VendorA needs FeatureA, but that feature actually breaks the code for VendorB. Not only do you need separate databases, but you often need separate Oracle versions/homes.
Well...that's what I came up with (in a few moments notice). I'd like to hear what I am missing or just plain wrong about!

User Groups

One of my New Year's Resolutions is to be more active in regional Oracle User Groups. I have to admit I've been slacking off with my participation for RMOUG and I will be changing that for this year.

At a time when training budgets are being cut, regional user groups offer tremendous value for the money. Memberships are usually low cost and meetings are often free for members. The larger training events, like RMOUG Training Days, usually cost a few hundred dollars, but pay for themselves in the quality and convenience. If you are willing to volunteer, the cost is even less...and the experience more valuable.

Over the years, I have volunteered as Program Chair and Newsletter editor for RMOUG. One of the constant challenges was getting presentations and articles. Presenting and/or writing is a great way to share your knowledge and learn something new. One of my first articles for RMOUG's newsletter was an exploration of rollback segments...a topic I knew little about when I decided to write the article.

If you are part of a regional Oracle User Group and are looking for articles and presentations, please contact me at daniel.fink@optimaldba.com. I'd love to contribute and help your group with their mission to increase the Oracle knowledge and skills of it's members.

Tuesday, January 13, 2009

KCOUG - Advanced SQL

On January 27th, I will be presenting on Advanced SQL at the Kansas City OUG quarterly meeting.

This presentation covers Analytical Functions and Regular Expressions, two topics that I find very useful, but not well understood. The content is very heavy on actual implementation (about 1/3 of the time is demo). I have been using analytical functions to report on performance metrics for the past few years and have found them to be very useful in minimizing table access and maximizing performance of sometimes complicated queries. I recently used regular expressions to remove multiple white spaces from sql text with a single query instead of a pl/sql loop.

Advanced registration is not required. The exact time has not been set, but it will be in the evening. Check the www.kcoug.org site for updates.

Followers

Blog Archive

About Me