Tuesday, December 11, 2007

How useful is the wait interface?

I write this blog with all due respect for Cary Millsap, Gaja Vaidyanatha, K Gopal, Richmond Shee and Kirti Deshpande. They have all made amazing strides in troubleshooting and performance optimization and I hold them in the highest regard. In no way am I criticizing their ideas and hard work, I am simply challenging current thought as they have all challenged it (and continue to do so).

I was recently preparing a presentation on "Why Tune SQL?" for the UKOUG. This is a contrarian presentation in more ways than one. In terms of format, it contained all of 8 slides (1 title slide, 3 "blank" slides, 3 content slides and 1 end slide) for 45 minutes of talk (and I used all 45). The focus was on the idea that all of our attempts to tune SQL gain us little because SQL is not the root of the problem...but that is fodder for another blog entry.

One of the topics I covered was the lack of instrumentation in non-database tiers. As I was preparing this part of my presentation, a thought occurred to me. A very contrarian thought and one that is sure to cause some controversy. A subsequent discussion with a fellow member of the Oak Table has convinced me that this is an issue to be raised.

Democracy is the worst form of government except for all those others that have been tried.” - Winston Churchill

How useful is the wait (or timed event) interface?

I do not ask this question to disparage the wait interface. Actually, my original assertion was “The wait interface is nearly useless”. This statement is far too aggressive and confrontational, but the idea remains the same. The critical point that I am trying to make is that there are some problems with the interface, but having some instrumentation is better than not having any!

My intention is not to dissuade people from using the wait interface. Quite the opposite. It has and will continue to be a valuable tool for diagnosing performance problems. It cannot and will not be the only tool....no tool can be the one and only. Every tool offers you a view of the data/situation that the tool developer thinks is important. Every tool has limitations, blind spots, misinterpretations...flaws. This does not mean the tool is useless, we just have to know how and when to use the tool.

I point out the problems with the instrumentation for two reasons. First, we need to understand it's limitations. Second, Oracle (and others) need to continue to enhance their instrumentation layer so we can see more, analyze more and ultimately optimize more.

What timed events don't tell you

The wait interface is part of the current implementation of Oracle kernel instrumentation. It was introduced in version 7 and continues to 11g. Each release adds to the number of events (calls) that are instrumented. New events are not always related to new functionality; they may be instrumentation of existing code.

Oracle instrumentation often tells us where the session is NOT spending time in the database tier. Unfortunately, the system tiers that are outside of the database lag far behind Oracle in terms of instrumentation. Application, network, operating system and storage tier instrumentation is nonexistent, inconsistent or plain hard to understand (at least for me). Nothing is more frustrating to me than to work with my counterparts in the non-DBA world and
ask them where time is being spent. Much of the time they throw up their hands and say “I don't know” (usually followed by blaming the database tier for the poor performance).

Instrumentation can tell you that something has occurred or is occurring, but it cannot tell you why. I can see 'db file sequential read' as the top timed event, but that does not tell my why. Excess I/O, slow SAN, read consistent view generation? I can see that a session is spending a large amount of time requesting data from disk, but I cannot tell whether there are issues with the operating system, server hardware or storage array. A single statement can have multiple dbfs events with the time ranging from several microseconds to several seconds. So...why such a spread in time? I can't tell you...because the wait interface can't tell you.

Or “SQL*Net message from client”, a supposed idle event. In reality, it often is a high contributor to response time because it is the 'bucket' where user and application time goes, certainly an area where response time is consumed. And yet, we are told to ignore this time and tools don't report on it. Time and again, I am asked to diagnose a user's session that is “hung” only to find that the database session is waiting on SQL*Net message from client...but the user is still seeing the spinning hour glass. Obviously, the problem is somewhere else...but where? And the user does not care, they want to get their work done, so the DBA sounds like he/she is passing the buck when, in reality, we probably know much more about what a session is doing/has done than any of the other tiers in the system stack.

What about all the other time?

Much of the time spent “inside” the database is not instrumented! Logical I/O, CPU activity, etc. is not instrumented. Some years ago, I did some research (along with Kirti Deshpande) on tracing logical i/o. We found several 'events' we could set to dump out information on logical i/o. All of it was undocumented, much of it was indecipherable and none of it included any timing information. I could look at trace files to see that a session was creating consistent versions of the block...but I could not tell how long it was taking.

What about CPU time that is not logical i/o? In 10g, some timing information is being exposed thanks to the DB Time model. I can see some summary information on parse time, pl/sql time, java time, etc. Definitely an improvement...but the devil is often in the details. I can query v$sesstat and see the “CPU used by this session” value increasing, but I can't tell you much more. My next step for diagnosing problems is to drill down...but the lack of more detailed instrumentation has stopped me.

Profiler obfuscation

When running extended sql trace, it is not unusual for a single business process/user action to generate a trace file that is millions of lines long. While this makes for enlightening reading, it also slows the process of problem determination when performing troubleshooting. In order to summarize the information, the trace file is processed by a profiler, which aggregates the information and removes almost all of the detail.

By removing these details, the profiler loses key troubleshooting information. For the same action performed by 2 different users, one has 100 “db file sequential reads” while the other has 100,000 “db file sequential reads”. Or the first session took 100 minutes for “db file sequential reads”, while the second took 1000 seconds. Why the difference? Different execution plans, read consistency, non-database problems are all possible explanations.

One of the effects of current profiling technology is that it hides variations in the detail. I can see that there were 10 “db file sequential read” events consuming 10 seconds. I do not know if there were 10 1 second events or 1 9 second event and 9 1/9th of a second events. Unless I dive into the actual trace file and read each timed event call. And even that won't tell me why there was an even distribution or a skew.

Instrumentation is absolutely critical

We need two kinds of instrumentation at every level, time line instrumentation and resource instrumentation (time is a resource...but it is important enough that I separate it out). Without this instrumentation we are left with vast parts of the system stack we cannot analyze. We cannot optimize what we cannot analyze! We can only guess.

We have to find the balance between instrumentation and the intrusion effect. If we instrumented everything in detail...we would likely see that the time/resource required for instrumentation was greater than the time/resource required for the actual work.

Consider the alternative

The wait interface is somewhat useless...the only thing more useless is not having a wait interface at all. If we don't have it or, worse, don't use it...we are working blind. We guess.

What can we do? We can continue to raise the issue of instrumentation within our community and organizations. If you are a developer, look for ways to instrument your code. Perhaps not every call, but look at the main calls, the main groups of actions. Start somewhere!


I am reminded of the fable of the blind men and the elephant. Until we can see all of the elephant in detail, our understanding is incomplete and perhaps just plain wrong.

7 comments:

Gary Myers said...

Interesting topic.
As well as Waits, Stats (especially session stats) can be useful and it would be useful to see, for example, "undo records read" for SQL statements as a contributor to the workload.
However unless someone can put some 'typical' costs (or common basis) to those stats it can be hard to prioritise which ones are worth looking at.
Waits have that inherent time component to use for prioritising.

[PS. Develop your Apps in Apex, and get rid of all that mid-tier crud. User -> Browser -> DB.]

Noons said...

Much of the time they throw up their hands and say “I don't know” (usually followed by blaming the database tier for the poor performance).

Aye! Wish I had a buck for everytime I heard that one.

Interesting post. And very pertinent to us: just found out we had lost a disk in the SAN purely because I decided to run my own IO benchmark instead of relying only on Oracle's and AIX IO stats.

dombrooks said...

Interesting post and interesting presentation - I attended that in Birmingham.

It's so true that instrumentation is poor everywhere and the truth hits home the hardest when you've got a problem.

BUT, in my experience, at least 80% of the time, a performance problem does originate in the database and >90% of the time it is a poor SQL statement.

So, why tune SQL? Because most of the people (non-Oracle specialists) writing the SQL haven't given it enough attention/thought during development and before release. And don't get me started on why the problem SQL wasn't then picked up in testing...

yas said...

I especially agree with the part about the instrumentation of the cpu activity. Being able to know what percentage of cpu time is spent on what (logical I/O?, reading undo blocks?, etc...) would help very much in some cases.

Robyn said...

The wait interface is useful but it doesn't solve everything, and unless other layers follow suit, it will continue to be an incomplete view.

The wait interface provides clues; the reader still has to follow the trail and figure out the solution. Sometimes I think the prime benefit is that it provides documentation of suspected issues. It's much easier to get code rewritten or new resources with clear evidence.

Nice post ... Robyn

Allan Nelson said...

So.....we have Linux, Apache, enough open source J2EE stacks. Let's start lobbying or coding. This could be a clear differentiator between open source and closed source. Even Oracle could get involved. They have their own distribution and could extend the wait event concept into Linux. As Oracle works on instrumenting the DB could not some value be added to other tiers and if it is really a good idea, perhaps its advantages could be shown by that method.

Anonymous said...

The article is certainly interesting and absolutely correct, application instrumentation is vital and should ideally be considered when writing applications. In addition documentation regarding the wait interface is limited especially when wanting to dig into the detail. We have foreground and background events which is fine but the relationships between wait events is difficult to establish, some are clearly defined e.g log file sync included log file parallel write, LGWR wait on LNS (dataguard config) e.t.c but others are not defined so building up a clear breakdown of response time is not easy, I have yet to establish one with certainty.

Followers

Blog Archive

About Me