This is not a discussion of parsing and peeking with bind variables. Those topics are well covered by others, especially Tom Kyte and Jonathan Lewis.
A frequent diagnostic task is to determine why a sql statement has started performing poorly or that the performance is very erratic. The first question to answer is "What is different now than when it was running well?" Perhaps it is processing more data, reading more blocks from disk or memory, or using a different execution plan.
The key to querying the historical activity of a sql statement is being able to locate that statement in your performance tables (Statspack, AWR or home grown). When a statement is parsed, a sql id is assigned. The rules about formatting and literals apply, so two statements that are 'nearly identical' will get different sql_ids. If the statements are identical, then you will get the same sql_id.
As an aside, sql_id is the best way to locate a sql statement as the value of hash_value may not be "unique enough". I always thought there was only unique and nonunique, apparently there are shades of unique gray.
Two recent diagnosis tasks started with the same description "This process used to run well, now it is running slowly". One statement (Query1) used bind variables and had the same sql_id over time. The other statement (Query2) was being dynamically generated, used literals and had a different sql_id for previous runs.
For Query1, I queried the Statspack tables and found that the elapsed time for a statement had suddenly increased (though the cpu time, rows processed, buffer gets, disk reads had remained basically the same). Further diagnosis located the root cause (SAN configuration issue).
For Query2, an extended sql_trace revealed a shockingly bad execution plan. Unfortunately, as the sql_id was changing, I could not look at the historical performance (without manually reviewing all the query code in the performance tables...a task that would have taken weeks/months). We were left without any real clues as to 'what changed' and we could not duplicate better performance in any of the non-production environments.
Had Query2 used bind variables instead of literals, we could have gone back through history. This may or may not have revealed the root cause of the performance problem, but we would have had a reasonable opportunity.
Quite often diagnosis and optimization is being able to answer that question "What is different now than when it was running well?". Proper use of bind variables can be very beneficial in gathering the data to answer that question.
Wednesday, September 03, 2008
Subscribe to:
Post Comments (Atom)
3 comments:
Back in the late 80s/early 90s, at the same time as creating Statspack, my colleagues Dave Ensor and Graham Wood wrote a program (whose name I now forget) which parsed trace files and made some tuning suggestions (suitable for use against Oracle 6.0.x at the time).
As part of the parsing, SQL statements themselves were parsed and compared. One of the most common recommendation was "you have statements that only differ in use of bind variables/literals" or the differences could be in case and space, in the order of columns fetched etc; the recommendation was to use bind variables, to standardize on one statement, etc.
The same code was part of the never-released Oracle Metrics (c 1994).
Given an SQL parse tree, it's (fairly) easy to spot "similar" (though textually different) SQL statements. Should be a piece of cake to incorporate that into the Diagnostic Pack (or Statspack) if it's not there already.
Sadly I don't still have the source (and it would be copyright Oracle anyhow). Graham might be able to dig it up?
That is an option. I know that the code exists (cursor_sharing, Hotsos Profiler), so it is just a matter of recreating it. I recall creating a "replace literal characters with other characters" a few years ago (based on something I saw online, perhaps AskTom), but I cannot seem to locate it. Once I replaced the literal characters, I generated a new hash_value, which I used to match up 'similar' sql.
One area where this might be problematic is with variable in lists.
I wrote a perl/dbi script that reads v$sqltext and v$sql. It looks for literals and replaces them with tokens like 's' for a string or n for a number. The hash value for each query is stored with the modified query. When everything has been read, the program outputs the modified sql and all of the hash values for queries that were translated into the modified sql.
It's not a perfect program by any means, but it might be useful for someone who wants to make a case about using bind variables. I would be willing to donate it - maybe someone with more time and chops could clean it up a bit, fix the sorting. If you would like a copy of the program, let me know in this blog and I will sent it to you via email.
Post a Comment