Tuesday, August 22, 2006

10053 to the rescue again!

I had a curious incident the other day. I had two databases, each containing an identical copy of a table. Previously, queries referencing the table in one database had run in the same time as queries referencing the table in the other database. Now, however, one of the databases ran the query painfully slowly.

When I took a look, the execution plans for the queries were not the same, so of course the first place I looked was at the statistics. What I found was quite interesting - the "fast" table had been analyzed with a 10% sample, whereas the "slow" table had been analyzed with a 100% sample. When I re-analyzed the table with a 10% sample size, hey presto, the original (fast) plan was once again chosen.

So I started taking a look at HOW the CBO had come up with the respective plans, and I did this with a 10053 trace. (Although the query was quite complicated, I've simplified it right down in this post, for clarity.) What I'll do here is explain first what the problem was, and then show how it affected the decision of the CBO...

The problem was that in the million row table, there was one row which had an incorrect value. It was a date of 13-JUL-8569. It was just wrong. Very wrong. There shouldn't have been anything higher than SYSDATE plus about a month. Here's what I saw (table and column names changed to protect the innocent!):


SQL> select to_char(max(my_date),'DD-MON-YYYY HH24:MI:SS') from my_table;

TO_CHAR(MAX(MY_DATE)
--------------------
13-JUL-8569 00:00:00

SQL> select to_char(min(my_date),'DD-MON-YYYY HH24:MI:SS') from my_table;

TO_CHAR(MIN(MY_DATE)
--------------------
03-JUL-1996 00:00:00


My_table has an index on my_date.

In the query we had the predicate WHERE my_date < SYSDATE. So without a histogram, the optimizer was assuming a uniform distribution of dates between 1996 and 8569, meaning that the likelihood of a date being < SYSDATE was pretty small. In fact, without this bad date, the range would be between 1996 and SYSDATE+30, meaning that the likelihood of a date being < SYSDATE was extremely high.

What had happened was that in sampling at 10%, we had not sampled the bad date, and hence calculated a sensible cardinality, but as soon as we sampled 100% we included the bad date and the cardinality was stupidly low. (Needless to say, the solution was to fix the bad data rather than to gamble on the sample missing it!)
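Incidentally, one way to see the extreme high value the CBO is working from is to decode the LOW_VALUE/HIGH_VALUE raw columns in the dictionary. Here's a sketch of how I might do that - I'm assuming the DATE overload of dbms_stats.convert_raw_value is available in your release, and the owner/table/column names are just the ones from my example:

set serveroutput on
declare
  l_low   dba_tab_columns.low_value%type;
  l_high  dba_tab_columns.high_value%type;
  l_date  date;
begin
  select low_value, high_value
    into l_low, l_high
    from dba_tab_columns
   where owner = 'SYS'
     and table_name = 'MY_TABLE'
     and column_name = 'MY_DATE';

  -- decode the raw optimizer values back into dates (assumed DATE overload)
  dbms_stats.convert_raw_value(l_low,  l_date);
  dbms_output.put_line('LOW : ' || to_char(l_date, 'DD-MON-YYYY'));

  dbms_stats.convert_raw_value(l_high, l_date);
  dbms_output.put_line('HIGH: ' || to_char(l_date, 'DD-MON-YYYY'));
end;
/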

To illustrate this I have recreated a simplified version below.

SQL> create table my_table
2 (my_date date, my_char char(100) );

Table created.

SQL> begin
2 for i in 1 .. 10000 loop
3 insert into my_table values (sysdate-dbms_random.value(1,1000), 'X');
4 end loop;
5 commit;
6 end;
7 /

PL/SQL procedure successfully completed.

SQL> create index test1_n1 on my_table(my_date);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>'SYS',tabname=>'MY_TABLE',cascade=>true);

PL/SQL procedure successfully completed.

SQL> update my_table set my_date=sysdate+1000000 where rownum=1;

1 row updated.

SQL> commit;

Commit complete.

SQL> set autotrace on
SQL> select * from my_table where my_date < sysdate;

<some editing here, but not of anything relevant>

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=38 Card=10000 Bytes=1090000)
1 0   TABLE ACCESS (FULL) OF 'MY_TABLE' (Cost=38 Card=10000 Bytes=1090000)


831 consistent gets


SQL> exec dbms_stats.gather_table_stats(ownname=>'SYS',tabname=>'MY_TABLE',cascade=>true);

PL/SQL procedure successfully completed.

SQL> select * from my_table where my_date < sysdate;

<some editing here, but not of anything relevant>

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=10 Bytes=1090)
1 0   TABLE ACCESS (BY INDEX ROWID) OF 'MY_TABLE' (Cost=13 Card=10 Bytes=1090)
2 1     INDEX (RANGE SCAN) OF 'TEST1_N1' (NON-UNIQUE) (Cost=3 Card=10)


10631 consistent gets


So you can see that the index range scan looks preferable to the optimizer once the bad data has been analyzed. The operation is assigned a falsely low cost, and you can see the far greater number of consistent gets actually required.

As I mentioned, the original query was quite complicated and involved joining several tables, so when I started looking at the problem it wasn't quite a "needle in a haystack", but it was, shall we say, not obvious. It was the 10053 trace which led me to the cause of the problem. I did this as follows:


SQL> alter session set events '10053 trace name context forever, level 1';

Session altered.

SQL> explain plan for select * from my_table where my_date <sysdate;

Explained.


The trace files were in my udump directory, and this is what I saw:

With the bad data excluded, the query runs fast using a full table scan -

SINGLE TABLE ACCESS PATH
Column: MY_DATE Col#: 1 Table: MY_TABLE Alias: MY_TABLE
NDV: 10000 NULLS: 0 DENS: 1.0000e-04 LO: 2452971 HI: 2453970
NO HISTOGRAM: #BKT: 1 #VAL: 2
TABLE: MY_TABLE ORIG CDN: 10000 ROUNDED CDN: 10000 CMPTD CDN: 10000
Access path: tsc Resc: 38 Resp: 37
Access path: index (scan)
Index: TEST1_N1
TABLE: MY_TABLE
RSC_CPU: 74650907 RSC_IO: 9963
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
BEST_CST: 38.00 PATH: 2 Degree: 1

With the bad data included, the query runs slowly using an index range scan -

SINGLE TABLE ACCESS PATH
Column: MY_DATE Col#: 1 Table: MY_TABLE Alias: MY_TABLE
NDV: 10000 NULLS: 0 DENS: 1.0000e-04 LO: 2452971 HI: 3453971
NO HISTOGRAM: #BKT: 1 #VAL: 2
TABLE: MY_TABLE ORIG CDN: 10000 ROUNDED CDN: 10 CMPTD CDN: 10
Access path: tsc Resc: 38 Resp: 37
Access path: index (scan)
Index: TEST1_N1
TABLE: MY_TABLE
RSC_CPU: 89153 RSC_IO: 12
IX_SEL: 9.9883e-04 TB_SEL: 9.9883e-04
BEST_CST: 13.00 PATH: 4 Degree: 1

Here you can see the falsely low computed cardinality, which for the predicate I used is NUM_ROWS * (value - low) / (high - low).
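As a rough sanity check, plugging the LO/HI Julian values from the trace above into that formula (with SYSDATE expressed as a Julian day) comes out at about the 10 rows the CBO computed. This is just a sketch, and obviously only gives that answer around the date of the trace:

select round( 10000 * (to_number(to_char(sysdate, 'J')) - 2452971)
                    / (3453971 - 2452971) ) as computed_card
  from dual;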

I don't propose to reproduce it all here, but the paper by Wolfgang Breitling entitled "A look under the hood of CBO" is a good place to start reading up on this stuff if you are not familiar with it, and of course you can't discuss this without mentioning Jonathan Lewis' excellent "Cost Based Oracle Fundamentals".

Thursday, July 27, 2006

We do not use...

I'm so excited, I just did my first ever "we do not use". I am forever indebted to Mogens for introducing me to this concept; for the uninitiated, please see www.wedonotuse.com

It was a relatively straightforward "we do not use" scenario, but I was astonished at the ease with which I was able to end an unwanted sales call whilst retaining the utmost politesse.

It related simply to a magazine subscription, to which I replied "thank you but we do not use magazines" - and that was it! End of call! Thanks Mogens.

Friday, July 21, 2006

Parallel DML / Controlfile Lock

The other day one of my production databases experienced a problem... Everything was fine until suddenly an XA transaction timed out and rolled back, holding a lock that a bunch of other sessions wanted to acquire, and causing them to time out and roll back too.

The database was behaving normally; it's just that something happened to make a transaction take fractionally longer than the application designers had specified. Of course, my job was to find out what that "something" was.

It's a complex architecture, and there were many places where the problem could have been - in the network, the middleware, the database, etc. Fingers were pointing at the database, and in my usual style I defended Oracle to the hilt. However, I needed to look into it just to prove it wasn't a database problem (sadly it turned out it was!)

In the initial investigation, application support had run the following script, which joins v$lock and v$session to show blocking and waiting sessions.

SELECT bs.username "Blocking User",
       bs.username "DB User",
       ws.username "Waiting User",
       bs.sid "SID",
       ws.sid "WSID",
       bs.sql_address "address",
       bs.sql_hash_value "Sql hash",
       bs.program "Blocking App",
       ws.program "Waiting App",
       bs.machine "Blocking Machine",
       ws.machine "Waiting Machine",
       bs.osuser "Blocking OS User",
       ws.osuser "Waiting OS User",
       bs.serial# "Serial#",
       DECODE (wk.TYPE,
               'MR', 'Media Recovery',
               'RT', 'Redo Thread',
               'UN', 'USER Name',
               'TX', 'Transaction',
               'TM', 'DML',
               'UL', 'PL/SQL USER LOCK',
               'DX', 'Distributed Xaction',
               'CF', 'Control FILE',
               'IS', 'Instance State',
               'FS', 'FILE SET',
               'IR', 'Instance Recovery',
               'ST', 'Disk SPACE Transaction',
               'TS', 'Temp Segment',
               'IV', 'Library Cache Invalidation',
               'LS', 'LOG START OR Switch',
               'RW', 'ROW Wait',
               'SQ', 'Sequence Number',
               'TE', 'Extend TABLE',
               'TT', 'Temp TABLE',
               wk.TYPE) lock_type,
       DECODE (hk.lmode,
               0, 'None',
               1, 'NULL',
               2, 'ROW-S (SS)',
               3, 'ROW-X (SX)',
               4, 'SHARE',
               5, 'S/ROW-X (SSX)',
               6, 'EXCLUSIVE',
               TO_CHAR (hk.lmode)) mode_held,
       DECODE (wk.request,
               0, 'None',
               1, 'NULL',
               2, 'ROW-S (SS)',
               3, 'ROW-X (SX)',
               4, 'SHARE',
               5, 'S/ROW-X (SSX)',
               6, 'EXCLUSIVE',
               TO_CHAR (wk.request)) mode_requested,
       TO_CHAR (hk.id1) lock_id1,
       TO_CHAR (hk.id2) lock_id2
  FROM v$lock hk,
       v$session bs,
       v$lock wk,
       v$session ws
 WHERE hk.BLOCK = 1
   AND hk.lmode != 0
   AND hk.lmode != 1
   AND wk.request != 0
   AND wk.TYPE(+) = hk.TYPE
   AND wk.id1(+) = hk.id1
   AND wk.id2(+) = hk.id2
   AND hk.sid = bs.sid(+)
   AND wk.sid = ws.sid(+)
 ORDER BY 1;

Whilst clutching at straws, they noted that in addition to the one blocking session and all the waiting sessions, there were also the following two rows (which I've edited here):

BLOCKING_USER         WAITING_USER          LOCK_TYPE     MODE_HELD  MODE_REQUESTED
--------------------  --------------------  ------------  ---------  --------------
oracle@foobar (CKPT)  oracle@foobar (ARC1)  Control FILE  EXCLUSIVE  EXCLUSIVE
oracle@foobar (CKPT)  oracle@foobar (P011)  Control FILE  EXCLUSIVE  SHARE

The question asked of me was: was it the checkpoint which caused the problem? I dismissed this idea quickly, but needed to give some evidence. It's no big surprise that a checkpoint holds an exclusive CF lock, nor was I concerned that the archiver process was waiting its turn. But I was interested in that parallel process...

Given that nothing within the application should go parallel until the batch kicks in overnight, my suspicions turned to "ad hoc" activity rather than the application - it looked a bit like someone was poking about when they shouldn't have been... or something was going parallel when it shouldn't. It wouldn't be the first time a segment had accidentally been given a degree of "DEFAULT" or greater than 1.
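For what it's worth, a quick dictionary sweep along these lines will show any tables or indexes set up to go parallel by default (just a sketch - DEGREE is stored as a string in these views, hence the TRIMs):

select owner, table_name object_name, trim(degree) degree
  from dba_tables
 where trim(degree) not in ('0', '1')
union all
select owner, index_name, trim(degree)
  from dba_indexes
 where trim(degree) not in ('0', '1')
 order by 1, 2;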

But why would it be wanting a controlfile lock? I started thinking about something like a nologging CTAS, which would need to update the unrecoverable SCN in the controlfile for the datafile in question. Then again, if it was parallel DML doing direct-path I/O there would need to be a segment-level checkpoint anyway, hence the need to access the controlfile.
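Along those lines, the unrecoverable SCN and time that a nologging operation stamps into the controlfile are exposed per datafile, so a quick check like this (a sketch) would show whether something had recently gone down the nologging route:

select file#,
       unrecoverable_change#,
       to_char(unrecoverable_time, 'DD-MON-YYYY HH24:MI:SS') unrecoverable_time
  from v$datafile
 where unrecoverable_change# > 0;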

A quick look at a statspack report for the time in question revealed all - it wasn't the application, it was someone connecting via sqlplus, and it was indeed a parallel CTAS. The controlfile lock had absolutely nothing to do with the original problem, but the sudden burst of activity to do a (relatively big) parallel CTAS was enough to make the database just unresponsive enough for the middleware to time a transaction out...

So ironically it was the CF lock held by CKPT which led me to look at what was using parallelism, which led me to the answer which was nothing to do with the CF lock! Still, it kept me busy for a while.

Friday, June 30, 2006

Keeping a history of CBO Stats

Whilst working with Doug Burns on CBO stats history - the thing he blogs about here - I cobbled together a procedure which would compare one set of saved stats with another and tell you exactly what had changed.

However, bearing in mind that the "stattab" you use wasn't really designed to be queried in this way, I couldn't find much (anything) in the way of documentation on what the columns represent, so I had to poke about and work it out for myself.

[Please note - the info below relates to version 9.2.0.7. I haven't looked into how version specific this is...]

A given column in the stattab has a different use depending on the kind of statistics. For example, for a TABLE, the column N3 represents AVG_ROW_LEN, whereas for an index, the same column N3 represents DISTINCT_KEYS.
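By way of illustration, here's a minimal sketch of creating a stattab, exporting into it and querying it back - the owner, table and stattab names are just made up for the example:

exec dbms_stats.create_stat_table(ownname => 'SCOTT', stattab => 'MY_STATTAB')

exec dbms_stats.export_table_stats(ownname => 'SCOTT', tabname => 'MY_TABLE', -
     stattab => 'MY_STATTAB', statid => 'BEFORE', cascade => true)

-- TYPE tells you which interpretation of the C/N/R/CH columns applies
select statid, type, c1, c4, n1, n2, n3
  from scott.my_stattab
 where statid = 'BEFORE'
 order by type, c1, c4;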

Here is my list, for TABLE, INDEX and COLUMN stats, of what I think the stattab columns represent. I couldn't work out what the column N3 represents when dealing with column stats; if anyone knows, I'd be very interested to hear - please post a comment. Likewise for any errors or omissions, or anything you disagree with.

Here it is:

Column  TABLE stats (TYPE=T)   INDEX stats (TYPE=I)      COLUMN stats (TYPE=C)
------  ---------------------  ------------------------  ---------------------
STATID  User-defined stat id   User-defined stat id      User-defined stat id
C1      Table_Name             Index_Name                Table_Name
C2      Null                   Null                      Null
C3      Null                   Null                      Null
C4      Null                   Null                      Column_Name
C5      Owner                  Owner                     Owner
N1      Num_Rows               Num_Rows                  Num_Distinct
N2      Blocks                 Leaf_Blocks               Density
N3      Avg_Row_Len            Distinct_Keys             ????????
N4      Sample_Size            Avg_Leaf_Blocks_Per_Key   Sample_Size
N5      Null                   Avg_Data_Blocks_Per_Key   Num_Nulls
N6      Null                   Clustering_Factor         First Endpoint_Value
N7      Null                   Blevel                    Last Endpoint_Value
N8      Null                   Sample_Size               Avg_Col_Len
N9      Null                   Pct_Direct_Access         Histogram Flag
N10     Null                   Null                      Endpoint_Number
N11     Null                   Null                      Endpoint_Value
N12     Null                   Null                      Null
R1      Null                   Null                      High_Value
R2      Null                   Null                      Low_Value
CH1     Null                   Null                      Endpoint_Actual_Value

Wednesday, June 14, 2006

A Dirty Read.

I had the misfortune to have to deal with a problem on an Informix database yesterday. The problem was to do with row-level locking, where a row was locked by a query. With my totally Oracle background, I was a little shaken by this, i.e. queries needing to lock rows. (That, and the fact that you can roll back DDL! That kind of threw me for a moment, but now I understand why some folks get a bit uncomfortable with the fact that you can't do this with Oracle!)

So anyway, after introducing the developer to the concept of a dirty read (in the nicest possible sense), there then followed a discussion between the DBAs I work with about just HOW COOL Oracle's locking mechanisms are. Clearly this puts us way into the "geek" bracket, talking about locking mechanisms instead of the World Cup, but I got Ecuador in the office sweepstake, so I can kiss my fiver goodbye.

You kind of take it for granted that if you update a million rows the commit won't take an age, but when you stop to consider why, you have to say it's quite a good design. Instead of using lock tables, the lock is registered in the row header, and the transaction in the ITL in the block header, so you can never run out of locks... And when it comes to releasing the locks, if DBWR (sorry, DBWn) has already written the block, then you just wait for it to be used again, and then determine whether the transaction committed or not. Quite slick.

OK, so you have the problem of determining the commit SCN, so that when you perform a consistent get you know which image of the block to use. What if it's not in the rollback segment header any more? Not a problem, because you don't need to know exactly when the transaction committed, just whether it committed before your snapshot SCN or not.

Still, I guess it doesn't come for free, the price we pay is having to deal with the whole rollback segments thing, although Automatic Undo Management has taken some of the fun out of this.

We all still love the opportunity of diagnosing ORA-01555 errors, and don't let anyone try to tell you they don't happen any more with AUM - I've seen enough. To be fair, though, they've mostly been down to inappropriate undo retention, so not really tricky ones. Of course, people will still insist on fetching across commits, and some will look at you strangely if you suggest committing LESS frequently, but there we are. All part of the fun.
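When one does show up under AUM, the first couple of things I tend to look at are along these lines - just a sketch, the idea being to compare the longest recent query length against the undo retention setting:

show parameter undo_retention

select to_char(begin_time, 'DD-MON HH24:MI') snapshot_start,
       maxquerylen,
       ssolderrcnt
  from v$undostat
 order by begin_time;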

Tuesday, May 23, 2006

Compression and my Oracle OCD

I've been poking around with segment compression lately. I started looking into it whilst preparing my presentation for the Scotland OUG meeting, but now I've started playing with it, I can't stop. I want to know a little bit more every time, it's like an obsessive compulsion. I should probably seek help.

I hadn't realized that columns of length less than 6 don't get compressed (until I read something Geoff Moss had written on the subject), so I just wanted to prove it to myself. I created a table with one column, a VARCHAR2(5), and populated it with 4194304 rows each containing 'XXXXX'.
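The setup was something along these lines (a sketch rather than my original script):

-- one VARCHAR2(5) column, 4194304 identical 'XXXXX' values
create table test1 (my_col varchar2(5));

begin
  for i in 1 .. 4194304 loop
    insert into test1 values ('XXXXX');
  end loop;
  commit;
end;
/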

SQL> alter table test1 move;

Table altered.

SQL> select bytes/1048576 meg, blocks from dba_segments where segment_name='TEST1';

MEG BLOCKS
---------- ----------
49.84375 6380

SQL> alter table test1 move compress;

Table altered.

SQL> select bytes/1048576 meg, blocks from dba_segments where segment_name='TEST1';

MEG BLOCKS
---------- ----------
44.859375 5742


Eh? What? I thought the columns weren't supposed to be compressed?
Why's it smaller? What happened?
A fraction of a second later I remembered that moving a table COMPRESS also has the effect of setting PCT_FREE to zero (it had previously been 10). And sure enough, the segment is now 638 blocks smaller than the original 6380 - exactly the 10% that PCT_FREE was no longer reserving. The attempted compression itself did not compress anything.

Pretty obvious, but note that altering a table back to nocompress still leaves pct_free as zero. I mean, it would, wouldn't it.
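An easy way to keep an eye on that side effect is something like this - a sketch, and I'm assuming your release's USER_TABLES exposes the COMPRESSION column alongside PCT_FREE:

select table_name, pct_free, compression
  from user_tables
 where table_name = 'TEST1';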

When I did the same thing with a VARCHAR2(6) I did indeed get the compression I expected (not fantastic - the longer the column the more benefit, so I expected a poor compression ratio). My table went from 6266 blocks to 5758.

Now I want to see what the reason is for the 6 byte limit. Logically it must simply be that the reference to the value in the symbol table becomes longer than the original data, but given my Oracle Obsessive Compulsive Disorder I now have to poke about with dumping blocks and looking at the internal structure of the symbol table, the compressed row data, and so on. Not sure how I'll get on with that, but I'll keep you posted.
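My starting point will probably be something like this - a sketch, where the file and block numbers in the dump command are placeholders to be replaced with whatever the first query returns:

select dbms_rowid.rowid_relative_fno(rowid) file#,
       dbms_rowid.rowid_block_number(rowid) block#
  from test1
 where rownum = 1;

-- substitute the file#/block# returned above
alter system dump datafile 7 block 1234;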

My apologies to anyone who stumbled upon this post looking for information about Oracle Configurator Developer, by the way.

Wednesday, May 10, 2006

Relieved it's over!

I was stressing out quite a lot about presenting at the OUG Scotland event - never having done anything like that before. It didn't help that Doug Burns was winding me up very effectively at every possible opportunity ;-) It was he who talked me into presenting in the first place, and I think he took great delight in seeing me panicking as the date drew closer!

Seriously though, thanks to Doug and Peter Robson for their encouragement and support. It was nerve-wracking to do, but I'm glad I did it. Feedback from the floor was positive (or were people just being kind?!) and will definitely spur me on to do it again - stuff like "thanks, I learned something today from your presentation", which is all I could have hoped for.

Monday, May 08, 2006

OUG Scotland

Looking forward to the UKOUG event in Edinburgh tomorrow. I've been "roped in" to giving a presentation, and I must say, although I'm looking forward to it, it's the first time I've done anything like that and I'm rather nervous. Especially given the other speakers, all of whom are highly experienced, highly respected members of the Oracle community.

I was looking at data compression for some data warehouse data, and also for some historic partitioned data which needed to be kept online for regulatory reasons but would scarcely be queried, never mind modified. So when I was asked to speak at the UKOUG event I thought "fine, I'll talk about Data Compression". It was only when I was most of the way through putting my presentation together that I found a bunch of other presentations knocking around on the web on the same topic. So I hope I'm not going to bore my audience with stuff they already know... Looks like the subject has already received a lot of attention. Oh well... always room for one more.

I'm going to keep the presentation really simple - it's not a Paper after all. So I'm just going to explain the basic concepts and show some results of some testing I did, and the conclusions I drew. I hope it goes down OK - if you are attending, don't give me a hard time!

Monday, May 01, 2006

First Steps

Bear with me. I appreciate that if this posting is successful it will be pitifully dull to read, but I'm still getting things sorted out with this new blog, and I'm really only posting this to "see what happens"...

Tuesday, April 25, 2006

Another Oracle Blog Is Born

My name is Simon Kelsey and I've been working with Oracle databases for about 14 years now. I'm new to this whole blogging thing, but I have been persuaded by a fellow DBA and keen blogger that I should start a blog and record for the community anything of interest that I come across.

I'm not sure how this is going to go, but I'm pretty hopeful I'll have at least a few things to share. I'm going to leave this initial post on whilst I set up some other bits and pieces, and then I'll start in earnest...