Since 500GB are more than 400GB, we needed to understand who had generated all of this redo, and whether it may reoccur. Of course, we found that this was done over two months ago! I immediately mentioned that we would have to utilize the Oracle AWR to tell us who and/or what had done this. Thankfully, we retain 90 days of AWR with half hour snapshots.
Confidently, I strode back to my desk and begin to crack through the DBA_HIST views. To my surprise, I found no statistic that would tell me who was plowing through our redo logs like a creature in the movie "Tremors".
I did find a way to roughly identify this information, however, that I hope will be beneficial to you. I first checked DBA_HIST_SQL_STAT for any statement that had a far greater number of executions for the period in question, than they did on average.
I found none.
I then looked into the dba_hist_seg_stat view, and found the PHYSICAL_WRITES_DELTA column. Ah hah! I had my answer. To my dismay, I found that this didn’t have the information, either.
After mulling around for another hour or so, I went back to the dba_hist_seg_stat view. It was then I noticed the db_block_changes_delta column. I then identified the segment that experienced the most changes during the period, using what is below…
select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
dhsso.object_name,
sum(db_block_changes_delta)
from dba_hist_seg_stat dhss,
dba_hist_seg_stat_obj dhsso,
dba_hist_snapshot dhs
where dhs.snap_id = dhss.snap_id
and dhs.instance_number = dhss.instance_number
and dhss.obj# = dhsso.obj#
and dhss.dataobj# = dhsso.dataobj#
and begin_interval_time
between to_date('2008_05_02 12','YYYY_MM_DD HH24')
and to_date('2008_05_04 12','YYYY_MM_DD HH24')
group by to_char(begin_interval_time,'YYYY_MM_DD HH24:MI')
The answer immediately jumped out at me. It was a table that I recognized, but I didn’t know why it would bubble to the top of the list.
select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
sum(db_block_changes_delta)
from dba_hist_seg_stat dhss,
dba_hist_seg_stat_obj dhsso,
dba_hist_snapshot dhs
where dhs.snap_id = dhss.snap_id
and dhs.instance_number = dhss.instance_number
and dhss.obj# = dhsso.obj#
and dhss.dataobj# = dhsso.dataobj#
and dhsso.object_name = 'OT_PENDING'
group by to_char(begin_interval_time,'YYYY_MM_DD HH24:MI')
I found that 90% of the block changes over a three month period *all* occurred in the day and half window I was examining. Now we were really on to something!
I gathered all SQL statements that were run during the period that referenced the table, using what is below…
select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'),
dbms_lob.substr(sql_text,4000,1),
dhss.instance_number,
dhss.sql_id,executions_delta,rows_processed_delta
from dba_hist_sqlstat dhss,
dba_hist_snapshot dhs,
dba_hist_sqltext dhst
where upper(dhst.sql_text) like '%OT_PENDING%'
and dhss.snap_id=dhs.snap_id
and dhss.instance_Number=dhs.instance_number
and dhss.sql_id = dhst.sql_id
I found a statement that was being executed an average number of times, relatively speaking. However, it was updating millions of rows, whereas it normally would update hundreds, if that.
I took the SQL_ID for the statement and searched through DBA_HIST_ACTIVE_SESS_HISTORY, using what is below...
select instance_number, to_char(sample_time,'yyyy_mm_dd hh24:mi:ss'),
user_id,
program
from dba_hist_active_sess_history
where sql_id = 'bxjz5d4p6x02u'
and snap_id between 43212 and 43220
I found that a SQL*PLUS session from a development server was executing the statement, and the execution times exactly mirrored our jump in redo generation. Now I won’t even go into the lack of processes in this shop and how often developers directly access this database, as that is a topic for a different post.
However, the information for which we were searching had been found!
No comments:
Post a Comment