Job hang with Wait Event: log file switch (private strand flush incomplete)
If only someone would give us a buck for each bug that we “(re)discover” in 10g (10.2.0.x, to be precise). I wonder if 10g (R1 & R2) is (was) really a production-level product at all. Sometimes I feel that working with 10g is very much like working with a beta product, just an expensive one :-(.
I can only hope that Oracle will keep patching 10g R2 to at least patchset 10.2.0.8. Yeah, I know, there is zero chance of that happening. I guess we’ll have to live with tons of bugs for several years.
Today, I had to kill a hung job that had spent hours waiting on the event: log file switch (private strand flush incomplete).
Perhaps you have noticed the occasional “Private Strand Flush Not Complete” message in alert.log. It can be safely ignored, as described in Metalink note 372557.1, Alert Log Messages: Private Strand Flush Not Complete. The message can, for example, appear after you manually switch the log file.
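For reference, a minimal sketch of how sessions stuck on this event can be spotted. It assumes you have SELECT access to v$session_wait; the LIKE pattern is my own choice and will also match the other “log file switch” variants.

```sql
-- Sessions currently waiting on any "log file switch" event,
-- longest waiters first.
SELECT sid,
       event,
       state,
       seconds_in_wait
FROM   v$session_wait
WHERE  event LIKE 'log file switch%'
ORDER  BY seconds_in_wait DESC;
```

A session that stays on “log file switch (private strand flush incomplete)” with seconds_in_wait climbing into the thousands is the pattern described here.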
Our scenario is different:
- an application developer started a job via dbms_job
- while that job was running, the daily RMAN incremental backup started and finished (in approx. 10 minutes). Part of the backup job is a log switch, a detail that I think is worth mentioning
- the RMAN job completed successfully, but the other job simply hung, waiting on the above event for several more hours. I could not confirm that the hang began at the exact time when the log switch that is part of the daily RMAN backup kicked in
- just for the record, everything was OK with the archiver and the I/O subsystem
I’m not implying that the log switch that was part of the RMAN job “locked” the other process running the job. It’s just a fact that at the time of the hang only the job submitted with dbms_job and the RMAN backup were active. Perhaps it’s just a coincidence and those two events are not related at all!?
Anyway, a quick search on Metalink revealed a recently filed bug that resembles our case very closely:
6806770 LGWR SPINS WHEN OTHER PROCESSES ARE WAITING FOR ‘LOG FILE SWITCH’
What worries me is that the bug is somehow connected with the bag of 10g so-called “new features behind the scenes”. One such feature is In Memory Undo (IMU), and the only workaround proposed is to disable IMU by setting _in_memory_undo = FALSE.
How unfortunate is that? I was just recently reading an excellent white paper written by Craig Shallahamer about IMU.
For now, I have decided not to turn IMU off, but if the problem persists then I’m afraid we’ll have to. (It’s becoming a kind of folklore: get to know the cool new features, then turn them off and wait until they’re debugged :-).
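If it does come to that, here is a rough sketch of how the workaround could be checked and applied. It is not a recommendation. It assumes you are connected as SYS (the x$ fixed tables are not visible otherwise), and whether the parameter can be changed without a restart is my assumption; hidden parameters should only be touched with Oracle Support’s blessing.

```sql
-- Current value of the hidden parameter (run as SYS).
SELECT i.ksppinm  AS name,
       v.ksppstvl AS value,
       i.ksppdesc AS description
FROM   x$ksppi  i,
       x$ksppcv v
WHERE  i.indx = v.indx
AND    i.ksppinm = '_in_memory_undo';

-- Workaround proposed in the bug: disable In Memory Undo.
-- Hidden parameter names must be double-quoted.
ALTER SYSTEM SET "_in_memory_undo" = FALSE;
```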