From: Stan Brown on
In <1126284723.425815.37570(a)o13g2000cwo.googlegroups.com> jkstill(a)gmail.com writes:

>1. yes, no rows returned shows that no tablespaces were in backup
>mode.
>2. starting the database should be fine.
>3. you should probably make a good backup, for your own peace of mind.
>A hot backup will be fine.

>There appears to be a lot of information on that particular ORA-600.

>The suggestions given seem like a good place to start.

>You may want to take a look at IO contention, but the fact that your
>control file was unavailable for 15 minutes and caused the instance to
>crash suggests you may want to discuss this with the system
>administrator as well.

That would be me :-)

And actually I do more of that, so I'm stronger there than on the Oracle
side.


>Faulty HW could cause this. As you are on 7.3.4 there is reason to
>suspect that this may be old HW as well. Is that the case?

Well, I suppose it is old, but it's reliable hardware, and has a lot of
online diagnostics running on it, which have been very accurate over the
years.

My suspicion is that we had network problems, and the machine got tied up
trying to access an NFS mount that it could not connect to. Does this make
any sense? I'm struggling with this theory, as it does not seem to me that
this should prevent access to a local disk file.


--
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin
From: jkstill on
It does not seem likely that an unavailable NFS mount would cause the
controlfile on a local disk to be unavailable.

I have however learned to never rule anything out. :)

From: Umberto on
Stan Brown wrote:
> Oracle version 7.3.4.5.0 on HP-UX 10.20 (yes I know it's ancient :-)
>
> Last night (perhaps during a hot backup, I'm not certain yet), our Oracle
> instance came to a halt. The trace file has this in it:
>
> ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900],
> [], [], []
> ORA-00447: fatal error in background process
> ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900],
> [], [], []
>
> I managed to connect with svrmgrl and do a "shutdown abort". Now
> I'm working on getting a complete set of cold backups of the system as is.
>
> I've looked at the hardware, and I cannot find any problems with it. The
> hot backup, and a subsequent dump are sent to a remote machine which is
> mounted via NFS. I see errors in dmesg about NFS server timeouts, so I strongly
> suspect this is the root cause of the disaster.
>
> I've got several questions for the gurus here.
>
> 1. Besides a complete set of backups, is there anything else I should do to
> prepare for a recovery attempt?
>
> 2. What steps should I plan on taking for the recovery attempt?
>
> 3. Is there any _safe_ way to find out if any tablespaces were left in
> backup mode, prior to getting the backups? I ask this because it's likely
> to take 2 to 3 days to get the backups using the methodology I'm familiar
> with, and I really don't think this is the time to depend on an untested
> backup technique.
>
> I welcome any suggestions.
>
> Thanks.
>

It may be that a filesystem containing a controlfile has
become inaccessible. Maybe, for safety reasons, a controlfile
is placed on a different filesystem and that one has been
unmounted (for backup or something else), or communication with
that storage area has been lost.

Before starting up the instance, I'd check where the
controlfiles are (initSID.ora file). Then I'd make a backup
copy of each one and compare them to see whether any
controlfile is not up to date. Maybe the inaccessible
controlfile has not been written to, causing the instance to
hang. So you may receive an error message when starting it up
because that controlfile is not up to date.
In this case you should copy one of the good controlfiles
over the bad one, but it would be better to back up the whole
instance first, just to be sure... ;-)

In this case you can easily return to the starting point.

Remember, if you can, back up before starting recovery too.
If you make a mistake, you can start again...
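
The controlfile check described above (find the copies listed in
initSID.ora, then compare them) could be sketched roughly as follows.
This is only an illustration under several assumptions: the instance is
shut down, the backup copies are readable from a machine that has
Python 3, and the multiplexed copies are expected to be byte-identical
when all of them are current. The function names parse_control_files,
file_digest and group_by_content are hypothetical helpers, not part of
any Oracle tool.

```python
import hashlib
import re


def parse_control_files(init_ora_text):
    """Return the paths listed in the control_files parameter of init.ora text."""
    # Drop '#' comments so a commented-out control_files line is ignored.
    lines = [line.split("#", 1)[0] for line in init_ora_text.splitlines()]
    text = "\n".join(lines)
    # The value is either a parenthesised list (possibly spanning several
    # lines) or a plain comma-separated list on one line.
    match = re.search(
        r"^\s*control_files\s*=\s*(\([^)]*\)|[^\n]*)",
        text,
        re.IGNORECASE | re.MULTILINE,
    )
    if match is None:
        return []
    value = match.group(1).strip().strip("()")
    # Assumption: paths contain no spaces or commas.
    parts = re.split(r"[,\s]+", value)
    return [p.strip("'\"") for p in parts if p.strip("'\"")]


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(paths):
    """Map each content digest to the list of paths that share it."""
    groups = {}
    for path in paths:
        groups.setdefault(file_digest(path), []).append(str(path))
    return groups
```

If group_by_content() returns more than one group for the backup
copies, at least one copy differs from the others and deserves a closer
look before startup. The digests alone do not say which copy is the
current one.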

Umberto