From: joel garry on
On Mar 10, 1:08 pm, Mladen Gogala <n...(a)email.here.invalid> wrote:
> On Wed, 10 Mar 2010 09:02:33 -0800, joel garry wrote:
> > Performance of a crashed instance is always the worst degradation.
>
> I beg to differ. If the database crashes, all your queries finish
> instantly. I would even suggest that database crash is the ultimate thing
> in application tuning.
>
> --http://mgogala.byethost5.com

Depends what your definition of is is. I've seen apps that don't tell
the user the db has gone away until they do some more input or tcp
times out and they get a "server connection lost" error. Though the
more common issue is they X out the client because they entered too
broad of a filter or are otherwise impatient, and happily start
another, blissfully unaware they are giving the server a stress-test
workout.

But of course, worse degradation is obscure data corruption over a
period of time. So maybe a crash isn't so bad, at least it's
recoverable. Unless of course you have some weirdo virtualization
that lies to Oracle about having written redo.

Note to John, about this group: http://dbaoracle.net/readme-cdos.htm

Welcome!

jg
--
@home.com is bogus.
A snarky answer I managed to forbear on forums:
> Our oracle server *SYS(as sysdba)* user *logon by given any password*.
>
> We are shocking and need to be arrested this issue immediately as its very danger.
> I have altered the user "SYS" with new password. Eventhough, still its logon by using any password.
>
> Kindly guide / help me to address this issue ASAP.
>
> Thanks,
> Orahar.

I agree, your DBA needs to be arrested immediately before cardiac
damage ensues.
From: vsevolod afanassiev on
Please confirm that both instances are running from the same
ORACLE_HOME.
If they are not, then the issue is version-related or
installation-related.

Assuming that both instances are run from the same ORACLE_HOME:

I see two possibilities:
1. Instance is killed by something external, similar to someone doing
'kill -9 <pid of LGWR>'
2. Instance dies

1. Instance is killed by something external

There are two instances on the server, but only one experiences this
problem, correct? Are instances in any way different? For example,
instance A has 3 GB SGA while instance B has 10 GB SGA?
If they are different then try to make them identical, make sure that
all init.ora parameters are the same (with obvious exceptions - things
like control_files, background_dump_dest). If possible try to achieve
it by REDUCING values, not increasing them. Once this is done we
could expect two outcomes:
- The crashes will stop. It is possible that "something" was killing
the instance because it was too big. Once it is made smaller it will no
longer get killed.
- The crashes will affect both instances. This indicates that the
killing wasn't based on size.
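
A rough sketch of that comparison in Python, assuming each instance's parameters have been dumped to a plain name=value text file (for example a pfile). The function names and the list of parameters that are expected to differ are my own assumptions, not anything Oracle ships, and multi-line values are not handled.

```python
# Hypothetical helper: compare two init.ora-style parameter dumps.
# Parameters that are expected to differ between instances
# (an assumed list; extend it for your own setup).
EXPECTED_TO_DIFFER = {
    "control_files", "background_dump_dest", "user_dump_dest",
    "core_dump_dest", "db_name", "instance_name",
}


def parse_params(text):
    """Parse 'name=value' lines into a dict; skip blanks and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # pfile entries may carry a 'sid.' or '*.' prefix; drop it.
        name = name.strip().lower().split(".")[-1]
        params[name] = value.strip()
    return params


def diff_params(a, b, ignore=EXPECTED_TO_DIFFER):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ."""
    diffs = {}
    for name in sorted(set(a) | set(b)):
        if name in ignore:
            continue
        if a.get(name) != b.get(name):
            diffs[name] = (a.get(name), b.get(name))
    return diffs


if __name__ == "__main__":
    # Made-up values echoing the 3 GB vs 10 GB example above.
    inst_a = parse_params("*.sga_max_size=3G\n*.processes=150\n")
    inst_b = parse_params("*.sga_max_size=10G\n*.processes=150\n")
    for name, (va, vb) in diff_params(inst_a, inst_b).items():
        print(f"{name}: A={va} B={vb}")
```

Whatever it reports is the list of things to bring into line, preferably by lowering the larger value.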

2. Instance dies

Oracle uses the following facilities provided by the OS:
- CPU
- memory
- disk
- IPC facilities (shared memory and semaphores on Solaris)

CPU is unlikely to disappear, and IPC facilities are allocated at
startup, so most likely the issue is either memory-related or
disk-related. As this is LGWR, a disk issue seems more likely.
Do you have dedicated filesystems for each instance, or are the
filesystems shared?
Which filesystems are you using: UFS, ZFS, Veritas? Do you use
something fancy like ODM (Oracle Disk Manager)?

Finally: do you use any fancy Solaris 10 stuff like zones/containers?

- - - - - - - - -

I had a similar issue on Tru64 several years ago, very frustrating.
The database had a nightly cold backup. Once or twice a month the
instance would start in a corrupted state - it was possible to connect
to it but not to run any SQL. Oracle Support pointed to a bug where
the instance gets corrupted on startup if something tries to connect to
it in the brief moment between the 'startup' command and the 'Oracle
instance started' message (just a second or two). It had to be a SYSDBA
connection, and it was happening at 6am. What could possibly do that?
Eventually it was traced to a UNIX script provided by DEC. It took
several months to locate.
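
A fixed time of day like that 6am is worth checking for in the ORA-00470 case as well. Below is a minimal Python sketch that pulls the timestamp of each ORA-00470 entry out of an alert log and counts crashes by hour, so anything that lines up with a cron job or backup window stands out. It assumes the 10g-style alert log layout, where a timestamp line such as "Wed Mar 10 06:00:01 2010" precedes the messages it belongs to; the function names are mine.

```python
from collections import Counter
from datetime import datetime

# Assumed 10g alert log timestamp layout, e.g. "Wed Mar 10 06:00:01 2010".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def crash_times(lines, marker="ORA-00470"):
    """Return the last timestamp seen before each entry containing marker."""
    last_ts = None
    found = []
    for raw in lines:
        line = raw.strip()
        try:
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line, treat it as a message line
        if marker in line and last_ts is not None:
            # Count repeated marker lines under one timestamp only once.
            if not found or found[-1] != last_ts:
                found.append(last_ts)
    return found


def crashes_by_hour(times):
    """Count crashes per hour of day (0-23)."""
    return Counter(t.hour for t in times)


if __name__ == "__main__":
    import sys
    # Usage: python crash_hours.py /path/to/alert_SID.log
    with open(sys.argv[1], errors="replace") as f:
        for hour, n in sorted(crashes_by_hour(crash_times(f)).items()):
            print(f"{hour:02d}:00  {n} crash(es)")
```

If the crashes cluster around one hour, look at what else runs on the box at that time; if they are spread evenly, a scheduled job is a less likely culprit.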


From: Steve Howard on
On Mar 10, 7:07 am, Johne_uk <edg...(a)tiscali.co.uk> wrote:
> Hi,
>
> I am currently running two Oracle 10G instances from a single Solaris
> M4000 server. Every few days one of the instances crashes with the
> following error and has to be restarted.
>
> ORA-00470: LGWR process terminated with error
> PMON: terminating instance due to error 470
>
> I have spend weeks running various trace files etc with Oracle support
> and they are basically clueless as to what is the cause. They are
> saying it is a Solaris OS issue but surely this would affect both
> instances and not just one.
>
> Essentially something is killing the LGWR process and the instance is
> shutting itself down. I think the way ahead is to try and find out
> what is killing this process but I'm not sure how to go about this and
> worried that any logging may degrade server performance.
>
> Can anybody offer any suggestions ?
>
> Thanks in advance
> John

You need to ask for a better analyst, and that isn't being facetious.
The ability to predict the quality of the analyst you can get at
Oracle support is about as reliable as their support site.

Seriously, ask for the SR to be escalated/duty managed.
From: Alberto Frosi on
On 10 Mar, 13:07, Johne_uk <edg...(a)tiscali.co.uk> wrote:
> Hi,
>
> I am currently running two Oracle 10G instances from a single Solaris
> M4000 server. Every few days one of the instances crashes with the
> following error and has to be restarted.
>
> ORA-00470: LGWR process terminated with error
> PMON: terminating instance due to error 470
>
> I have spend weeks running various trace files etc with Oracle support
> and they are basically clueless as to what is the cause. They are
> saying it is a Solaris OS issue but surely this would affect both
> instances and not just one.
>
> Essentially something is killing the LGWR process and the instance is
> shutting itself down. I think the way ahead is to try and find out
> what is killing this process but I'm not sure how to go about this and
> worried that any logging may degrade server performance.
>
> Can anybody offer any suggestions ?
>
> Thanks in advance
> John

For a better analysis I would ask, for example: is it always the LGWR
process of the same instance that dies, or is it random?
Otherwise there could be a hundred variables, on the OS side or the DB side.