From: db2admin on
Hi,

I am setting up an HACMP cluster. I have installed DB2 on both nodes
and created both the admin and regular instances on the primary node.
I have configured the HACMP cluster and added the resources. I now
need to change the hostname in the instance's db2nodes.cfg to the
service IP alias (say srv1), which is just another resolvable hostname
pointing to the same host as the original hostname (say hostname1).
After I change db2nodes.cfg, I cannot start DB2. I also cannot change
the DB2SYSTEM registry variable to the service alias with db2set. I get
the following error when I try to change DB2SYSTEM:
-------------------------------------------------------------------------------------------------------------------------------------------
DBI1309E System error.

Explanation:

The tool encountered an operating system error.

User response:

A system error was encountered during registry access. Ensure that there
is enough space on the file system where the registry is located, and
that there is a valid LAN connection if the registry is remote.
-------------------------------------------------------------------------------------------------------------------------------------------
I get the following error when I try to start DB2 after changing the
original hostname in db2nodes.cfg to the service alias:
-------------------------------------------------------------------------------------------------------------------------------------------
$ db2start
06/18/2010 15:42:09 0 0 SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
SQL1032N No start database manager command was issued. SQLSTATE=57019
-------------------------------------------------------------------------------------------------------------------------------------------
db2diag.log shows this:
-------------------------------------------------------------------------------------------------------------------------------------------
2010-06-18-15.42.08.088452-300 I236358A432 LEVEL: Event
PID : 1249360 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:1100
DATA #1 : String, 55 bytes
/db2inst1/db2inst1/sqllib/adm/db2rstar db2profile SN 0 0
DATA #2 : Hexdump, 4 bytes
0x0FFFFFFFFFFF7CB4 : 0000 0010 ....

2010-06-18-15.42.09.175928-300 E236791A612 LEVEL: Error
PID : 1249360 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, oper system services, sqloPdbInitializeRemoteCommand, probe:110
MESSAGE : ZRC=0x810F0012=-2129723374=SQLO_COMM_ERR "Communication error"
DATA #1 : String, 204 bytes
The remote shell program terminated prematurely. The most likely
causes are either that the DB2RSHCMD registry variable is set to an
invalid setting, or the remote command program failed to authenticate.
DATA #2 : String, 12 bytes
/usr/bin/rsh

2010-06-18-15.42.09.176368-300 E237404A477 LEVEL: Error
PID : 1249360 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, oper system services, sqloPdbInitializeRemoteCommand, probe:200
MESSAGE : ZRC=0x810F0012=-2129723374=SQLO_COMM_ERR "Communication error"
DATA #1 : String, 8 bytes
hostname1
DATA #2 : String, 4 bytes
srv1
DATA #3 : String, 45 bytes
[files]: Your encrypted password is invalid.

2010-06-18-15.42.09.176516-300 I237882A291 LEVEL: Event
PID : 1249360 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:80
DATA #1 : signed integer, 4 bytes
-6048




From: Joachim Müller on
On Jun 19, 12:48 am, Helmut Tessarek <tessa...(a)evermeet.cx> wrote:
> > change the hostname value in db2nodes.cfg of instance to service IP
> > alias ( say srv1 ) which is nothing but another resolvable hostname
>
> I think you mean the virtual IP address which is failing over when one node
> goes down, right?
> service ip addresses are usually used to connect directly to one of the nodes.
>
> You need to make sure that the reverse lookup points to the correct IP.
> Please add the virtual IP to your hosts file. Furthermore you have to change
> your .rhosts file to include the new hostname/ip.
>
> Here are the correct steps:
>
> 1. install DB2 on both machines, with their physical machine names and
>    service ip addresses.
> 2. add virtual hostname / ip address to your resource group or use a new
>    resource group (depending on your env)
> 3. change the entries in db2nodes.cfg to the virtual hostname
> 4. add virtual hostname / ip address to /etc/hosts
> 5. add the virtual addresses to your .rhosts file
>
> that should do it.
>
> --
> Helmut K. C. Tessarek
> DB2 Performance and Development
>
> /*
>    Thou shalt not follow the NULL pointer for chaos and madness
>    await thee at its end.
> */

Hello Helmut,

A better approach than using .rhosts is to build an ssh environment ;-)

http://www.ibm.com/developerworks/db2/library/techarticle/dm-0506finnie/index.html

Best regards,
Joachim
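
For anyone following the ssh route: the switch itself is the DB2RSHCMD
registry variable that the db2diag.log excerpt earlier in the thread
mentions. Below is a minimal sketch, assuming ssh lives at /usr/bin/ssh
and that passwordless ssh between the nodes already works for the
instance owner. The helper name rshcmd_setting is made up for
illustration, and it only prints the db2set command so it can be
reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper, not part of DB2: build the db2set command that
# would make DB2 use ssh instead of rsh for remote commands.
# Usage: rshcmd_setting [path-to-ssh]
rshcmd_setting() {
    # Use the given path, or fall back to whatever ssh is on the PATH.
    ssh_path=${1:-$(command -v ssh)}
    if [ -z "$ssh_path" ]; then
        echo "ssh not found" >&2
        return 1
    fi
    # Print only; run the printed command as the instance owner.
    echo "db2set DB2RSHCMD=$ssh_path"
}
```

Calling rshcmd_setting /usr/bin/ssh prints the command to run as the
instance owner; the instance normally has to be restarted before a
registry change takes effect.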
From: db2admin on
On Jun 21, 4:19 am, Helmut Tessarek <tessa...(a)evermeet.cx> wrote:
> Hi Joachim,
>
> > A better approach than using .rhosts is to build a ssh environment;-)
>
> >http://www.ibm.com/developerworks/db2/library/techarticle/dm-0506finn...
>
> You are right, I usually use that approach too, but db2admin did not mention
> anything about ssh, so I assumed that ssh was not installed. For some reason
> there are still AIX admins out there who do not install an ssh server on their
> boxes.

Thanks, everyone.
I have added the following entries to ~db2inst1/.rhosts,
~db2inst1/sqllib/db2nodes.cfg, and /etc/hosts. Consider node1 and
node2 as the nodes.

$ cat ./sqllib/db2nodes.cfg
0 srv1 0

$ cat .rhosts
node1 db2inst1
node2 db2inst1
srv1 db2inst1

$ cat /etc/hosts
------------------- lines omitted -------------------
IPaddress srv1
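
The three files above have to agree with each other, so a quick
cross-check can save a round of db2start attempts. This is only a
sketch under assumptions: the function name check_node_hosts is
invented here, it assumes the hostname is the second column of
db2nodes.cfg (as in the listing above), and it only looks at the files
it is given, not at DNS.

```shell
#!/bin/sh
# Hypothetical helper: check_node_hosts NODES_CFG RHOSTS HOSTS_FILE USER
# For every hostname in column 2 of NODES_CFG, verify that RHOSTS has a
# "<host> <user>" line and that HOSTS_FILE lists the host.
# Prints one MISSING line per problem and returns 1 if any were found.
check_node_hosts() {
    nodes_cfg=$1
    rhosts=$2
    hosts_file=$3
    user=$4
    status=0
    for host in $(awk 'NF >= 2 { print $2 }' "$nodes_cfg" | sort -u); do
        # .rhosts: first field is the host, second is the user.
        if ! awk -v h="$host" -v u="$user" \
            '$1 == h && $2 == u { found = 1 } END { exit !found }' \
            "$rhosts"; then
            echo "MISSING rhosts: $host $user"
            status=1
        fi
        # hosts file: address first, then names; stop at a comment.
        if ! awk -v h="$host" '
            $1 ~ /^#/ { next }
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break
                    if ($i == h) found = 1
                }
            }
            END { exit !found }' "$hosts_file"; then
            echo "MISSING hosts: $host"
            status=1
        fi
    done
    return $status
}
```

Run as the instance owner, for example:
check_node_hosts ~/sqllib/db2nodes.cfg ~/.rhosts /etc/hosts db2inst1
No output and a zero exit status mean the three files are consistent.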

I still get a communication error when I try to start db2inst1:
SQL6048N A communication error occurred during START or STOP DATABASE
MANAGER processing.

When I run db2set -all, I see that DB2SYSTEM is set to the hostname.
Does that need to be changed to the service alias? I tried changing it,
but I get the following error:
----------------------------------------------------------------------------------------------------------------------
DBI1309E System error.

Explanation:

The tool encountered an operating system error.

User response:

A system error was encountered during registry access. Ensure that there
is enough space on the file system where the registry is located, and
that there is a valid LAN connection if the registry is remote.
----------------------------------------------------------------------------------------------------------------------

db2diag.log looks like the following after issuing db2start:

------------------------------------------------------------------------------------------------------------------------------------
2010-06-22-10.32.58.652403-300 I158160A432 LEVEL: Event
PID : 1089586 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:1100
DATA #1 : String, 55 bytes
/db2inst1/db2inst1/sqllib/adm/db2rstar db2profile SN 0 0
DATA #2 : Hexdump, 4 bytes
0x0FFFFFFFFFFF7CB4 : 0000 0010 ....

2010-06-22-10.32.59.653023-300 I158593A686 LEVEL: Severe
PID : 1089586 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:1105
DATA #1 : String, 12 bytes
host is down
DATA #2 : String, 55 bytes
/db2inst1/db2inst1/sqllib/adm/db2rstar db2profile SN 0 0
DATA #3 : String, 4 bytes
whtp
DATA #4 : Hexdump, 4 bytes
0x0FFFFFFFFFFF7C30 : 0000 0010 ....
DATA #5 : Hexdump, 24 bytes
0x0FFFFFFFFFFF7C38 : 0000 0001 1007 9698 0000 0000 0000 0000 ................
0x0FFFFFFFFFFF7C48 : 0000 0000 0000 044C .......L

2010-06-22-10.32.59.653273-300 I159280A291 LEVEL: Event
PID : 1089586 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleIssueStartStop, probe:80
DATA #1 : signed integer, 4 bytes
-6048

2010-06-22-10.32.59.654359-300 E159572A446 LEVEL: Error (OS)
PID : 1089586 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, oper system services, sqloPdbCloseSocket, probe:5
MESSAGE : ZRC=0x870F0025=-2029060059=SQLO_INVH "invalid file handle"
DIA8534C An invalid file handle was encountered.
CALLED : OS, -, close
OSERR : EBADF (9) "Bad file number"

2010-06-22-10.32.59.654868-300 I160019A385 LEVEL: Error
PID : 1089586 TID : 1 PROC : db2start
INSTANCE: db2inst1 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, oper system services, sqloPdbFreeCmdHandle, probe:15
DATA #1 : String, 30 bytes
Error closing socket, OSSrc =
DATA #2 : signed integer, 4 bytes
0
DATA #3 : signed integer, 4 bytes
0


------------------------------------------------------------------------------------------------------------------------------------------------------------------
From: db2admin on
On Jun 22, 2:40 pm, Helmut Tessarek <tessa...(a)evermeet.cx> wrote:
> Hi,
>
> You still seem to have problems with the reverse lookup.
>
> What is the output of:
>
> # host srv1
> srv1.domain.com has address a.b.c.d
>
> # host a.b.c.d
> is the output srv1?
>
> You can also try to change your /etc/hosts file to include the following:
>
> ip_of_node1    node1    node1.domain.com    srv1
>
> Please also check out the following redbook: http://www.redbooks.ibm.com/redbooks/pdfs/sg247363.pdf
>
> There are so many ways to configure DB2 in an HACMP environment and your
> description is not very detailed.
>
> What filesystems are in the resource group? Which filesystems are moved to the
> other node in case of a failover?
> What is your virtual IP / hostname? Do you move the instance directory as well
> or just the database directory?
> Are you using rsh or ssh? Do you want to use ssh or are you ok with just rsh?
> What is the output of 'db2set -all' and 'db2level'?
>
> Without a sufficient description of the environment, everything is just a best
> guess.
>
> On 22.6.2010 11:37, db2admin wrote:
>
>
>
> > I have added following entries in files like ~db2inst/.rhosts ,
> > ~db2inst1/sqllib/db2nodes.cfg , and /etc/hosts . consider node1 and
> > node2 as nodes
>
> > $ cat ./sqllib/db2nodes.cfg
> > 0 srv1 0
>
> > $ cat .rhosts
> > node1 db2inst1
> > node2 db2inst1
> > srv1 db2inst1
>
> > $ cat /etc/hosts
> > ------------------- lines omitted -------------------
> > IPaddress    srv1

Hi,

Thank you so much for all the help.
I realized what the problem is: I have to start HACMP services first,
since they add the service alias to the network. My service alias was
not up yet, which is why the DB2 instance was complaining. All of those
host commands work and return the expected results.
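
In other words, a name that resolves is not the same as an address that
is actually configured on an interface. A rough sketch of checking both,
with some assumptions: the function names are invented for this post,
resolution is done against a hosts-format file only (the real resolver
may also consult DNS), and the interface listing is assumed to contain
"inet <address>" entries the way ifconfig -a prints them on AIX.

```shell
#!/bin/sh
# All helper names here are hypothetical, for illustration only.

# resolve_in_hosts NAME HOSTS_FILE
# Print the address of the first non-comment line that lists NAME.
resolve_in_hosts() {
    awk -v n="$1" '
        $1 ~ /^#/ { next }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break
                if ($i == n) { print $1; exit }
            }
        }' "$2"
}

# addr_is_configured ADDR
# Reads interface output on stdin; succeeds if a line has "inet ADDR".
addr_is_configured() {
    awk -v a="$1" '$1 == "inet" && $2 == a { found = 1 }
                   END { exit !found }'
}

# alias_is_up NAME HOSTS_FILE
# Reads interface output on stdin. Succeeds only if NAME resolves and
# its address is present on some interface.
alias_is_up() {
    addr=$(resolve_in_hosts "$1" "$2")
    if [ -z "$addr" ]; then
        echo "$1: does not resolve"
        return 1
    fi
    if addr_is_configured "$addr"; then
        echo "$1 ($addr): configured on an interface"
    else
        echo "$1 ($addr): resolves, but no interface has this address"
        return 1
    fi
}
```

Typical use would be: ifconfig -a | alias_is_up srv1 /etc/hosts
Before HACMP services bring the service alias up, this should report
that the name resolves but no interface has the address.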

I have another question. When I try to synchronize the HACMP
configuration, I get the following error. I know it is an HACMP-related
issue, but the error is about the DB2 instance, which is why I am
posting here. Please help.

error
--------------------------------------------------------------------------------
WARNING: HACMP is unable to determine if the .rhosts file for instance:
tp_instance is properly configured.
The DB2 instance home directory: /db2inst1/tp_instance is not mounted on
any of the participating nodes in the instance resource group.

Please check to ensure the entry 'srv1 tp_instance' is added to the
.rhosts file
Then re-run verification and synchronization with the instance home
directory mounted on one of the participating nodes of the instance
resource group.
ERROR: The DB2 instance owner home directory: /db2inst1/tp_instance for
instance: tp_instance, and user: tp_instance is incorrectly set to path:
on node: node1
Please change the instance owners home directory on node: csudwhp1 to
match the instance owner home directory.
ERROR: The DB2 instance owner home directory: /db2inst1/tp_instance for
instance: tp_instance, and user: tp_instance is incorrectly set to path:
on node: node2
Please change the instance owners home directory on node: node2 to
match the instance owner home directory.
Completed 90 percent of the verification checks
WARNING: HACMP is unable to determine if the file
/db2inst1/tp_instance/sqllib/db2nodes.cfg for instance: tp_instance is
properly configured.
The DB2 instance home directory: /db2inst1/tp_instance is not mounted on
any of the accessible participating nodes in the instance resource group.
-------------------------------------------------------------------------
From: db2admin on
On Jun 22, 4:22 pm, Helmut Tessarek <tessa...(a)evermeet.cx> wrote:
> I think the following paragraphs already explain what you have to do.
>
> > Then re-run verification and synchronization with the instance home
> > directory mounted on one of the participating nodes of the instance
> > resource group.
>
> Your instance home directory does not seem to be mounted.
>
> > ERROR: The DB2 instance owner home directory: /db2inst1/tp_instance for
> > instance: tp_instance, and user: tp_instance is incorrectly set to path: on
> > node: node1
> > Please change the instance owners home directory on node:
> > csudwhp1 to match the instance owner home directory.
>
> The path value seems to be empty and should be set to the instance owner home
> directory.

Thanks.
Are you referring to the PATH variable? I added the DB2 instance home
directory to the PATH, but the error did not go away.
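
Reading the HACMP message again, the empty "path:" looks like the home
directory recorded for the tp_instance user account on each node rather
than the PATH environment variable, though that is an interpretation of
the error text and not something confirmed here. A small sketch for
checking what each node has on record, assuming the account is defined
in a passwd-format file where the home directory is the sixth
colon-separated field. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# home_of USER [PASSWD_FILE]
# Print the home directory field for USER; fail if USER is not listed.
home_of() {
    awk -F: -v u="$1" '$1 == u { print $6; found = 1; exit }
                       END { exit !found }' "${2:-/etc/passwd}"
}

# check_home USER EXPECTED_HOME [PASSWD_FILE]
# Compare the recorded home directory with the expected one.
check_home() {
    actual=$(home_of "$1" "${3:-/etc/passwd}") || {
        echo "$1: no such user"
        return 1
    }
    if [ "$actual" = "$2" ]; then
        echo "$1: home is $actual"
    else
        echo "$1: home is '$actual', expected '$2'"
        return 1
    fi
}
```

Running check_home tp_instance /db2inst1/tp_instance on node1 and node2
would show whether the account's home directory is empty or different
there. On AIX, lsuser -a home tp_instance should report the same
attribute.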