problem adding a satellite to cluster


Topic author
joukj
Master
Posts: 175
Joined: Thu Aug 27, 2020 5:50 am
Reputation: 0
Status: Offline

Post by joukj » Thu Jan 04, 2024 6:19 am

Hi.

I'm running OpenVMS IA64 V8.4-2L3.

I tried to add a satellite to a cluster but did not succeed. I ran CLUSTER_CONFIG.COM on the boot member. After creating the page/swap files on the satellite's disks, it waits forever, telling me it is waiting for the satellite to reboot. On the console of the satellite I see the following message written every few seconds:

The timezone was not specified for this node. In order to properly
run the Time Synchronization Service (DECdts), a valid time zone rule
must be specified


What could be the problem? Does the satellite not get the time zone from the boot member?

Added some time later:

I see that before the timezone message I get:

%DCL-W-ACTIMAGE, error activating image RMISHR
-CLI-E-IMGNAME, image file BOLERO$DKA0:[SYS11.SYSCOMMON.]<SYSLIB>RMISHR.EXE
-SYSTEM-F-PROTINSTALL, protected images must be installed
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\CHECKSUM$CHECKSUM\
%NET$CONFIGURE-I-USECOMMON, using cluster common APPLICATION script
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\APPLICATION_COMMON_CHECKSUM\
%DCL-W-ACTIMAGE, error activating image RMISHR
-CLI-E-IMGNAME, image file BOLERO$DKA0:[SYS11.SYSCOMMON.]<SYSLIB>RMISHR.EXE
-SYSTEM-F-PROTINSTALL, protected images must be installed
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\CHECKSUM$CHECKSUM\
%NET$CONFIGURE-I-USECOMMON, using cluster common EVENT script
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\EVENT_COMMON_CHECKSUM\
%DCL-W-ACTIMAGE, error activating image RMISHR
-CLI-E-IMGNAME, image file BOLERO$DKA0:[SYS11.SYSCOMMON.]<SYSLIB>RMISHR.EXE
-SYSTEM-F-PROTINSTALL, protected images must be installed
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\CHECKSUM$CHECKSUM\
%NET$CONFIGURE-I-USECOMMON, using cluster common MOP_CLIENT script
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
\MOP_CLIENT_COMMON_CHECKSUM\


What the hell is going on? Why is RMISHR.EXE not installed? And what is the checksum$checksum command?


Jouk
Last edited by joukj on Thu Jan 04, 2024 9:55 am, edited 2 times in total.

arne_v
Master
Posts: 347
Joined: Fri Apr 17, 2020 7:31 pm
Reputation: 0
Location: Rhode Island, USA
Status: Online

Post by arne_v » Thu Jan 04, 2024 10:14 am

I have not set up a VMS cluster in many, many years, so I will not try to help with that.

The only thing I can help with is the checksum$checksum symbol.

Code: Select all

$ sh symb checksum$checksum
%DCL-W-UNDSYM, undefined symbol - check validity and spelling
$ checksum login.com
$ sh symb checksum$checksum
  CHECKSUM$CHECKSUM = "393566903"

My *guess* is that the error causes DCL to jump to some code that expects checksum$checksum, but the symbol was never set because of the error.

If my guess is correct then the checksum$checksum problem will go away when the RMISHR problem is solved.
Arne
arne@vajhoej.dk
VMS user since 1986


Post by joukj » Thu Jan 04, 2024 10:24 am

You are probably right:
checksum.exe depends on ssl111$libcrypto_shr32.exe which depends on RMISHR.exe

Still no idea why RMISHR.EXE is not installed.

Jouk
Last edited by joukj on Thu Jan 04, 2024 10:37 am, edited 1 time in total.

Post by arne_v » Thu Jan 04, 2024 10:42 am

That is a good question

On my 9.2-1 non-clustered system:

Code: Select all

$ sear sys$system:*.com rmishr

******************************
SYS$COMMON:[SYSEXE]SA_STARTUP.COM;1

$install add /open /shared /protect /header SYS$SHARE:RMISHR

$ install list sys$share:rmishr

DISK$X86SYS:<SYS0.SYSCOMMON.SYSLIB>.EXE
   RMISHR;1         Open Hdr SharAddr     Prot Lnkbl

What do you have?

Added in 8 minutes 19 seconds:
I suspect that SA_STARTUP is for standalone mode only and something else causes RMISHR to be installed for a normal boot.

But I assume you need to get that satellite node up more than you need to figure out the details of VMS startup, so maybe just put the INSTALL of RMISHR in SYSTARTUP_VMS.COM and move on.


Post by joukj » Thu Jan 04, 2024 10:51 am

That is the same on V8.4-2L3.

The problem occurs before systartup_vms.com is called.

While configuring the cluster, a [sys10.sysexe]startup1.com is created. I tried to put the install in that one, but even that did not help.
Last edited by joukj on Thu Jan 04, 2024 10:56 am, edited 3 times in total.


bobwilson
VSI Expert
Contributor
Posts: 16
Joined: Sat Sep 11, 2021 10:24 pm
Reputation: 0
Status: Offline

Post by bobwilson » Thu Jan 04, 2024 11:04 am

SA_STARTUP.COM is used by the installation/upgrade (i.e. when you boot from the kit disk).

On a "normally running" VMS system, the list of images installed at boot time is in SYS$MANAGER:VMSIMAGES.DAT, which, I think, is created by AUTOGEN.COM.

I haven't set up or maintained a cluster in quite a while, so I have no experience with how adding a new cluster member works in recent versions of VMS. I'll ping the CLUSTER_CONFIG.COM maintainer to see what he thinks.

Post by arne_v » Thu Jan 04, 2024 11:34 am

I can confirm that on my system where it does get installed:

Code: Select all

$ sear SYS$MANAGER:VMSIMAGES.DAT rmishr
SYS$SHARE:RMISHR              /OPEN /HEADER /SHARED=ADDRESS_DATA /PROTECT                            ! 2/208/

Added in 1 minute 4 seconds:
If it is missing on the system with the problem, then maybe try adding it.


Post by bobwilson » Thu Jan 04, 2024 11:44 am

As I recall, CLUSTER_CONFIG.COM (and friends) creates the new root and sets it up in such a way that on first boot the system should run AUTOGEN.COM (with a REBOOT when done).

So, it's possible that something went awry when setting up the new root or in the first boot.

The CLUSTER_CONFIG.COM maintainer will probably weigh in at some point.


Post by joukj » Fri Jan 05, 2024 2:13 am

Got rid of the rmishr.exe problem by adding the install at the top of the startup1.com file,

but I still have the timezone problem.

Added in 7 hours 34 minutes 12 seconds:
Got the satellite running, but I had to use a "dirty" trick.

I killed the process on the boot member that was waiting for the satellite to reboot. Then I copied files from an old backup of another satellite and placed them in [sys10.sysmgr] and [sys10.sysexe]. The machine booted up.

But that is not the way it should work...
Last edited by joukj on Fri Jan 05, 2024 9:48 am, edited 1 time in total.


Post by bobwilson » Fri Jan 05, 2024 12:10 pm

I entered a problem report on this.

Basically, when we added support for SHA1 and SHA256 to CHECKSUM, it added a run-time dependency on SSLx$LIBCRYPTO_SHR32 (which is where the algorithms are implemented). SSLx$LIBCRYPTO_SHR32 is LINK'd against RMISHR (for both SSL111 and SSL3).

STARTUP1.COM is generated by CLUSTER_CONFIG[_LAN].COM, which will need to be modified to add the INSTALL ADD of RMISHR.
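
Until that change is made, the interim workaround reported earlier in the thread (putting the install at the top of the generated [sys10.sysexe]startup1.com) might look roughly like the sketch below. The qualifiers are copied from the SA_STARTUP.COM line quoted above. The F$FILE_ATTRIBUTES guard is an addition that does not come from the thread, and this is not necessarily what the eventual CLUSTER_CONFIG fix will generate.

Code: Select all

$ ! Sketch only: placed near the top of the generated STARTUP1.COM.
$ ! The KNOWN check (an assumption, not from the thread) skips the
$ ! INSTALL when RMISHR is already an installed image.
$ IF .NOT. F$FILE_ATTRIBUTES("SYS$SHARE:RMISHR.EXE","KNOWN") THEN -
      INSTALL ADD /OPEN /HEADER /SHARED /PROTECT SYS$SHARE:RMISHR

The guard is only there to avoid an INSTALL error if the image is installed by some other path later on. A bare INSTALL ADD, as used in SA_STARTUP.COM, should behave the same way on a fresh root.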
