X86 Release Notes

OpenVMS x86 Field Test questions, reports, and feedback.

Post by rodprince » Mon Nov 06, 2023 3:14 pm

I am trying to find the OpenVMS X86 Release notes. In particular, I am trying to figure out what changed between HPe I64 and VSI x86 in the following areas:

1) $INSTALL CREATE
2) $LINK (shared images).
3) $CRMPSC and $CRMPSC_GPFILE_64

The reason I am looking for information in these areas is complicated, and I will try to go into details later in this post. Checking out $INSTALL, $CRMPSC and $CRMPSC_GPFILE_64 is pretty easy: I can get the docs from VSI, compare them to the old HPe I64 docs, and look for anything that's different. It takes a bit of time, but it is essentially just a matter of reading and comparing the documentation. The Linker is totally different. The Linker documentation is essentially a book, not a couple of pages. Trying to figure out what might have changed there is more of a challenge, which is why I am asking folks who may have had to change how they compile and link shared images on x86 vs I64.

Now for why I am looking to see if anyone has any pointers in these areas (the messy details).

When we started porting our system from HPe I64 to VSI x86 we came across 2 interesting issues.

The first is that the system crashes. The actual error on OPA0: is "** Bugcheck code = 000000DC: DELCONPFN, Fatal error in delete contents of PFN". We eventually traced this back to the use of $INSTALL CREATE <shared image> /share /open /header /resid. We create a shared image as part of our normal build process. We first noticed that $INSTALL CREATE was throwing an error about needing the /shared qualifier to be /shared=address_data. While playing with the $INSTALL command we finally figured out that all we have to do is issue $INSTALL CREATE <image> /shared /open /header /resid, with or without =address_data on /shared, and the system will crash the next time you log out. It is simple and easy to reproduce: $INSTALL followed by $LOGOUT, then watch the system crash on OPA0:. This system crash and all the messy details have been reported to VSI (SP-1285).

As a follow-up on our side, I need to research whether anything has changed in what we need to do to compile and link our shared image. $INSTALL says we need to use /shared=address_data. Does this imply any changes to the link command, or just to the $INSTALL command? That leads into what, if anything, has changed on the $LINK side. Essentially, I need to verify that we are actually building and linking the shared image correctly. Once VSI resolves the $INSTALL issue, we want to be good to go on our end.

Now the second issue: when we start our system we create a couple (OK, a lot) of global sections that we use to share data back and forth. Some of these global sections might be considered large (embarrassingly large). We have noticed that from time to time, while creating these sections, we experience a mount-verify-timeout on either the system disk DKA0: or DKA100:. We place all of our exes on DKA100:, and the system resides on DKA0:. We do not specify a backing file for the global sections, hence they are backed by the system page file (again, an embarrassingly large file). This could explain why DKA0: goes offline, but why DKA100: goes offline is beyond me. As part of the research into this situation, we have decided to move the page file from DKA0: to DKA100: and see whether DKA0: continues to go offline or whether it's just DKA100: now. The other part of this investigation is to go through the grunt work and check whether anything might have changed in $CRMPSC and $CRMPSC_GPFILE_64, hence question 3.

More messy details

HPe I64 - OpenVMS 8.4, VMS84I-Update V11.0

VSI x86 - OpenVMS 9.2-1 with VMS921XUpdate V2.0. We saw both of these issues with update V1.0 as well, if it matters.


First couple lines on the x86 OPA0: crash (again already reported)

Code: Select all

**** OpenVMS x86_64 Operating System V9.2-1   - BUGCHECK ****

** Bugcheck code = 000000DC: DELCONPFN, Fatal error in delete contents of PFN
** Crash Time:            24-OCT-2023 10:44:59.92
** Crash CPU: 00000000    Primary CPU: 00000000    Node Name: BBIRCH
** Highest CPU number:    00000001
** Active CPUs:           00000000.00000003
** Current Process:       <No process name>
** Current PSB ID:        00000000

** Dumping error logs to the system disk (BBIRCH$DKA0:)
** Error logs dumped to BBIRCH$DKA0:[SYS0.SYSEXE]SYS$ERRLOG.DMP
** (used 52 out of 64 available blocks)
** Dumping memory to the system disk (BBIRCH$DKA0:)

Thank you
Rod

Also, does anyone know of a way to recover from a mount-verify-timeout condition without having to reboot the system?

Post by arne_v » Mon Nov 06, 2023 6:56 pm

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
I am trying to find the OpenVMS X86 Release notes. In particular, I am trying to figure out what changed between HPe I64 and VSI x86 in the following areas:

1) $INSTALL CREATE
2) $LINK (shared images).
3) $CRMPSC and $CRMPSC_GPFILE_64

Very few things have been changed deliberately.

The C++ compiler is a totally different beast, but in general old stuff should just work. VMS is VMS!

"Should work" does not imply that there are no bugs. Several bugs have been found. That is what field tests are for.

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
The reason I am looking for information in these areas is complicated, and I will try to go into details later in this post. Checking out $INSTALL, $CRMPSC and $CRMPSC_GPFILE_64 is pretty easy: I can get the docs from VSI, compare them to the old HPe I64 docs, and look for anything that's different. It takes a bit of time, but it is essentially just a matter of reading and comparing the documentation.

I believe VSI is behind on updating the documentation for VMS 9 on x86-64.

So if the documentation looks identical then it is either because there are no changes or because VSI has not updated the documentation yet.

I recommend testing everything.

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
The first is that the system crashes. The actual error on OPA0: is "** Bugcheck code = 000000DC: DELCONPFN, Fatal error in delete contents of PFN". We eventually traced this back to the use of $INSTALL CREATE <shared image> /share /open /header /resid. We create a shared image as part of our normal build process. We first noticed that $INSTALL CREATE was throwing an error about needing the /shared qualifier to be /shared=address_data. While playing with the $INSTALL command we finally figured out that all we have to do is issue $INSTALL CREATE <image> /shared /open /header /resid, with or without =address_data on /shared, and the system will crash the next time you log out. It is simple and easy to reproduce: $INSTALL followed by $LOGOUT, then watch the system crash on OPA0:. This system crash and all the messy details have been reported to VSI (SP-1285).
As a follow-up on our side, I need to research whether anything has changed in what we need to do to compile and link our shared image. $INSTALL says we need to use /shared=address_data. Does this imply any changes to the link command, or just to the $INSTALL command? That leads into what, if anything, has changed on the $LINK side. Essentially, I need to verify that we are actually building and linking the shared image correctly. Once VSI resolves the $INSTALL issue, we want to be good to go on our end.

VSI will undoubtedly provide a fix.

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
Now the second issue: when we start our system we create a couple (OK, a lot) of global sections that we use to share data back and forth. Some of these global sections might be considered large (embarrassingly large). We have noticed that from time to time, while creating these sections, we experience a mount-verify-timeout on either the system disk DKA0: or DKA100:. We place all of our exes on DKA100:, and the system resides on DKA0:. We do not specify a backing file for the global sections, hence they are backed by the system page file (again, an embarrassingly large file). This could explain why DKA0: goes offline, but why DKA100: goes offline is beyond me. As part of the research into this situation, we have decided to move the page file from DKA0: to DKA100: and see whether DKA0: continues to go offline or whether it's just DKA100: now. The other part of this investigation is to go through the grunt work and check whether anything might have changed in $CRMPSC and $CRMPSC_GPFILE_64, hence question 3.

What virtualization software are you using, and with what configuration?

It will probably not tell me anything, but more knowledgeable people than me may want to know.
Arne
arne@vajhoej.dk
VMS user since 1986

Post by volkerhalle » Tue Nov 07, 2023 1:13 am

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
Also, does anyone know of a way to recover from a mount-verify-timeout condition without having to reboot the system?

Rod,

If the transaction count (column Trans Count in SHOW DEVICE diskname) is 1, you can issue a DISMOUNT/ABORT diskname. Otherwise you have to reboot to get a disk out of the mount-verify-timeout condition.

Volker.

Added in 1 hour 40 minutes 36 seconds:
rodprince wrote:
Mon Nov 06, 2023 3:14 pm
Now the second issue: when we start our system we create a couple (OK, a lot) of global sections that we use to share data back and forth. Some of these global sections might be considered large (embarrassingly large). We have noticed that from time to time, while creating these sections, we experience a mount-verify-timeout on either the system disk DKA0: or DKA100:. We place all of our exes on DKA100:, and the system resides on DKA0:.

A disk goes into mount verification if there is an I/O error:

https://wiki.vmssoftware.com/Mount_Verification

The disk enters the mount verification timeout state (MntVerifyTimeout) after MVTIMEOUT seconds (this is a system parameter, default = 3600 seconds). Please look at the mount verification messages on OPA0: and in OPERATOR.LOG to find out when the disk enters and exits mount verification, and try to correlate this with your actions regarding the global sections.
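
One way to do that correlation is to pull the mount verification messages out of a copy of OPERATOR.LOG with a small script. The Python below is only a rough sketch under assumptions: it assumes each operator message is preceded by a header line carrying a timestamp in the same DD-MMM-YYYY HH:MM:SS.CC form as the crash banner, and that the message text contains the words "mount verification" and the device name followed by a colon. The function name is made up for illustration; adjust the matching to whatever your log actually contains.

```python
import re

# Assumed timestamp shape, e.g. "24-OCT-2023 10:44:59.92" (as in the crash banner).
STAMP = re.compile(r"\d{1,2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{2}")


def mount_verification_events(lines, device):
    """Return (timestamp, message) pairs that mention mount verification for device.

    Hypothetical helper: lines following a timestamped header are treated as
    one message until the next timestamped header appears.
    """
    # Match "DKA0:" but not the tail of a longer device name.
    dev = re.compile(r"(?<![A-Z0-9])" + re.escape(device.upper()) + ":")
    events = []
    stamp, block = None, []

    def flush():
        # Examine the message collected since the last header line.
        text = " ".join(block)
        upper = text.upper()
        if stamp and "MOUNT VERIFICATION" in upper and dev.search(upper):
            events.append((stamp, text))

    for line in lines:
        found = STAMP.search(line)
        if found:
            flush()
            stamp, block = found.group(0), []
        elif line.strip():
            block.append(line.strip())
    flush()
    return events
```

Called with the lines of the log file and a device name such as "DKA0", it returns (timestamp, message) pairs that can be lined up against the times at which the global sections were created.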

You could use the SDA extension DKLOG$SDA to log all I/O operations on the relevant disks and try to get more information:

Code: Select all

$ ANALYZE/SYS
SDA> DKLOG    ! will display help information

SDA> DKLOG START $1$DKA0:
...
SDA> DKLOG SHOW $1$DKA0:
...
SDA> DKLOG STOP $1$DKA0:
SDA> EXIT

Volker.

Post by hb » Tue Nov 07, 2023 8:48 am

rodprince wrote:
Mon Nov 06, 2023 3:14 pm
I am trying to find the OpenVMS X86 Release notes. In particular, I am trying to figure out what changed between HPe I64 and VSI x86 in the following areas:

1) $INSTALL CREATE
2) $LINK (shared images).

There is one difference in INSTALL. To make images resident, they must be installed with shared address data. To help you with your existing commands, INSTALL automatically adds /SHARED=ADDRESS_DATA and informs the user with an informational message. The reason for this message is that when an image is installed with shared address data, all shareable images that it depends on must also be installed with shared address data. This is not new; it has always been the case.

For example, you have image M that depends on shareable S. On other platforms, you could successfully install M with /SHARED/RESIDENT, regardless of whether and how S was installed. On x86, S must already be installed with /SHARED=ADDRESS_DATA. If it isn't, /RESIDENT (and /SHARED=ADDRESS_DATA) is removed and the image is installed with /SHARED, and INSTALL informs you of this with another informational message. (INSTALL has always worked this way: if something didn't work, it tried to install anyway. On all platforms, if you try to install M with /SHARED=ADDRESS_DATA and S is not installed that way, INSTALL installs M with /SHARED and informs you.)

All these informational messages (they are not errors) try to help you determine why an image is not installed as requested.
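
The ordering implied by the M and S example can be sketched in a few lines of Python. This is only an illustration and not something INSTALL provides: the function name `install_order` and the dependency map are hypothetical, and M and S are the placeholder image names from the example. It uses the standard-library `graphlib` module (Python 3.9 or later) to list images so that every shareable image appears before the images that depend on it.

```python
from graphlib import TopologicalSorter


def install_order(depends_on):
    """Order images so that each image's dependencies come before it.

    depends_on maps an image name to the set of shareable images it
    depends on (a hypothetical, hand-maintained map).
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # M depends on shareable S, as in the example.
    for image in install_order({"M": {"S"}, "S": set()}):
        print(image)
```

For the M and S example this lists S first and then M, which is the order in which the images would need to be installed with /SHARED=ADDRESS_DATA.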

The new requirement for x86 was described in the V9.1 release notes.

Linking (shared) images is not really different from other platforms. The only major difference is that the x86 linker puts the code in P2 space by default. That is, /SEGMENT_ATTRIBUTE=CODE=P2 is the default, which can be overridden if necessary. There are other differences, most of them under the hood. Obviously, there are different relocations on x86 than on IA64 or Alpha. There are some other changes regarding the calling standard and the unwind information. And then there are some new features to help with linking threaded applications and with including image initialization code. I may have forgotten some of the differences, but in general you just use the same link commands that worked on IA64.

I don't have references to the release notes, but it was all documented in there.
rodprince wrote:
The first is that the system crashes. The actual error on OPA0: is "** Bugcheck code = 000000DC: DELCONPFN, Fatal error in delete contents of PFN".
...
This system crash and all the messy details have been reported to VSI (SP-1285).

VSI will most likely fix this. In general, installing resident images works. If you look at a running system, you will find several resident images, all of them shareable images. In other words, it's not obvious what's different or wrong in your case when the system crashes in memory management at process rundown.


Post by rodprince » Tue Nov 07, 2023 6:44 pm

First, let me say thank you for the responses.

Details about my OpenVMS VM: it runs on ESXi 6.7.0 Update 3 (build 14320388) on a Dell T620 with E5-2620 V2 CPUs. The OpenVMS VM has 2 cores, 16 GB of memory, 2 disks (SATA) of 143 GB each, and 4 NICs. DECnet Phase IV and LAT run on 2 of the NICs, with TCP/IP on the other 2. All the NICs are E1000s.

The information about the SDA extension is nice, and I hope to put it to good use once I figure out how to reproduce the mount-verify-timeout situation. At this point, I am getting closer to being able to reproduce it. I have to start stripping down the code to just the parts that trigger it (or that I think trigger it) so I have something simpler to play with.

As to the shared image, that information is good to know, especially the part about /SEGMENT_ATTRIBUTE=CODE=P2. I have it on my checklist to remove the reference to convshr to see if that helps with our $install situation. $install does not like it when we try to install, since it says we need /shared=address and convshr is not installed that way.

Code: Select all

$ INSTALL CREATE  SYS$DISK:[]TOTESOL.EXE /SHARED /HEADER /OPEN /RESID
%INSTALL-I-SHRADRADDED, '/RESIDENT requires /SHARED=ADDRESS_DATA, added for BBIRCH$DKA100:[DEV.][EXE]TOTESOL.EXE;1'
%INSTALL-I-NONSHRADR, TOTESOL installed ignoring '/SHARE=ADDRESS'
-INSTALL-I-NOTSHRADR, CONVSHR is not installed with shareable address data
%INSTALL-I-NONRESSHRADR, image installed ignoring '/RESIDENT' (and '/SHARED=ADDRESS_DATA') DISK$TDISK:<DEV.EXE>TOTESOL.EXE

Rod

Post by imiller » Wed Nov 08, 2023 5:44 am

Release notes and other documentation for VSI OpenVMS V9.2-1 can be found at
https://vmssoftware.com/about/v921/

I hope you have also reported this crash to VSI Support.
Ian Miller
[ personal opinion only. usual disclaimers apply. Do not taunt happy fun ball ].


Post by hb » Wed Nov 08, 2023 2:41 pm

rodprince wrote:
Tue Nov 07, 2023 6:44 pm
As to the shared image, that information is good to know, especially the part about /SEGMENT_ATTRIBUTE=CODE=P2. I have it on my checklist to remove the reference to convshr to see if that helps with our $install situation. $install does not like it when we try to install, since it says we need /shared=address and convshr is not installed that way.

Code: Select all

$ INSTALL CREATE  SYS$DISK:[]TOTESOL.EXE /SHARED /HEADER /OPEN /RESID
%INSTALL-I-SHRADRADDED, '/RESIDENT requires /SHARED=ADDRESS_DATA, added for BBIRCH$DKA100:[DEV.][EXE]TOTESOL.EXE;1'
%INSTALL-I-NONSHRADR, TOTESOL installed ignoring '/SHARE=ADDRESS'
-INSTALL-I-NOTSHRADR, CONVSHR is not installed with shareable address data
%INSTALL-I-NONRESSHRADR, image installed ignoring '/RESIDENT' (and '/SHARED=ADDRESS_DATA') DISK$TDISK:<DEV.EXE>TOTESOL.EXE

Rod

There is very likely nothing wrong with your image or with how you install it. At least, the code that crashes the system is not in INSTALL. But yes, to get it installed the way you want, you have to ensure that CONVSHR and all the images it depends on are installed with shared address data. And, as far as I can see, this problem was reported to VSI.


Post by rodprince » Wed Nov 08, 2023 4:26 pm

VSI is all over the system crash. They have reported that they can reproduce it, so I am sure they will resolve it. It is just a matter of time until they figure out exactly what went sideways. To be honest, it's an experimental release and it has done way better than anything I expected at this point.

Rod
