Differences between VMware Pro 17.5.2 and ESXi 8.02

OpenVMS virtualization: OpenVMS on VirtualBox, VMWare, Hyper-V, KVM, and more.

Topic author
brianreiter
Valued Contributor
Posts: 51
Joined: Fri Jun 14, 2019 4:17 pm
Reputation: 0
Location: North East England
Status: Offline

Post by brianreiter » Mon Sep 23, 2024 7:28 am

Hi there,

Are there any differences between VMware Pro 17.5.2 and ESXi 8.02 that would result in an application running correctly on one VM platform but failing on the other?

The application in question relies on the network and, I think, uses threads under Pascal. It works fine under VMware Pro 17.5.2 but crashes messily under ESXi.

Code: Select all

$ Set NoOn
$ VERIFY = F$VERIFY(F$TRNLNM("SYLOGIN_VERIFY"))
%SYSTEM-F-ACCVIO, access violation, reason mask=05, virtual address=000000007ACBE000, PC=FFFF830006C0DC8B, PS=0000001B

  Improperly handled condition, image exit forced by last chance handler.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000005
                                 000000007ACBE000
                                 FFFF830006C0DC8B
                                 000000000000001B
    Register dump:
    RAX = 0000000000000000  RDI = 000000007FF7BC20  RSI = 0000000000000001
    RDX = 0000000000000000  RCX = 000000007ACBDE28  R8  = 00000000402B2000
    R9  = 0000000000000000  RBX = FFFFFFFF804BA3A0  RBP = 000000007ACBCEC0
    R10 = 0000000000002800  R11 = 0000000000000246  R12 = 0000000000000000
    R13 = FFFFFFFF804BA220  R14 = FFFFFFFF804BA388  R15 = FFFFFFFF804BA370
    RIP = FFFF830006C0DC8B  RSP = 000000007ACBCB50  SS  = 000000007B258210
%SYSTEM-F-ACCVIO, access violation, reason mask=06, virtual address=000000000000002C, PC=FFFF830008CF33AA, PS=0000001B

  Improperly handled condition, image exit forced.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000006
                                 000000000000002C
                                 FFFF830008CF33AA
                                 000000000000001B
    Register dump:
    RAX = 0000000000000000  RDI = 0000000000000001  RSI = 000000007B3E4101
    RDX = 0000000000000004  RCX = 000000007ACBA6F0  R8  = 000000007B3E4101
    R9  = 0000000001100000  RBX = 000000007B596738  RBP = 000000007ACBA830
    R10 = 000000007ACBA6F0  R11 = 0000000000000202  R12 = 0000000000000002
    R13 = 000000007B59B148  R14 = 000000007B59B150  R15 = 000000007B59E010
    RIP = FFFF830008CF33AA  RSP = 000000007ACBA690  SS  = 0000000000261280
  MID_99_SYS   job terminated at 23-SEP-2024 13:14:26.25

  Accounting information:
  Buffered I/O count:                 99      Peak working set size:      13856
  Direct I/O count:                   49      Peak virtual size:         287472
  Page faults:                       933      Mounted volumes:                0
  Charged CPU time:        0 00:00:00.04      Elapsed time:       0 00:00:00.17
Other networking tasks work as expected. As far as I know this is the only failing process.

The ESXi machine should be a copy of the VMWare Pro machine.

arne_v
Senior Member
Posts: 534
Joined: Fri Apr 17, 2020 7:31 pm
Reputation: 0
Location: Rhode Island, USA
Status: Offline
Contact:

Post by arne_v » Mon Sep 23, 2024 7:50 am

It sounds weird.

How many CPUs in Pro and how many in ESXi?

If 1 in one and >1 in the other, then a traditional concurrency issue could be the cause.
Arne
arne@vajhoej.dk
VMS user since 1986


Post by brianreiter » Mon Sep 23, 2024 8:37 am

16 CPUs in both cases.

Same amount of memory and, after a struggle, the same number of disks.

The only obvious difference between the 2 VMs is that the VMWare Pro version is configured for SATA disks and the ESXi version for SCSI.


hb
Master
Posts: 139
Joined: Mon May 01, 2023 12:11 pm
Reputation: 0
Status: Offline

Post by hb » Mon Sep 23, 2024 9:22 am

Can you post the VMS version (including updates if applicable), the SDA output of

Code: Select all

MAP FFFF830006C0DC8B
MAP FFFF830008CF33AA
and the Pascal version?


Post by brianreiter » Mon Sep 23, 2024 9:44 am

Pascal version:

Code: Select all

(HATMS)SYSTEM>pascal/version
pascal/version
VSI Pascal x86-64 X6.4-145 (GEM 50Y7N) on OpenVMS x86_64 E9.2-3
VMS version (same on the VMware Pro and ESXi machines):

Code: Select all

(HRNS)SYSTEM>sho sys/noproc
OpenVMS E9.2-3  on node HRNS   23-SEP-2024 15:38:44.04   Uptime  2 23:39:41
(HRNS)SYSTEM>
(HRNS)SYSTEM>prod sho hist *
------------------------------------ ----------- ----------- --- -----------
PRODUCT                              KIT TYPE    OPERATION   VAL DATE
------------------------------------ ----------- ----------- --- -----------
VSI X86VMS CXX A10.1-2_240805        Full LP     Install     Val 09-AUG-2024
VSI X86VMS CXX A10.1-2_240613        Full LP     Remove       -  09-AUG-2024
VSI X86VMS AVAIL_MAN_BASE E9.2-3     Full LP     Install     Val 09-AUG-2024
VSI X86VMS DWMOTIF V1.8-1            Full LP     Install     Val 09-AUG-2024
VSI X86VMS DWMOTIF_SUPPORT E9.2-3    Full LP     Install     Val 09-AUG-2024
VSI X86VMS KERBEROS V3.3-3           Full LP     Install     Val 09-AUG-2024
VSI X86VMS OPENSSH V8.9-1H01         Full LP     Install     Val 09-AUG-2024
VSI X86VMS OPENVMS E9.2-3            Platform    Install     Sys 09-AUG-2024
VSI X86VMS SSL3 V3.0-13              Full LP     Install     Val 09-AUG-2024
VSI X86VMS TCPIP V6.0-25             Full LP     Install     Val 09-AUG-2024
VSI X86VMS VMS E9.2-3                Oper System Install     Val 09-AUG-2024
VSI X86VMS AVAIL_MAN_BASE V9.2-2     Full LP     Remove       -  09-AUG-2024
VSI X86VMS DWMOTIF V1.8              Full LP     Remove       -  09-AUG-2024
VSI X86VMS DWMOTIF_SUPPORT V9.2-2    Full LP     Remove       -  09-AUG-2024
VSI X86VMS KERBEROS V3.3-2A          Full LP     Remove       -  09-AUG-2024
VSI X86VMS OPENSSH V8.9-1G           Full LP     Remove       -  09-AUG-2024
VSI X86VMS OPENVMS V9.2-2            Platform    Remove       -  09-AUG-2024
VSI X86VMS SSL3 V3.0-11              Full LP     Remove       -  09-AUG-2024
VSI X86VMS TCPIP V6.0-24             Full LP     Remove       -  09-AUG-2024
VSI X86VMS VMS V9.2-2                Oper System Remove       -  09-AUG-2024
VSI X86VMS VMS922X_PCSI V1.0         Patch       Remove       -  09-AUG-2024
VSI X86VMS VMS922X_UPDATE V2.0       Patch       Remove       -  09-AUG-2024
VSI X86VMS VMS922X_UPDATE V1.0       Patch       Remove       -  09-AUG-2024
VSI X86VMS CXX A10.1-2_240613        Full LP     Install     Val 20-JUN-2024
SCI VMS DFU V3.5-100                 Full LP     Install     (U) 20-JUN-2024
VSI X86VMS VMS922X_UPDATE V2.0       Patch       Install     Val 01-MAY-2024
VSI X86VMS VMS922X_UPDATE V1.0       Patch       Install     Val 30-APR-2024
VSI X86VMS CIVETWEB V1.17-0          Full LP     Install     Val 30-APR-2024
VSI X86VMS CURL V8.0-1A              Full LP     Install     Val 30-APR-2024
VSI X86VMS LIBMARIADB V3.1-0A        Full LP     Install     Val 30-APR-2024
VSI X86VMS LUA V5.3-5D               Full LP     Install     Val 30-APR-2024
VSI X86VMS VMS922X_PCSI V1.0         Patch       Install     Val 30-APR-2024
VSI X86VMS VMS V9.2-2                Oper System Install     Val 30-APR-2024
VSI X86VMS VMS922X_PCSI V1.0         Patch       Remove       -  30-APR-2024
VSI X86VMS VMS922X_UPDATE V2.0       Patch       Remove       -  30-APR-2024
VSI X86VMS VMS922X_UPDATE V1.0       Patch       Remove       -  30-APR-2024
VSI X86VMS OPENVMS V9.2-2            Platform    Reconfigure Sys 30-APR-2024
VSI X86VMS VMS922X_UPDATE V2.0       Patch       Install     Val 30-APR-2024
VSI X86VMS TCPIP V6.0-24             Full LP     Install     Val 28-MAR-2024
VSI X86VMS TCPIP V6.0-23             Full LP     Remove       -  28-MAR-2024
VSI X86VMS VMS922X_UPDATE V1.0       Patch       Install     Val 16-MAR-2024
VSI X86VMS VMS922X_PCSI V1.0         Patch       Install     Val 16-MAR-2024
VSI X86VMS CIVETWEB V1.17-0          Full LP     Install     Val 14-MAR-2024
VSI X86VMS CURL V8.0-1A              Full LP     Install     Val 14-MAR-2024
VSI X86VMS LIBMARIADB V3.1-0A        Full LP     Install     Val 14-MAR-2024
VSI X86VMS LUA V5.3-5D               Full LP     Install     Val 14-MAR-2024
VMSPORTS X86VMS PERL534 T5.34-0      Full LP     Install     Val 13-MAR-2024
VSI X86VMS AVAIL_MAN_BASE V9.2-2     Full LP     Install     Val 13-MAR-2024
VSI X86VMS DWMOTIF V1.8              Full LP     Install     Val 13-MAR-2024
VSI X86VMS DWMOTIF_SUPPORT V9.2-2    Full LP     Install     Val 13-MAR-2024
VSI X86VMS KERBEROS V3.3-2A          Full LP     Install     Val 13-MAR-2024
VSI X86VMS OPENSSH V8.9-1G           Full LP     Install     Val 13-MAR-2024
VSI X86VMS OPENVMS V9.2-2            Platform    Install     Sys 13-MAR-2024
VSI X86VMS SSL111 V1.1-1W            Full LP     Install     Val 13-MAR-2024
VSI X86VMS SSL3 V3.0-11              Full LP     Install     Val 13-MAR-2024
VSI X86VMS TCPIP V6.0-23             Full LP     Install     Val 13-MAR-2024
VSI X86VMS VMS V9.2-2                Oper System Install     Val 13-MAR-2024
------------------------------------ ----------- ----------- --- -----------
57 items found
And the SDA output:

Code: Select all

(HRNS)SYSTEM>anal/sys

OpenVMS system analyzer

SDA> map FFFF830006C0DC8B
Image                                         Base               End          Image Offset
EXCEPTION
    Code                                FFFF8300.06B88600 FFFF8300.06C24F0F 00000000.8008568B

    Module:     EXCEPTION + 0000402B
    Line:       68449
SDA> map FFFF830008CF33AA
Image                                         Base               End          Image Offset
PTHREAD$RTL
    Code                                FFFF8300.08CC0000 FFFF8300.08D319E7 00000000.800333AA
SDA>
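
As a cross-check on the MAP output above, the image offsets can be reproduced by subtracting each code section's base from the faulting PC. The sketch below uses only the addresses shown in this thread. The `0x80000000` constant is an assumption inferred from the displayed offsets (it looks like a link-time base for the code section), not something the output states.

```python
# Addresses copied from the ACCVIO messages and the SDA MAP output in this thread.
LINK_BASE = 0x80000000  # assumption: inferred from the displayed "Image Offset" values


def image_offset(pc: int, base: int) -> int:
    """Byte offset of a PC from the start of the loaded code section."""
    return pc - base


exception_off = image_offset(0xFFFF830006C0DC8B, 0xFFFF830006B88600)  # EXCEPTION
pthread_off = image_offset(0xFFFF830008CF33AA, 0xFFFF830008CC0000)    # PTHREAD$RTL

print(hex(exception_off))              # 0x8568b
print(hex(pthread_off))                # 0x333aa
print(hex(LINK_BASE + exception_off))  # 0x8008568b, matches SDA
print(hex(LINK_BASE + pthread_off))    # 0x800333aa, matches SDA
```

The same subtraction can be applied to other PCs in a dump once the owning image's base address is known.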

Post by arne_v » Mon Sep 23, 2024 10:52 am

It does not explain what is going on, but I see that Pascal is an Xn.m version, not a Vn.m.
Arne
arne@vajhoej.dk
VMS user since 1986


Post by hb » Mon Sep 23, 2024 11:00 am

It seems that PTHREAD$RTL wants to print something to STDERR. From the log shown, I can't tell what kind of information it wants to print: information, error, trace, debug, and so on. For no obvious reason it fails. Also for no obvious reason, the EXCEPTION handling hits an ACCVIO. If it works on the same version of VMS on VMware Pro but not on ESXi, it might be a configuration problem. As a KVM user, I have no idea what to check. It might help if you can get some debug/trace information from the application on both sides, to compare when and where it differs. At the moment I have no other suggestion for how to track this down.


Post by brianreiter » Tue Sep 24, 2024 5:06 am

It's a weird one. I'll raise a formal support case if I don't come up with any bright ideas today!

All I know for sure is that this is the only bit of code that uses pthreads, but it's strange that there are differences between the virtualised systems.

Added in 4 hours 12 minutes 11 seconds:
Just been through and changed the networking setup. Still no joy: still seeing crashes, albeit slightly different ones.

Code: Select all

$ Set NoOn
$ VERIFY = F$VERIFY(F$TRNLNM("SYLOGIN_VERIFY"))
%SYSTEM-F-ACCVIO, access violation, reason mask=05, virtual address=000000007ACBE000, PC=FFFF830006C0DC8B, PS=0000001B

  Improperly handled condition, image exit forced by last chance handler.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000005
                                 000000007ACBE000
                                 FFFF830006C0DC8B
                                 000000000000001B
    Register dump:
    RAX = 0000000000000000  RDI = 000000007FF7BC20  RSI = 0000000000000001
    RDX = 0000000000000000  RCX = 000000007ACBDE28  R8  = 00000000402B2000
    R9  = 0000000000000000  RBX = FFFFFFFF804BA3A0  RBP = 000000007ACBCEC0
    R10 = 0000000000002800  R11 = 0000000000000246  R12 = 0000000000000000
    R13 = FFFFFFFF804BA220  R14 = FFFFFFFF804BA388  R15 = FFFFFFFF804BA370
    RIP = FFFF830006C0DC8B  RSP = 000000007ACBCB50  SS  = 000000007B258210
%SYSTEM-F-ACCVIO, access violation, reason mask=06, virtual address=000000000000002C, PC=FFFF830008CF33AA, PS=0000001B

  Improperly handled condition, image exit forced.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000006
                                 000000000000002C
                                 FFFF830008CF33AA
                                 000000000000001B
    Register dump:
    RAX = 0000000000000000  RDI = 0000000000000001  RSI = 000000007B3E4101
    RDX = 0000000000000004  RCX = 000000007ACBA6F0  R8  = 000000007B3E4101
    R9  = 0000000001100000  RBX = 000000007B596738  RBP = 000000007ACBA830
    R10 = 000000007ACBA6F0  R11 = 0000000000000202  R12 = 0000000000000002
    R13 = 000000007B59B148  R14 = 000000007B59B150  R15 = 000000007B59E010
    RIP = FFFF830008CF33AA  RSP = 000000007ACBA690  SS  = 0000000000261280
  MID_99_SYS   job terminated at 24-SEP-2024 15:08:12.10

  Accounting information:
  Buffered I/O count:                 99      Peak working set size:      14112
  Direct I/O count:                   53      Peak virtual size:         287664
  Page faults:                       949      Mounted volumes:                0
  Charged CPU time:        0 00:00:00.07      Elapsed time:       0 00:00:00.30
Now that I've got remote network access to the system, I'll try to work out what it's trying to do when it goes over.


rodprince
Active Contributor
Posts: 28
Joined: Mon Aug 14, 2023 6:00 pm
Reputation: 0
Status: Offline

Post by rodprince » Tue Sep 24, 2024 3:55 pm

I see that you installed the new C++ compiler on 9-Aug-2024. When I think of pthreads, I always think of C or C++ code. Since it's a memory access violation, that makes me think an address is whacked out somewhere. If you're using the new C++ compiler to build anything that the Pascal code calls, keep in mind that Pascal is a 32-bit address language by default and the new C++ compiler version has a whole lot of default 64-bitness added to it.
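
To illustrate the kind of mismatch meant here, the sketch below shows what happens when a 64-bit address is squeezed into 32 bits. It is plain arithmetic, not VMS-specific, and the address used is hypothetical rather than taken from the dumps in this thread.

```python
def truncate_to_32(addr: int) -> int:
    """Keep only the low 32 bits, as happens when a 64-bit pointer is stored in a 32-bit field."""
    return addr & 0xFFFF_FFFF


addr64 = 0x0000_0001_0000_2000  # hypothetical address above the 4 GB line
addr32 = truncate_to_32(addr64)

print(hex(addr64))  # 0x100002000
print(hex(addr32))  # 0x2000, a different location entirely
```

An address that already fits in 32 bits survives the truncation unchanged, which is why this class of bug can stay hidden until memory layout shifts.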

I am sure you have already verified that the exact same .exe is running on both virtual machines, but this is the first step I would take.

Rod


Post by brianreiter » Fri Oct 04, 2024 6:29 am

So, I'm thinking this may be exposing a latent application problem.

After adding additional debug and finally getting the application to seemingly do something, I now get this:

Code: Select all

%SYSTEM-F-ACCVIO, access violation, reason mask=05, virtual address=000000007ACBE000, PC=FFFF830006C0DC8B, PS=0000001B

  Improperly handled condition, image exit forced by last chance handler.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000005
                                 000000007ACBE000
                                 FFFF830006C0DC8B
                                 000000000000001B
    Register dump:
    RAX = 0000000000000000  RDI = 000000007FF7BC20  RSI = 0000000000000001
    RDX = 0000000000000000  RCX = 000000007ACBDE28  R8  = FFFFFFFF804BA420
    R9  = FFFFFFFF804BA3E8  RBX = FFFFFFFF804BA3A0  RBP = 000000007ACBCEC0
    R10 = 0000000000002804  R11 = 0000000000000246  R12 = 0000000000000000
    R13 = FFFFFFFF804BA220  R14 = FFFFFFFF804BA388  R15 = FFFFFFFF804BA370
    RIP = FFFF830006C0DC8B  RSP = 000000007ACBCB50  SS  = 000000007B258210
%DECthreads bugcheck (version V3.23-001), terminating execution.
% Reason:  lckMcsLock: deadlock detected, cell = 0x23d280
% Running on OpenVMS E9.2-3() on VMware, Inc. VMware20,1, 16124Mb; 16 CPUs, pid 91186
% The bugcheck occurred at 03-OCT-2024 14:37:47.21, running image
%  HRNS$DKA400:[MID.V10_00_00_01_BAR.][EXE]MID_LCCLNK.EXE;15 in process
%  16432 (named "MID_LCCL01"), under username "MID_90_SYS". AST delivery is
%  enabled for all modes; ASTs are active in user. Upcalls are enabled.
%  Multiple kernel threads are disabled.
% The current thread sequence number is -1, at 0x23d280
% Current thread traceback:
%     0:  PC 0x8cdbf10, SP 0x7acba840, ICTX       0x7acbae98
%     1:  PC 0x8ce4e2b, SP 0x7acbaea0, ICTX       0x7acbaf38
%     2:  PC 0x8ce4d10, SP 0x7acbaf40, ICTX       0x7acbaf68
%     3:  PC 0x8cdeecb, SP 0x7acbaf70, ICTX       0x7acbafa8
%     4:  PC 0x8ce1c73, SP 0x7acbafb0, ICTX       0x7acbafd8
%     5:  PC 0x8cdf523, SP 0x7acbafe0, ICTX       0x7acbaff8
%     6:  PC 0x8cdf458, SP 0x7acbb000, ICTX       0x7acbb028
%     7:  PC 0x8d207aa, SP 0x7acbb030, ICTX       0x7acbb1b8
%     8:  PC 0x6de8a72, SP 0x7acbb1c0, ICTX       0x7acbb258
%     9:  PC 0x6ddfbcb, SP 0x7acbb260, ICTX       0x7acbb2c8
%    10:  PC 0x7b74d251, SP 0x7acbb2d0, ICTX       0x7acbb388
%    11:  PC 0x7b74cc6c, SP 0x7acbb390, ICTX       0x7acbb3a8
%    12:  PC 0x69e3765, SP 0x7acbb3b0, ICTX       0x7acbb3b8
%    13:  PC 0x720e9ef, SP 0x7acbb3c0, ICTX       0x7acbb448
%    14:  PC 0x720d7d1, SP 0x7acbb450, ICTX       0x7acbb508
%    15:  PC 0x69e3765, SP 0x7acbb510, ICTX       0x7acbb518
%    16:  PC 0x6d83402, SP 0x7acbb520, ICTX       0x7acbb568
%    17:  PC 0x6d839bf, SP 0x7acbb570, ICTX       0x7acbb618
%    18:  PC 0x6c0e5b7, SP 0x7acbb620, ICTX       0x7acbcb48
%    19:  PC 0x6c0dc8b, SP 0x7acbcb50, ICTX        0xdf078f8
%    20:  PC 0x8d31290, SP 0xdf07900, ICTX                0
%  Bugcheck output saved to pthread_dump.log.
%SYSTEM-F-IMGDMP, dynamic image dump signal at PC=FFFF830008CDC1F6, PS=0000001B
***RECURSIVE BUGCHECK IN THREAD -1***
***CANNOT CONTINUE: REPORTED STATE MAY BE INACCURATE AND INCOMPLETE***
Now to work out how to debug pthreads. :)
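
One low-tech starting point is to pull the SP values out of the bugcheck traceback and see how much stack the frames span. The sketch below parses four lines copied from the traceback above (frames 0, 18, 19 and 20). The regular expression and helper name are illustrative assumptions about the line layout, not part of any VMS tooling.

```python
import re

# Lines copied from the DECthreads bugcheck traceback in this thread.
TRACEBACK = """
%     0:  PC 0x8cdbf10, SP 0x7acba840, ICTX       0x7acbae98
%    18:  PC 0x6c0e5b7, SP 0x7acbb620, ICTX       0x7acbcb48
%    19:  PC 0x6c0dc8b, SP 0x7acbcb50, ICTX        0xdf078f8
%    20:  PC 0x8d31290, SP 0xdf07900, ICTX                0
"""

FRAME = re.compile(r"%\s+(\d+):\s+PC (0x[0-9a-f]+), SP (0x[0-9a-f]+)")


def parse_frames(text):
    """Return {frame number: (pc, sp)} for every traceback line found."""
    return {int(n): (int(pc, 16), int(sp, 16)) for n, pc, sp in FRAME.findall(text)}


frames = parse_frames(TRACEBACK)
span_0_to_19 = frames[19][1] - frames[0][1]
span_18_to_19 = frames[19][1] - frames[18][1]

print(span_0_to_19)         # 8976 bytes between frame 0 and frame 19
print(span_18_to_19)        # 5424 bytes between frames 18 and 19
print(hex(frames[20][1]))   # 0xdf07900, a different address region
```

Frame 19's PC (0x6c0dc8b) has the same low bits as the ACCVIO PC that SDA mapped to EXCEPTION, so roughly 9 KB of stack appears to sit between the exception code and the point where the bugcheck was raised. That is only a reading of the numbers, not a diagnosis.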

Sadly the process in question is 1 of 3 or so co-operating processes and relies on a whole load of external resources (global sections etc.) as well as other systems being operational.

Added in 3 hours 46 minutes 19 seconds:
So, upping the various thread stack sizes substantially seems to have resolved the problem.

Still concerned about the change in behaviour between the VMware Pro and ESXi environments. I'm assuming that the code has been ready to fail for a number of years and this has pushed it over the edge.
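
For what it's worth, the first ACCVIO in the dumps is at least consistent with a stack-related failure: the faulting address is page aligned and lies only a few KB from the stack pointer. The sketch below is a rough heuristic using the values from the first dump. The 4 KB page size is an assumption (the x86-64 hardware page), and the helper name is made up for illustration.

```python
# Values copied from the first ACCVIO dump in this thread.
FAULT_VA = 0x7ACBE000
RSP = 0x7ACBCB50
PAGE_SIZE = 0x1000  # assumption: 4 KB x86-64 hardware page


def describe_fault(fault_va, rsp, page_size=PAGE_SIZE):
    """Return (is the fault address page aligned?, signed byte distance from RSP)."""
    return fault_va % page_size == 0, fault_va - rsp


aligned, distance = describe_fault(FAULT_VA, RSP)
print(aligned, distance, hex(distance))  # True 5296 0x14b0
```

A page-aligned fault this close to RSP suggests an access running off the edge of a stack region, but it does not prove it. The second ACCVIO (virtual address 0x2C) looks more like a follow-on failure through a near-null pointer.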

So, does anyone have any good resources for profiling a POSIX-threads-based VMS application?
