Debugging on X86 (without compiler listings)

OpenVMS x86 Field Test questions, reports, and feedback.

Topic author
mgdaniel
Valued Contributor
Posts: 62
Joined: Mon Feb 28, 2022 5:16 pm
Reputation: 0
Location: Adelaide, South Australia
Status: Offline
Contact:

Post by mgdaniel » Fri Nov 11, 2022 4:29 pm

A pointer to documentation regarding debugging specifically for X86 would be useful.

This is V9.2. With no native compilers, and with the XTOOLS C compiler's /LIST /MACHINE producing only a glorified source listing and no code listing, is there a way to match an address against a module location in an X86 LINK /MAP? For example, with the following fatal error

%SYSTEM-F-ACCVIO, access violation, reason mask=07, virtual address=000000000374C000, PC=FFFF830007A1B0BC, PS=0000001B

  Improperly handled condition, image exit forced.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 000000000374C000
                                 FFFF830007A1B0BC
                                 000000000000001B
    Register dump:
    RAX = 00000000036724CB  RDI = 00000000036724CB  RSI = 00000000FFFFFFFF
    RDX = 0000000000000000  RCX = 00000000000D9B30  R8  = 000000000367163C
    R9  = 000000000013C900  RBX = 000000000000001B  RBP = 000000007AD06080
    R10 = 000000007AD06310  R11 = FFFFFFFF89C04516  R12 = 00000000207F12A7
    R13 = 0000000000000018  R14 = 000000007AD062F0  R15 = 0000000000000000
    RIP = FFFF830007A1B0BC  RSP = 000000007AD06028  SS  = 000000000000001B
8< snip 8<
%HTTPD-F-EXIT, 11-NOV-2022 12:24:40, WASD:80 %X1000000C
-HTTPD-F-TRACE, FFFF830005F96445
-HTTPD-F-TRACE, FFFF8300066D776F
-HTTPD-F-TRACE, FFFF8300066D6152
-HTTPD-F-TRACE, FFFF830005F96445
-HTTPD-F-TRACE, FFFF8300062D2722
-HTTPD-F-TRACE, FFFF8300062D2CDF
-HTTPD-F-TRACE, FFFF83000617CE78
-HTTPD-F-TRACE, FFFF830006187515
-HTTPD-F-TRACE, FFFF830007A1B0BC
-HTTPD-F-TRACE, 000000008014A19F   <<<------- this one would be an obvious candidate
-HTTPD-F-TRACE, FFFF830006364711
-HTTPD-F-TRACE, FFFF830006304FFB
-HTTPD-F-TRACE, 000000008018C8B8
-HTTPD-F-TRACE, 000000008018CA79
-HTTPD-F-TRACE, FFFF830007CA3711
-HTTPD-F-TRACE, FFFF830007C58361
I would go to the LINK map and find the module containing the above address

$CODE$                        Q-00000000 00000000 00000000                OCTA  4 CON,REL,LCL,  SHR,  EXE,NOWRT,NOVEC,  MOD
                                80000000 80441925 00441926 (    4462886.)
8< snip 8<
                FILE          Q-00000000 00000000 00000000                OCTA  4
                                80141780 8014E4E6 0000CD67 (      52583.)
In this case, 8014A19F lies between the start of the module (80141780) and the end of the module (8014E4E6).
Use the offset 8014A19F - 80141780 to get the module code address, in this case 8A1F.
Find that address in the compiler /MACHINE listing.
Use the line number cross reference to zoom back into the source listing.
(I have a tool that essentially does these steps.)

Without compiler machine code listings this obviously can't be done.

What am I missing? (apart from machine code listings)

What can be done when the software is not amenable to the OpenVMS Debugger?

Process dumps and ANALYZE /PROCESS have nothing to offer

X86VMS$ ana /proc WASD_ROOT:[http$server]HTTPD_SSL.DMP

         OpenVMS x86-64 Debug64 Version V9.2-001

%DEBUG-E-ANPKERNOTAVAIL, analyze process dump kernel of the VMS debugger not available
DBG> show process
%DEBUG-W-NOPROCDEBUG, there are currently no processes being debugged
DBG>
possibly because of the

%PROCDUMP-W-BADLOGIC, internal logic error detected at PC 00000000.7B4F2A81
-PROCDUMP-E-NOREAD, no access to location 00000000.00177280, length 00000000.00000098
-PROCDUMP-E-REQUESTED, requested from PC 00000000.7B4F067C
snipped from the above ACCVIO example.

Any suggestions gratefully considered.


Re: Debugging on X86 (without compiler listings)

Post by mgdaniel » Mon Nov 14, 2022 8:05 am

A forum lurker (unregistered) took the time to email me personally with some very useful information, which I'll post here (with permission):
As on IA64, especially for the C++ compiler, you can disassemble the
object module. That will show code offsets in the module and usually
line numbers of the source listing. Just append the output of
ANALYZE/OBJECT/DISASSEMBLE to the usual listing file and you have both
in one place.

And as long as you don't have the symbolic debugger, you can use DELTA.
You probably know how to get into it: DEFINE LIB$DEBUG SYS$SHARE:DELTA.
It's not as comfortable as the usual debugger but with the map, listings
and machine code, you should be able to look at the code of the "obvious
candidate" and set a breakpoint at or before the shown PC. When at the
breakpoint, looking at the process from SDA might help as well.

Works well for creating the equivalent of a /LIST /MACHINE. Thanks.
Last edited by mgdaniel on Mon Nov 14, 2022 8:51 am, edited 3 times in total.


Re: Debugging on X86 (without compiler listings)

Post by mgdaniel » Mon Nov 14, 2022 12:08 pm

Prior to receiving the above email I had solved the ACCVIO issue, shall we say, via the tried-and-true approach of adding a swag of debug statements and, over several iterations, homing in on the problem section of code.

It turned out to be a memset(dest,value,size); where size, on one occasion in hundreds of thousands of transactions, drifted to -1 (a negative number). Now I know it's declared as an unsigned size_t and should not be given a negative integer. The behaviour may even be undefined (or implementation dependent) if so. However...

Worked without a murmur on Alpha and IA64 (even back in VAX days). My guess is the RTL ignored the negative number and continued on its merry way. On X86 any negative size results in an ACCVIO. This may trip up more than me and so it has been reported to the VSI Service Platform as an issue. The immediate solution...

if (ptr - buf > 0) memset (ptr, 0, ptr - buf);
Last edited by mgdaniel on Mon Nov 14, 2022 7:08 pm, edited 1 time in total.

arne_v
Master
Posts: 299
Joined: Fri Apr 17, 2020 7:31 pm
Reputation: 0
Location: Rhode Island, USA
Status: Offline
Contact:

Re: Debugging on X86 (without compiler listings)

Post by arne_v » Thu Nov 17, 2022 9:27 pm

Mystery.

I am not on VMS x86-64 yet. What does this give?

#include <stdio.h>
#include <string.h>

void test(char *p, size_t len)
{
   printf("%d\n", sizeof(p));
   printf("%d\n", len);
   printf("%u\n", len);
   printf("%p\n", p);
   printf("%p\n", p + len);
}

int main()
{
   char buf[1];
   char *p = buf;
   int len = -1;
   test(p, len);
   memset(p, 0, len);
   return 0;
}
Arne
arne@vajhoej.dk
VMS user since 1986


Re: Debugging on X86 (without compiler listings)

Post by mgdaniel » Thu Nov 17, 2022 10:32 pm

As expected Arne.

X86VMS$ mcr []ARNE_V.EXE
4
-1
4294967295
7ACA18CF
7ACA18CE
%SYSTEM-F-ACCVIO, access violation, reason mask=07, virtual address=000000007ACA2000, PC=FFFF830007A1B0BC, PS=0000001B

  Improperly handled condition, image exit forced by last chance handler.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 000000007ACA2000
                                 FFFF830007A1B0BC
                                 000000000000001B
    Register dump:
    RAX = 000000007ACA18CF  RDI = 000000007ACA18CF  RSI = 00000000FFFFFFFF
    RDX = 0000000000000000  RCX = 0000000000000730  R8  = 0000000000023A00
    R9  = 0000000000000000  RBX = 000000007FFCF898  RBP = 000000007ACA18A0
    R10 = FFFFFFFFFFFFFFFE  R11 = 0000000000000001  R12 = 000000007ACA19A8
    R13 = 0000000000000000  R14 = 000000007AEBA170  R15 = 000000001C274606
    RIP = FFFF830007A1B0BC  RSP = 000000007ACA1848  SS  = 000000000000001B
FWIW: Yesterday I received a comment from the Service Platform.
Subject: [Service Platform] (SPS-783) C memset() difference in behaviour on incorrect size argument
To: mark.daniel@wasd.vsm.com.au
From: noreply@vmssoftware.com
Date: Thu, 17 Nov 2022 13:37:14 +0000

8< snip 8< added 1 comment on the Service Platform support issue that you created.
Service Platform / SPS-783
C memset() difference in behaviour on incorrect size argument <https://sp.vmssoftware.com/#/org/issues/SPS-783>
2022-11-17 07:37:13 8< snip 8< commented:

Mark,

It appears Development is on the trail of the problem; evidently not so much memset() itself, but an OTS$ routine underlying and/or working in conjunction. I’ll let you know when we have a fix available.
8< snip 8<
Mark.

Re: Debugging on X86 (without compiler listings)

Post by arne_v » Fri Nov 18, 2022 9:54 am

Sounds like you will get a fix.

But from a very purist point of view it could be argued that the new x86-64 behavior is correct and that it is the old behavior that is a bug.

If one provides a length as an unsigned value of 0xFFFFFFFF then one has really asked memset to initialize 4 GB - 1 byte. So in the best or worst tradition of C it does as it is being told.

:-)


Re: Debugging on X86 (without compiler listings)

Post by mgdaniel » Fri Nov 18, 2022 11:40 am

arne_v wrote:
But from a very purist point of view it could be argued that the new x86-64 behavior is correct and that it is the old behavior that is a bug.

If one provides a length as an unsigned value of 0xFFFFFFFF then one has really asked memset to initialize 4 GB - 1 byte. So in the best or worst tradition of C it does as it is being told.

Sure, if the implementation was using C. But the Service Platform post states:

It appears Development is on the trail of the problem; evidently not so much memset() itself, but an OTS$ routine underlying and/or working in conjunction.

Since it is implemented using an RTL routine (another private email has suggested the documented OTS$MOVE5 via an undocumented OTS$FILL call point), this is of more significant concern due to potentially more widespread RTL use in the OS and/or other codebases. It also casts a shadow over any general initialisation / parameter checking code in that routine and/or RTL.

Re: Debugging on X86 (without compiler listings)

Post by arne_v » Fri Nov 18, 2022 7:20 pm

Interesting.

OTS$MOVE5 is pretty obvious.

#include <stdio.h>
#include <string.h>

#include <ots$routines.h>

int main(int argc, char *argv[])
{
   char buf[10];
   int i;
   memset(buf, 0x01, sizeof(buf));
   for(i = 0; i < sizeof(buf); i++) printf("%02X", buf[i]);
   printf("\n");
   ots$move5(0, NULL, 0x02, sizeof(buf), buf);
   for(i = 0; i < sizeof(buf); i++) printf("%02X", buf[i]);
   printf("\n");
   return 0;
}
And the documentation for OTS$MOVE5 says:

"Number of bytes of data to move. The longword-int-source-length argument is a signed longword that contains this number. The value of longword-int-source-length may range from 0 to 2,147,483,647. "

so -1 should fail in some way.

More interesting is why it has changed. On VAX it must have been Macro-32 code with a loop around MOVC5. But since the behavior changed on x86-64, maybe the Macro-32 was not compiled on the newer platform but instead converted to native assembler, and on x86-64 someone did not check the limit on that argument.

Pure speculation.


Re: Debugging on X86 (without compiler listings)

Post by mgdaniel » Fri Nov 18, 2022 7:56 pm

arne_v wrote:
Fri Nov 18, 2022 7:20 pm
Interesting.

OTS$MOVE5 is pretty obvious.
8< snip 8<
And the documentation for OTS$MOVE5 says:

"Number of bytes of data to move. The longword-int-source-length argument is a signed longword that contains this number. The value of longword-int-source-length may range from 0 to 2,147,483,647. "

so -1 should fail in some way.
8< snip 8<
Pure speculation.
Absolutely is. Perhaps even the reputed (but undocumented) OTS$FILL entry point initialisation.

IA64$ mcr []arne_v_2
01010101010101010101
02020202020202020202

X86VMS$ mcr []arne_v_2
01010101010101010101
02020202020202020202

Re: Debugging on X86 (without compiler listings)

Post by arne_v » Sat Nov 19, 2022 9:47 am

ots$fill is undocumented but the internet knows everything.

:-)

#include <stdio.h>
#include <string.h>

#include <ots$routines.h>

void ots$fill(void *addr, size_t len, unsigned char b);

int main(int argc, char *argv[])
{
   char buf[10];
   int i;
   memset(buf, 0x01, sizeof(buf));
   for(i = 0; i < sizeof(buf); i++) printf("%02X", buf[i]);
   printf("\n");
   ots$move5(0, NULL, 0x02, sizeof(buf), buf);
   for(i = 0; i < sizeof(buf); i++) printf("%02X", buf[i]);
   printf("\n");
   ots$fill(buf, sizeof(buf), 0x03);
   for(i = 0; i < sizeof(buf); i++) printf("%02X", buf[i]);
   printf("\n");
   return 0;
}
