DECnet phase V protocol bug in NSP


Post by pkoning » Mon Sep 04, 2023 8:13 pm

I'm running my E9.2-1 x86 system in Phase V mode (mostly to remind me what that looks like). I tried to have a Phase IV node do remote network management access, i.e., use the NICE protocol. That fails; apparently Phase V doesn't include support for NICE V4.0. Either that, or I haven't said the magic word to enable it (is there one?).

What's interesting is that the reject comes in the form of a message that violates the NSP protocol spec -- which says that disconnect or connect reject data is an I-16 field, i.e., a counted string. What VMS Phase V sends instead is the data without the preceding count. In the case of the NICE reject, the data is the expected version number, 5.0.0. So I see a reject message with the usual fixed fields followed by 3 bytes: 0x05 0x00 0x00. What should be there is 4 bytes: 0x03 0x05 0x00 0x00, with the 03 being the byte count of the data that follows. Interestingly enough, in disconnects that have no data I do see the correct encoding (there still is a length field containing 0).
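
To make the encoding difference concrete, here is a minimal Python sketch of a counted-string field. The helper name `encode_image_field` is made up for illustration, and the 16-byte limit is my reading of what the "16" in I-16 means; nothing here is taken from the VMS implementation.

```python
def encode_image_field(data: bytes, max_len: int = 16) -> bytes:
    """Encode data as a counted string: one count byte, then the data.

    max_len=16 assumes that I-16 means "at most 16 data bytes".
    """
    if len(data) > max_len:
        raise ValueError(f"data is {len(data)} bytes, limit is {max_len}")
    return bytes([len(data)]) + data


# The NICE version number 5.0.0 carried as reject data.
version = bytes([0x05, 0x00, 0x00])

conforming = encode_image_field(version)  # count byte followed by the data
observed = version                        # what the capture shows: no count byte
empty = encode_image_field(b"")           # no data still yields a count of 0

print("conforming:", conforming.hex(" "))  # 03 05 00 00
print("observed:  ", observed.hex(" "))    # 05 00 00
print("empty:     ", empty.hex(" "))       # 00
```

The empty case matches what I see in disconnects without data: a single zero count byte.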

Most amusing is that you can reproduce this just on the VMS system. Run NCP, and try to do a "remote" connection to this system. The result is a console log message showing a protocol error.

NCP>set exe nod pkvms2
%%%%%%%%%%%  OPCOM  28-AUG-2023 16:19:04.92  %%%%%%%%%%%
Message from user SYSTEM on PKVMS2
Event: Remote Protocol Error from: Node LOCAL:.PKVMS2 NSP Local NSAP 490029AA000
40041A420 Remote NSAP 490029AA00040041A420,
        at: 2023-08-28-16:19:04.928-04:00Iinf
        Reject Cause=Invalid Message Format, 
        Erroneous Transport PDU='3811001200000005'H
        eventUid   9B7499F2-45BE-11EE-BD90-AA00040041A4
        entityUid  7D071144-45BE-11EE-BD57-AA00040041A4
        streamUid  B7A1055C-F737-11ED-8801-AA00040041A4

I can also capture this when testing from another node, by using Wireshark. That gives me basically the same information but it shows the entire packet:

0000   aa 00 04 00 01 a4 aa 00 04 00 41 a4 60 03 20 00   ..........A.`. .
0010   81 26 00 00 aa 00 04 00 01 a4 00 00 aa 00 04 00   .&..............
0020   41 a4 00 01 00 00 38 20 ea 0a 00 00 00 05 00 00   A.....8 ........
0030   00 00 00 00 00 00 00 00 00 00 00 00               ............

(The Ethernet payload ends at offset 002f -- the line starting at 0030 is padding.)
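
As a rough illustration of why a conforming receiver rejects this, the sketch below parses the last 10 bytes of the payload (offsets 0026 through 002f) as an NSP disconnect message. The field layout used here (1-byte message flags, 2-byte destination and source link addresses, 2-byte reason, then the counted data field, all little-endian) and the starting offset are my reading of the capture, and `parse_disconnect` is a hypothetical helper, not anything from a real DECnet stack.

```python
import struct


def parse_disconnect(pdu: bytes) -> dict:
    """Parse the fixed fields and the counted data field of a disconnect PDU."""
    if len(pdu) < 8:
        raise ValueError("PDU too short for fixed fields plus count byte")
    # "<BHHH": flags (1 byte), dst link, src link, reason (2 bytes each).
    msgflg, dst, src, reason = struct.unpack_from("<BHHH", pdu, 0)
    count = pdu[7]
    data = pdu[8:]
    if count > 16 or count > len(data):
        raise ValueError(
            f"invalid message format: count {count}, {len(data)} data bytes present"
        )
    return {"msgflg": msgflg, "dst": dst, "src": src,
            "reason": reason, "data": data[:count]}


# Bytes at offsets 0026..002f of the capture above.
observed = bytes.fromhex("3820ea0a000000050000")
# The same message with the missing count byte (03) put back.
repaired = bytes.fromhex("3820ea0a00000003050000")

try:
    parse_disconnect(observed)
except ValueError as err:
    print("observed:", err)

print("repaired:", parse_disconnect(repaired)["data"].hex(" "))
```

With the count byte missing, the first data byte (05) gets read as the count, and only two bytes follow it, which is consistent with the "Invalid Message Format" reject cause in the OPCOM event.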

This isn't a major problem, given that it happens on a connect reject, so communication isn't possible anyway. But it would be good for the system to generate conforming messages. I haven't attempted to construct test cases for other related NSP messages like connect confirm and disconnect initiate, which both have the same encoding for optional data. If connect confirm has the same bug, that would be a bigger deal.
