MOP circuit configuration fails on V9.2-1 -- %SYSTEM-W-POOLEXPF, et al.


Post by sms » Sat Jul 22, 2023 3:07 pm

   Has anyone done anything (successful) with MOP on V9.2-1?  I've had
problems.  Around here:

R86 $ tcpip show vers

  VSI TCP/IP Services for OpenVMS x86_64 Version 66.0
  on a VMware, Inc. VMware7,1 running OpenVMS V9.2-1  

R86 $ show net decnet /full
[...]
    Implementation                    = 
       {
          [
          Name = OpenVMS x86_64 ,
          Version = "V9.2-1  "
          ] ,
          [
          Name = VSI DECnet-Plus for OpenVMS ,
          Version = "V9.2-E  8-MAY-2023 13:53:12.64"
          ]
       }
[...]


   The first symptom I noticed was this, when trying to talk to a
DECserver (this was on E9.2-1):

V87 $ set host /mop an2
%CCR-F-WRONGSTATE, wrong circuit state

   Looking at the circuit revealed an empty "Functions" list:

R86 $ ncl show mop circ * all
[...]
Node 0 MOP Circuit CSMACD-0
[...]
Status

    UID                               = 58795AA6-2878-11EE-8926-AA0004006408
    Functions                         = 
       {
       }
[...]

which would normally contain more (as on a working IA64 system):

    Functions                         = 
       {
          Loop Requester ,
          Console Requester ,
          Load Server ,
          Dump Server
       }

   A manual attempt also failed:

R86 $ ncl ENABLE NODE 0 MOP CIRCUIT CSMACD-0 FUNCTION = -
 {LOAD SERVER, DUMP SERVER, CONSOLE REQUESTER, LOOP REQUESTER}

Node 0 MOP Circuit CSMACD-0
at 2023-07-22-11:00:31.331-05:00Iinf

command failed due to:
 no resources available

   That seemed a bit vague, but sys$manager:net$mop_output.log said:

MOP$LOG_MODE = 0 or undefined, Mode logging disabled
%%% Management #3  22-JUL-2023 10:13:04.08 %%%
%MOP-E-XCREVCI, failed to create VCI port CSMACD-0(1)
-SYSTEM-F-INSFMEM, insufficient dynamic memory

   Later, I noticed the following at system start:

[...]
%NET$STARTUP-I-EXECUTESCRIPT, executing NCL script SYS$SYSROOT:[SYSMGR]NET$APPLI
CATION_STARTUP.NCL;
%NET-I-LOADED, executive image NET$LOOP_APPLICATION.EXE loaded
%NET$STARTUP-I-STARTPROCESS, starting process MOP
%RUN-S-PROC_ID, identification of created process is 0000041A
%NET$STARTUP-I-EXECUTESCRIPT, executing NCL script SYS$SYSROOT:[SYSMGR]NET$MOP_C
LIENT_STARTUP.NCL;
%SYSTEM-W-POOLEXPF, Pool expansion failed -- insufficient NPAGEVIR

%NET$STARTUP-I-OPERSTATUS, DECnet-Plus for OpenVMS operational status is RUNNING
-ALL
[...]

   "help /mess POOLEXPF" suggested increasing NPAGEDYN or buying more
memory.  The VM has 16GB, and this stuff worked on VAXes which couldn't
even _spell_ "giga", so I let AUTOGEN work its will.  That seemed to
boost a few parameters to values pretty close to what I see in a working
IA64 system.  For example:

R86 $ sysgen show NPAGE
Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
--------------            -------    -------   -------    -------  ----  -------
NPAGEDYN                 23601152    4194304    163840 1879048192 Bytes      
NPAGEVIR                126410752   16777216    163840 1879048192 Bytes      
NPAGECALC                       0          1         0          2 Coded-valu 
NPAGERAD                        0          0         0         -1 Bytes      
NPAGEDYN_S2                     6          2         2       1024 MBytes     
NPAGEXPVIR_S2                   8          4         4       2048 MBytes     

All of which had no obvious effect.

   Am I missing something obvious here, or am I the first person to try
SET HOST /MOP from an x86_64 system, or what?  Is something in the
x86_64 MOP code asking for all the memory in the world?

   Note: I'm currently away from my usual LAN, leaving me out of touch
with my DECservers, so I can't do more testing than, say, those NCL
commands shown above.

Post by volkerhalle » Wed Jul 26, 2023 3:27 am

Steven,

this error is easily reproducible. Even if you have not configured MOP in the initial DECnet-Plus node configuration, you can start MOP afterwards using @sys$system:startup network mop (see NCL HELP NETWORK MOP).

This will immediately cause nonpaged pool to be expanded up to its virtual limit! Check $ SHOW MEM/POOL/FULL before starting MOP and afterwards.

Something seems to be allocating 131072-byte packets (all just filled with zeroes) in a tight loop ;-) as can be seen with:

VOLKER $ analyze/sys

OpenVMS system analyzer

SDA> show pool/ring
Pool History Ring-Buffer (NPP only)
-----------------------------------

                             (2048 NPP entries: Most recent first)

     Packet             Size               Type/Subtype           Caller's PC             Operation       IPL CPU      Time
----------------- ------------------  ---------------------- ------------------------ ------------------- --- --- -----------------
-<Alloc failure>-                  0  0/0                    00000000.8000615D        EXE$ALONPAGVAR        0   1 00B8A329.C95D476F
FFFFFFFF.88400000             131072  0/0                    00000124 Failure         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.88400000             131072  0/0                    00000124 Failure         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.88400000             131072  0/0                    00000124 Failure         EXPAND_NPP            8   1 00B8A329.C95D476F
...
FFFFFFFF.88400000             131072  0/0                    00000124 Failure         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.883E0000             131072  0/0                    00010001 Success         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.883C0000             131072  0/0                    00010001 Success         EXPAND_NPP            8   1 00B8A329.C95D476F
...
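As a rough way to summarize a listing like the one above, the sketch below tallies EXPAND_NPP entries by outcome. The regular expression and the `tally_expansions` name are assumptions based only on the few rows quoted here, not on a documented SDA output format, so treat it as illustrative.

```python
import re
from collections import Counter

# A few rows copied from the SDA listing above (not a complete dump).
SAMPLE = """\
-<Alloc failure>-                  0  0/0                    00000000.8000615D        EXE$ALONPAGVAR        0   1 00B8A329.C95D476F
FFFFFFFF.88400000             131072  0/0                    00000124 Failure         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.883E0000             131072  0/0                    00010001 Success         EXPAND_NPP            8   1 00B8A329.C95D476F
FFFFFFFF.883C0000             131072  0/0                    00010001 Success         EXPAND_NPP            8   1 00B8A329.C95D476F
"""

# Assumed row shape: packet, size, type/subtype, 8-digit status,
# Success/Failure, then the EXPAND_NPP operation name.
ROW = re.compile(
    r"^\S+\s+(\d+)\s+\d+/\d+\s+[0-9A-F]{8}\s+(Success|Failure)\s+EXPAND_NPP\b"
)

def tally_expansions(text):
    """Count EXPAND_NPP rows by outcome and sum the bytes of successful ones."""
    counts = Counter()
    grown = 0
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip headers, '...' lines and other operations
        size, outcome = int(match.group(1)), match.group(2)
        counts[outcome] += 1
        if outcome == "Success":
            grown += size
    return counts, grown

counts, grown = tally_expansions(SAMPLE)
print(dict(counts), grown)
```

A long run of successful expansions followed by nothing but failures, all with the same timestamp, is the pattern described in this post.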
No 'tuning' efforts will be able to solve this problem! Look at the time value in the last column; it will be the time at which MOP was started:

SDA> eva/time 00B8A329.C95D476F
26-JUL-2023 09:12:38.20
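
The same conversion can be sketched outside SDA, assuming the usual OpenVMS time representation (a 64-bit count of 100-nanosecond ticks since 17-NOV-1858). The helper name `vms_time_to_datetime` is made up for this illustration.

```python
from datetime import datetime, timedelta

# Base date of OpenVMS system time; ticks are 100 ns each.
VMS_EPOCH = datetime(1858, 11, 17)

def vms_time_to_datetime(quadword):
    """Convert an SDA-style 'hhhhhhhh.llllllll' hex quadword to a datetime."""
    ticks = int(quadword.replace(".", ""), 16)
    # datetime resolves microseconds, so drop the last 100 ns digit.
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)

print(vms_time_to_datetime("00B8A329.C95D476F"))
```

For the value from the ring buffer, this should agree with the SDA result above, only with more fractional digits.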

Volker.

PS: If you have configured MOP, you can disable its startup by defining the logical NET$STARTUP_MOP as FALSE (see NCL HELP NETWORK LOGICAL). Then delete this logical and manually start MOP using the above command. This will show the problem more clearly ...


Post by sodjan » Wed Jul 26, 2023 5:55 am

If there is a memory leak, raising the NPAGE values will probably just give it more memory to fill up before it crashes.
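
A back-of-the-envelope sketch of that point, assuming pool grows from NPAGEDYN toward NPAGEVIR in the 131072-byte steps seen in the SDA ring buffer earlier in the thread. The function name is hypothetical, the numbers come from the SYSGEN display in the first post, and the real expansion behaviour may differ.

```python
CHUNK = 131072  # bytes per expansion, as seen in the SDA ring buffer

def expansions_until_limit(npagedyn, npagevir, chunk=CHUNK):
    """Whole chunk-sized expansions that fit between initial size and limit."""
    return max(npagevir - npagedyn, 0) // chunk

# Default and AUTOGEN-adjusted values from the SYSGEN SHOW NPAGE output.
default_room = expansions_until_limit(4194304, 16777216)
tuned_room = expansions_until_limit(23601152, 126410752)
print(default_room, tuned_room)
```

Under these assumptions the defaults leave room for 96 expansions and the AUTOGEN values for 784. A loop that keeps allocating would use up either amount almost immediately.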

I had a similar issue with acmeldap involving a quota. Raising the quota values did nothing. In that case (perhaps a bit unrelated here) the "solution" was to specify the LDAP servers (MS AD servers) by IP address instead of host name. So it seems that the problem was in the DNS lookup part of acmeldap...


Post by dgordon » Wed Jul 26, 2023 7:44 am

My "keep trying" comment was with respect to "HELP/MESSAGE".

The original suggestion to run AUTOGEN was meant to help determine whether this was a configuration issue. A pool expansion failure comes from either a configuration problem or a pool leak. This is obviously the latter.
Executive Vice President of InfoServer Engineering at VSI.


Post by sms » Wed Jul 26, 2023 12:56 pm

> this error is easily reproducible. [...]

   Thanks for the analysis and suggestions.  I started to suspect
something similar when I saw AUTOGEN keep cranking up the parameters.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

> My "keep trying" comment was with respect to "HELP/MESSAGE"

   Understood.  So was my "I expect to. [...]".


> [...] This is obviously the latter.

   Seems so, even to me.  I'll relax, and wait for the (inevitable?)
fix.  And, when I get back to my normal LAN, I might even explore more
the use of LANCP for MOP.

   Is there (normally) any significant advantage to using LANCP for MOP,
other than its not requiring DECnet?  I've always run some kind of
DECnet, so never felt the need to learn one more networky thing.

   I might even have a dim recollection of reading something about this
use of LANCP, long ago, when it was new, but, if so, it didn't seem
advantageous then, either (to a DECnet user).


Post by wmcgaw » Thu Jul 27, 2023 8:08 pm

Hi,

I have tested both DECnet-Plus and LANCP on two X86VMS test systems, and neither works for me when making a MOP connection to a terminal server on my network. I also created an X86VMS system with Phase IV DECnet, and there I am able to use NCP to MOP connect to a terminal server successfully.

The reason for trying the LANCP MOP connection was to be able to offer you a workaround if it worked. Unfortunately, it doesn't work for you or for me.

This information has also been shared with our engineering group and the DECnet maintainer.

Best regards,
walt

Post by volkerhalle » Fri Jul 28, 2023 4:11 am

Walt,

the LANCP> CONNECT seems to be a different problem!

I can enable MOP using LANCP> SET DEV/DLL=ENABLE EIA0 and there is no nonpaged pool leak causing nonpaged pool to expand to its virtual limit!

I don't have a DECserver on my home LAN, but a

$ mc lancp CONNECT NODE 08:00:2B:11:22:33/DEVICE=EIA0
Connecting to 08-00-2B-11-22-33 via EIA0 .......
MOP V3 format selected (253 cmd size), type Control-D to disconnect

seems to at least 'work', although it does not (i.e. cannot) establish a connection to a non-existent MAC address. Interestingly, it seems to imply that it DID connect. CTRL-D does not work, but CTRL-C does.

LANCP> SET ACP/ECHO/FULL does not show any messages in SYS$MANAGER:LAN$node.LOG. But this may be expected, as it only seems to log received and sent MOP download messages...

So the LANCP CONNECT issue looks like a different problem.

Volker.

Post by martinv » Fri Jul 28, 2023 6:54 am

I can confirm Volker's observation. On our network:

LANCP> connect node 08-00-2B-BD-51-B3 /device=eib0
Connecting to 08-00-2B-BD-51-B3 via EIB0 ....
MOP V3 format selected (1534 cmd size), type Control-D to disconnect
#

Network Access SW V2.4 BL50 for DS700-08

(c) Copyright 2000, Digital Networks - All Rights Reserved

Please type HELP if you need assistance

Enter username>


Post by johngemignani » Wed Sep 20, 2023 8:00 pm

I am updating a driver and moving it to C. Mistakenly, I declared my driver$init_tables() routine to be static, so its symbol was not published in the image. The loader couldn't find it and therefore didn't call the init routine. It DID, however, proceed to create a large number of 128K allocations, which depleted NPP. When I fixed the declaration, this problem went away. I can't say whether this is related, but it seemed notable that I was seeing the same symptom.


Post by wmcgaw » Wed Oct 04, 2023 10:53 pm

Sorry, LANCP not working was my mistake (a typo in the MAC address of my terminal server). I retested the connection this evening and it works fine.

$ mcr lancp connect node 08-00-2B-B3-E7-37 /device=eib0
Connecting to 08-00-2B-B3-E7-37 via EIB0 ....
MOP V3 format selected (1534 cmd size), type Control-D to disconnect
#

Network Access SW V2.2 for DS90M (BL29B-52)

(c) Copyright 1997, Digital Equipment Corporation - All Rights Reserved

Please type HELP if you need assistance

Enter username> walt

Local>

I also noticed when I logout, ^D does not work and I need to use a ^Y to get back to the system.
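
Since the earlier failure came down to a mistyped address, a small format check may be handy before pasting an address into LANCP. This is only a sketch: `normalize_mac` and `has_dec_prefix` are hypothetical helpers, and a well-formed but wrong address will still pass.

```python
import re

# Prefix shared by every DECserver address quoted in this thread.
DEC_PREFIX = "08-00-2B"

def normalize_mac(text):
    """Return the address as upper-case hyphen-separated pairs, or raise ValueError."""
    digits = re.sub(r"[-:.\s]", "", text).upper()
    if not re.fullmatch(r"[0-9A-F]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {text!r}")
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))

def has_dec_prefix(text):
    """True if the normalized address starts with the prefix seen in this thread."""
    return normalize_mac(text).startswith(DEC_PREFIX)

print(normalize_mac("08:00:2b:b3:e7:37"), has_dec_prefix("08-00-2B-A0-F4-46"))
```

Both the colon-separated and hyphen-separated spellings used in the thread normalize to the same form, and a dropped or extra digit is rejected.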

Best regards,
Walt


Post by sms » Thu Oct 05, 2023 12:59 am

> $ mcr lancp connect node 08-00-2B-B3-E7-37 /device=eib0

   Around here, that seems to work (after a fashion):

V87 $ lancp connect node 08-00-2B-A0-F4-46 /devi = EIA0:
Connecting to 08-00-2B-A0-F4-46 via EIA0 ....
MOP V3 format selected (1534 cmd size), type Control-D to disconnect
[...]
Local> show server

DECserver 90TL V1.1C BL46-13  LAT V5.1  ROM 2.0.0  Uptime:  85 23:29:49

Address:   08-00-2B-A0-F4-46   Name:   AN2                Number:     0
[...]

   But there's a delay of around 5-10 seconds between typing characters
and seeing the echo/response.

> I also noticed when I logout, ^D does not work and I need to use a ^Y
> to get back to the system.

   No such trouble here:

Local> log
Local -020- Logged out port 17 on server AN2
[Ctrl/D]
%LANCP-I-CONTERM, Connection terminated
V87 $ 

But there was a multi-second delay for that, too.


   On my main IA64 system, all I get is:

ITS $ lancp connect node 08-00-2B-A0-F4-46 /devi = EWA0:
%LANCP-F-BADPARAM, bad parameter value
-LANCP-I-OTHERAPP, Another application may be using the device, device EWA0
%LANCP-E-CMDERROR, Error executing command

   I assume that that's because the MOP stuff is running there.  If I
get ambitious, I might try connecting a cable for EIA0: (the 100Mb/s
MP/iLO interface?), and then try using that.
