crash: INTERVAL_TIMER + 000023C4 / line 52943: jmpb -2F


Post by neliasen » Mon Sep 04, 2023 4:05 am

Hello
My OpenVMS 9.2-1 system hangs or crashes after a few days. Running "analyze/crash" on the dump then gives me the following:

Crashdump Summary Information:
------------------------------
Crash Time:         1-SEP-2023 23:45:23.60
Bugcheck Type:     CPUSANITY, CPU sanity timer expired
Node:              VMS2    (Standalone)
CPU Type:          QEMU Standard PC (Q35 + ICH9, 2009)
VMS Version:       V9.2-1
Current Process:   NULL
Current Image:     <not available>
Failing PC:        FFFF8300.07DF3D04    SYSTEM_PRIMITIVES_4_MIN+8008DD04   (INTERVAL_TIMER + 000023C4 / line 52943)
Failing PS:        00000000.00000000
Module:            SYSTEM_PRIMITIVES_4_MIN    (Link Date/Time: 26-JUL-2023 22:09:03.27)
Offset:            8008DD04

Boot Time:          1-SEP-2023 04:48:25.00
System Uptime:               0 18:56:58.60
Crash/Primary CPU: 0./0.
System/CPU Type:   0000
Saved Processes:   13
Pagesize:          8 KByte (8192 bytes)
Physical Memory:   7915 MByte (2621440 PFNs, discontiguous memory)
Dumpfile Pagelets: 0 blocks
Dump Flags:        olddump,writecomp,errlogcomp
Dump Type:         raw,full,shared_mem
EXE$GL_FLAGS:      init,bugdump
Paging Files:      1 Pagefile and 0 Swapfiles installed

and "clue crash" gives me: 

Crashdump Summary Information:
------------------------------
Failing Instruction:
INTERVAL_TIMER + 000023C4 / line 52943:  jmpb    -2F

I am a bit lost here. Any ideas?

Post by imiller » Mon Sep 04, 2023 7:16 am

In SYS$ERRORLOG there should be a file called CLUE$*.LIS (the name includes the node name and the date and time of the crash). Raise an issue on the Support Portal and attach this file together with the output of running VSI$SUPPORT.COM (this DCL procedure is available, zipped, on the Support Portal).
Ian Miller
[ personal opinion only. usual disclaimers apply. Do not taunt happy fun ball ].


Post by volkerhalle » Tue Sep 05, 2023 6:41 am

In an OpenVMS SMP system, each CPU monitors the sanity timer of its neighboring CPU.
If that other CPU is stuck, hung, or halted, its sanity timer expires and a CPUSANITY crash is taken.
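
A toy model of this mutual monitoring, as a minimal Python sketch. The `Cpu` class, the `SANITY_TICKS` value, and the `tick` function are hypothetical names chosen for illustration. The sketch assumes that each live CPU resets its own timer and counts down its neighbor's; it is not the actual OpenVMS implementation.

```python
from dataclasses import dataclass

SANITY_TICKS = 3  # hypothetical timeout, measured in monitoring ticks


@dataclass
class Cpu:
    cpu_id: int
    running: bool = True
    sanity: int = SANITY_TICKS


def tick(cpus):
    """Run one monitoring tick; return the IDs of CPUs whose timer expired."""
    expired = []
    n = len(cpus)
    for i, cpu in enumerate(cpus):
        if not cpu.running:
            continue  # a hung CPU neither resets its own timer nor watches anyone
        cpu.sanity = SANITY_TICKS  # a live CPU resets its own timer
        neighbor = cpus[(i + 1) % n]
        if neighbor is cpu:
            continue  # single-CPU system: nobody to monitor
        neighbor.sanity -= 1  # and counts down its neighbor's timer
        if neighbor.sanity <= 0:
            expired.append(neighbor.cpu_id)  # this is where a crash would be taken
    return expired
```

As long as both CPUs keep running, each one resets its timer before it can reach zero. Once one CPU stops, the other counts its timer down to zero and reports it.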

Look at the crash with SDA> CLUE CONFIG to determine the state of the other CPUs.

Volker.



Post by neliasen » Tue Sep 05, 2023 8:09 am

I get the following:

$ anal/crash  SYS$SYSTEM:SYSDUMP.DMP

OpenVMS system dump analyzer
...analyzing an x86-64 interleaved memory dump...

Dump taken on  1-SEP-2023 23:45:23.60 using version V9.2-1
CPUSANITY, CPU sanity timer expired

SDA> clue config

System Configuration:
---------------------
System Information:
System Type   QEMU Standard PC (Q35 + ICH9, 2009)       Primary CPU ID 0.
Cycle Time    0.44 nsec (2295 MHz)                      Pagesize       8192 Byte
%CLUE-W-NOSYMBIOS, cannot access SMBIOS table


System Processor Configuration:
-------------------------------
CPU ID         0                         CPU State    rc,pa,pp,cv,pv,pmv,pl
CPU Type       unknown 00000000.00000000
Halt PC        00000000.00000000         Halt PS      00000000.00000000
Halt code      Bootstrap or Powerfail    Halt Req.    Default, No Action
Slot VA        FFFFFFFF.8CF3F000         CPUDB VA     FFFFFFFF.82000000
Package        Unknown                   Core         Unknown
Thread id      Unknown                   Cothread id  None
FW Usage       Unknown                   CPU die      Unknown
ACPI CPU id    00000000.00000000         Serial Num
LID            00000000.00000000         CFG flags    Unknown

CPU ID         1                         CPU State    bip,pa,pp,cv,pv,pmv,pl
CPU Type       unknown 00000000.00000000
Halt PC        00000000.00000000         Halt PS      00000000.00000000
Halt code      Bootstrap or Powerfail    Halt Req.    Default, No Action
Slot VA        FFFFFFFF.8CF3F430         CPUDB VA     FFFFFFFF.8D35A000
Package        Unknown                   Core         Unknown
Thread id      Unknown                   Cothread id  None
FW Usage       Unknown                   CPU die      Unknown
ACPI CPU id    00000000.00000001         Serial Num
LID            00000000.00000001         CFG flags    Unknown


Post by volkerhalle » Tue Sep 05, 2023 8:23 am

So the state of CPU 1 is the problem. It shows:

CPU State bip,pa,pp,cv,pv,pmv,pl

In a running system, it would show:

CPU State rc,pa,pp,cv,pv,pmv,pl
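
This comparison can be done mechanically on saved CLUE CONFIG output. The following is a minimal Python sketch, not an SDA feature: `cpu_states` and `suspect_cpus` are hypothetical helper names, and the regular expression assumes the two-column "CPU ID ... CPU State ..." layout shown in this thread.

```python
import re

# Matches lines such as:
# "CPU ID         1                         CPU State    bip,pa,pp,cv,pv,pmv,pl"
CPU_LINE = re.compile(r"CPU ID\s+(\d+)\s+CPU State\s+(\S+)")


def cpu_states(clue_config_text):
    """Map each CPU ID to its list of state flags."""
    return {
        int(m.group(1)): m.group(2).split(",")
        for m in CPU_LINE.finditer(clue_config_text)
    }


def suspect_cpus(states):
    """Return CPU IDs that lack the 'rc' flag or carry the 'bip' flag."""
    return sorted(
        cpu for cpu, flags in states.items()
        if "rc" not in flags or "bip" in flags
    )


sample = (
    "CPU ID         0                         CPU State    rc,pa,pp,cv,pv,pmv,pl\n"
    "CPU ID         1                         CPU State    bip,pa,pp,cv,pv,pmv,pl\n"
)
print(suspect_cpus(cpu_states(sample)))
```

Run against the two CPU State lines from the crash dump, the sketch singles out CPU 1.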

According to LIB.REQ:

macro SLOT$V_BIP = 264,0,1,0 %; ! Bootstrap in progress
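
To illustrate how such a field macro can be read, here is a minimal Python sketch. It assumes the four values follow the usual BLISS block-field convention of byte offset, bit position, field size in bits, and sign-extension flag. The `extract_field` helper and the test buffer are hypothetical; this is not how SDA reads the per-CPU slot.

```python
def extract_field(block: bytes, offset: int, position: int, size: int, sign: int = 0) -> int:
    """Extract a bit field from a little-endian block of bytes.

    offset   -- byte offset of the field's base within the block (assumed meaning)
    position -- bit position relative to that byte
    size     -- field width in bits
    sign     -- nonzero to sign-extend the result
    """
    start = offset * 8 + position
    value = int.from_bytes(block, "little")
    field = (value >> start) & ((1 << size) - 1)
    if sign and size > 0 and field >> (size - 1):
        field -= 1 << size  # interpret the top bit as a sign bit
    return field


# Values copied from the LIB.REQ macro quoted above.
SLOT_V_BIP = (264, 0, 1, 0)

# Hypothetical slot contents: 272 zero bytes with bit 0 of byte 264 set.
slot = bytearray(272)
slot[264] = 0x01
print(extract_field(bytes(slot), *SLOT_V_BIP))
```

Under that reading, "bip" is a single-bit flag at bit 0 of byte 264 in the slot.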

Volker.



Post by neliasen » Tue Sep 05, 2023 10:31 am

Hello
That sounds very reasonable.
But why do the two CPUs differ?
I have not "tinkered" with special flags for the CPUs (nor with anything else, for that matter!)


Post by volkerhalle » Tue Sep 05, 2023 10:43 am

The two CPUs can be in different states. For example, CPU 1 could have been halted after a software error.

What is the state of the second CPU when your system is running normally?

$ ANALYZE/SYSTEM
SDA> CLUE CONFIG
SDA> EXIT

Do you record the console output on OPA0 (maybe with a PuTTY log file)? There may be some messages when the system gets into this state.

It could also be some problem with the VM and the second vCPU. Do you have log files to check?

Volker.



Post by neliasen » Wed Sep 06, 2023 6:33 am

Running "analyze /system" and then "clue config" now shows that the CPU states are identical:

System Processor Configuration:
-------------------------------
CPU ID         0                         CPU State    rc,pa,pp,cv,pv,pmv,pl
CPU Type       unknown 00000000.00000000
Halt PC        00000000.00000000         Halt PS      00000000.00000000
Halt code      Bootstrap or Powerfail    Halt Req.    Default, No Action
Slot VA        FFFFFFFF.8CF3F000         CPUDB VA     FFFFFFFF.82000000
Package        Unknown                   Core         Unknown
Thread id      Unknown                   Cothread id  None
FW Usage       Unknown                   CPU die      Unknown
ACPI CPU id    00000000.00000000         Serial Num
LID            00000000.00000000         CFG flags    Unknown

CPU ID         1                         CPU State    rc,pa,pp,cv,pv,pmv,pl
CPU Type       unknown 00000000.00000000
Halt PC        00000000.00000000         Halt PS      00000000.00000000
Halt code      Bootstrap or Powerfail    Halt Req.    Default, No Action
Slot VA        FFFFFFFF.8CF3F430         CPUDB VA     FFFFFFFF.8D35A000
Package        Unknown                   Core         Unknown
Thread id      Unknown                   Cothread id  None
FW Usage       Unknown                   CPU die      Unknown
ACPI CPU id    00000000.00000001         Serial Num
LID            00000000.00000001         CFG flags    Unknown

and "analyze /crash" still shows the same crashdump summary as in my first post (CPUSANITY, crash time 1-SEP-2023 23:45:23.60, failing PC at INTERVAL_TIMER + 000023C4 / line 52943).


Post by volkerhalle » Wed Sep 06, 2023 6:42 am

The state of both CPUs in the running system looks normal.

The question is how CPU 1 got into that 'bip' state and, probably, stopped updating its sanity timer.

Could you try to log the output of the OPA0: console to a log file, and then look at the messages preceding the next CPUSANITY crash?

Is there any log file from QEMU?

Volker.



Post by neliasen » Thu Sep 07, 2023 7:30 am

I'll try to get OPA0: logged to a file, and also to get a log from QEMU (none found right now).
I just thought that having different flags set for the CPUs was, in normal cases, not possible!
