LibXML2 issue

Post by rodprince » Thu Mar 28, 2024 11:24 am

This issue exists with both the HPE I64 C compiler and the VSI x86 C compiler. I have to assume it's actually working as intended and that my limited C skills are not really up to the task, which is why I am hoping someone with OpenVMS C knowledge has the answer.

We use LibXML2 to parse XML and build the DOM tree so we can extract values. When we get malformed XML data, LibXML2 will attempt to write the error to the xmlLastError variable declared in globals.c. The definition of xmlLastError is straightforward. It's just:

xmlError xmlLastError; //just a typical type and variable name declaration

xmlLastError is never written to in globals.c. The only reference to it is via a routine that passes back the address of the variable. When an error is detected in the XML data, the parsing logic gets the address of xmlLastError and attempts to write the error to it. Since xmlLastError has been declared by either the compiler or linker to be read-only, this results in an access violation, which is not an elegant way to handle malformed XML data.

Currently, to get around this issue, I have been altering the open source globals.c code to be:

xmlError xmlLastError = {0};

I am assuming the compiler/linker is optimizing something and thinking to itself, "this variable is never written to, so I can just park it in read-only land," and that the initialization stops that.
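
For reference, a minimal sketch of the two definition forms being compared, together with the accessor pattern described above. All names here (demoError, demoGetTentative, demoRecord and so on) are hypothetical stand-ins, not libxml2 code, and the struct is cut down. In standard C both forms yield a zero-initialized, writable object, so any difference in behavior would have to come from how the toolchain places the storage rather than from the language itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for libxml2's xmlError; the real struct has more fields. */
typedef struct {
    int  code;
    char message[32];
} demoError;

/* Tentative definition: no initializer, as in the unmodified globals.c. */
demoError demoTentative;

/* Definition with an initializer, as in the local workaround. */
demoError demoInitialized = {0};

/* Accessors that hand out the address, mirroring the routine described above. */
demoError *demoGetTentative(void)   { return &demoTentative; }
demoError *demoGetInitialized(void) { return &demoInitialized; }

/* Record an error through a pointer, the way the parsing logic would. */
int demoRecord(demoError *err, int code, const char *msg)
{
    assert(err != NULL);
    err->code = code;
    strncpy(err->message, msg, sizeof err->message - 1);
    err->message[sizeof err->message - 1] = '\0';
    return err->code;
}
```

On a conforming implementation, calling demoRecord() on either accessor's result should succeed. If only the tentative form faults, that points at storage placement (PSECT attributes, extern model) and not at the C source.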

Is there a compiler switch/setting or some other way to force the compiler/linker not to place the variable into the program's read-only section? Is there any proper way to tell the compiler/linker that this variable is not read-only, e.g. in the link .opt file or something similar? I would like to leave the open source code unchanged and instead just alter our local build code for the package.

Rod Prince


Re: LibXML2 issue

Post by hb » Thu Mar 28, 2024 12:05 pm

It looks like xmlLastError is global. So a linker map (/CROSS/FULL) should show that it is placed in a read-only program segment, and most likely which PSECT it was in, generated by which source module. So please post (a link to) the map file.

If xmlLastError isn't global (and if its address is passed to another routine, it doesn't have to be global), the traceback information can also help to see where this variable really is.

The linker does not optimize anything like this. It just collects the PSECTs (here the data) that have the same attributes (here read-only), as set by the compilers, into ELF program segments. Although PSECT attributes can be changed with linker options, I suggest using them only as a last resort.


Re: LibXML2 issue

Post by craigberry » Thu Mar 28, 2024 1:26 pm

You didn't say what version of libxml2. Current sources have it declared like so in globals.h:

Code: Select all

XMLPUBFUN xmlError * XMLCALL __xmlLastError(void);
#ifdef LIBXML_THREAD_ENABLED
#define xmlLastError \
(*(__xmlLastError()))
#else
XMLPUBVAR xmlError xmlLastError;
#endif

XMLPUBVAR is a macro that includes the extern keyword, which may well have an impact on the problem you are experiencing. My suggestion would be to compile with /list/show=(include,expansion) and make sure you see the expanded definition that the compiler sees and that it contains the storage class specifier you would need for the variable to actually be global.
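
To illustrate what the LIBXML_THREAD_ENABLED branch of that header does, here is a small sketch with the same macro shape. The names (sketchError, sketchLastErrorPtr, sketchLastError) are hypothetical and the struct is reduced to one field. The real __xmlLastError() may also choose per-thread storage, which is left out here.

```c
#include <assert.h>

/* Hypothetical, cut-down error record; not the real xmlError layout. */
typedef struct {
    int code;
} sketchError;

/* The actual storage. With the macro below, callers never name it directly. */
static sketchError sketchStorage;

/* Stand-in for __xmlLastError(): returns the address of the storage. */
sketchError *sketchLastErrorPtr(void)
{
    return &sketchStorage;
}

/* Same shape as the LIBXML_THREAD_ENABLED branch quoted above. */
#define sketchLastError (*(sketchLastErrorPtr()))

/* Every use of the macro name is a read or write through the pointer. */
int sketchSetAndGet(int code)
{
    sketchLastError.code = code;
    assert(&sketchLastError == sketchLastErrorPtr());
    return sketchLastError.code;
}
```

With this branch active, the header supplies no extern declaration for the variable itself. Whatever definition appears in the .c file is then the only thing the compiler sees for that name, which is why the /list/show output is worth checking.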


Re: LibXML2 issue

Post by rodprince » Thu Mar 28, 2024 4:33 pm

We are currently building 2.10.3 on x86.

2.10.3 does have the XMLPUBVAR definition, but until you posted the code snippet, I had not thought it through. We are building with LIBXML_THREAD_ENABLED (not sure why, but we are). This means globals.c does not see the XMLPUBVAR clause when it includes globals.h. It's just a simple global variable declared in globals.c with nothing describing it.

This just came up again for me because we are in the process of rebuilding our VMS x86 image with the latest 9.2-2 release. Essentially, I was going through the process we used for x86 9.2-1 testing. I just wanted to be ready to start testing the next version of the Pascal compiler once it drops.

When we build the code without the ={0}; addition, we get the following in the shared image .map file:

xmlLastError 00004A14 00004A47 00000034 ( 52.) LONG 2 OVR,REL,GBL,NOSHR,NOEXE, WRT,NOVEC,NOMOD
<Linker> 00004A14 00004A47 00000034 ( 52.) LONG 2

With the ={0} addition, it is:

xmlLastError 000027A0 000027D3 00000034 ( 52.) OCTA 4 OVR,REL,GBL,NOSHR,NOEXE, WRT,NOVEC, MOD
globals 000027A0 000027D3 00000034 ( 52.) OCTA 4 Initializing Contribution

Rod


Re: LibXML2 issue

Post by hb » Thu Mar 28, 2024 7:38 pm

From the image point of view, the ELF segments aren't really different. In both cases the segment is WRT. NOMOD indicates a demand-zero segment, something the linker sets up. MOD indicates a file-based segment, contributed by the "globals" module, something the compiler generates. The only other difference I see is the alignment of xmlLastError: on a longword boundary if uninitialized and on an octaword boundary if initialized. Whether or why that can make a difference, I do not see.

This seems to be C++ code, or at least the C++ compiler. It seems the source is compiled with the default extern model. Can you try /EXTERN_MODEL=STRICT_REFDEF? I would expect the alignment of the uninitialized xmlLastError to change. It should be octaword aligned. Again, whether this makes the code work, I have no idea.
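
The alignment difference can be checked directly against the start addresses in the two map extracts (00004A14 and 000027A0). The sketch below is only a worked check of that arithmetic. The helper isAligned and the enum names are made up for illustration, using the usual OpenVMS sizes of 4 bytes for a longword and 16 bytes for an octaword, which also match the "LONG 2" and "OCTA 4" power-of-two columns in the map.

```c
#include <assert.h>
#include <stdint.h>

/* Byte sizes of the OpenVMS alignment names used in the map. */
enum { LONGWORD = 4, OCTAWORD = 16 };

/* Returns 1 if addr is a multiple of boundary (a power of two), else 0. */
int isAligned(uint64_t addr, unsigned boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr & (uint64_t)(boundary - 1)) == 0;
}
```

Applied to the map values, 0x4A14 is longword aligned but not octaword aligned, while 0x27A0 is octaword aligned, which agrees with the LONG and OCTA entries.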

Do you have a traceback from the access violation?

You mentioned that the same problem shows up on IA64. Do you have a map (extract) showing where xmlLastError is? And traceback output?
