Bit as a key



Post by kwfeese » Wed Mar 29, 2023 6:22 pm

We have a file that uses a single bit in one field to determine what to do in a particular situation with an account.

I want to skip all the accounts where the 5th bit is set to a certain value and was wondering if it's possible to create an alternate key on the specific bit in the field I'm trying to bypass.

The field is one character in length but I'm only focused on the 5th bit.



Post by joukj » Thu Mar 30, 2023 2:39 am

Most programming languages have this option. Below is an example in Fortran (which I know best):

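character*1 ch
integer*1 i1
! ch and i1 share storage, so i1 holds the byte value of the character
equivalence (i1, ch)
open(unit=1, file='your file')
read(1, '(a)') ch
! btest numbers bits from 0, so the 5th bit is position 5-1
if (btest(i1, 5-1)) then
   ! ... action for 5th bit set
else
   ! ... action for 5th bit not set
endif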



Post by sms » Thu Mar 30, 2023 2:48 am

> We have a file [...]

   And you're reading it how, exactly?  Some high-level language
built-ins?  RMS system services?  Other?

> [...] alternate key on the specific bit in the field [...]

   I know nothing, but I don't see how you'd specify it.  According to
the "VSI OpenVMS Record Management Services Reference Manual":

      https://vmssoftware.com/docs/VSI_RMS_Ref_Manual_23Jul19.pdf

      7.10. RAB$B_KSZ Field

      The key size (KSZ) field contains a numeric value equal to the
      size, in bytes, of the record key pointed to by the RAB$L_KBF
      field. 

   How would you specify a key size of one bit, "in bytes"?

   Knowing what I know, I'd guess that you've compacted your data too
much to make that kind of selection easy.

      "See the Guide to OpenVMS File Applications for more information
      about accessing indexed records."


> Most programming languages have this option. [...]

   You can examine a bit in any language where you can do arithmetic,
but that's not using it as an RMS key.  (Which is how I read the
original question.)



Post by tim.stegner » Thu Mar 30, 2023 8:35 am

It's possible, but probably not the way you'd normally think to do it. As the previous poster stated, the smallest piece of a record you can define as a key is a single byte, in your case the character field you referenced. If it's really important to be able to access those records via that bit, and not just have it be part of other queries, the solution I see is to create a new field in the record that is populated with the value of that fifth bit only. This means commensurate changes to the application to keep the field up to date.
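As a rough illustration of that derived field (a sketch only, not taken from anyone's application), the Python below computes a one-byte flag from the 5th bit and appends it to a record. The `b"Y"`/`b"N"` flag values, the record layout, and the names `flag_byte` and `with_flag_field` are all assumptions made up for the example; the real change would live in whatever code writes the file.

```python
# Hypothetical sketch: derive a separate one-byte field from the 5th bit.
# Bits are counted from 1 as in the thread, so the 5th bit has value 16.
BIT5_MASK = 1 << 4


def flag_byte(status: int) -> bytes:
    """Return b'Y' if the 5th bit of the status byte is set, else b'N'."""
    return b"Y" if status & BIT5_MASK else b"N"


def with_flag_field(record: bytes, status_offset: int) -> bytes:
    """Append the derived flag byte to a record (assumed layout)."""
    return record + flag_byte(record[status_offset])


# Example: an 8-byte account id followed by the one-byte status field.
print(with_flag_field(b"ACCT0001\x10", 8))
```

The new byte can then be defined as an alternate key, because it has a byte offset and a length of one byte.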


Post by arne_v » Thu Mar 30, 2023 5:25 pm

As several posters have already stated, index keys are defined by a byte offset and a byte length, so you cannot simply add an index on a single bit.

I see 3 options:

A) Live with the cost of a sequential scan of the entire file when you need to search for those bit values - if the bit in question is on/off 50%/50% and randomly distributed, then the overhead of a sequential scan is probably not that big (everything will need to be read from disk to memory anyway)

B) Convert the file - for example, change the 1 byte (with 8 bit values) to 8 bytes and create indexes on the bytes that you need indexes on - that is a clean solution, but all applications using the file will need to change the file definition and the handling of this data from bits to bytes (changing all applications may or may not be a problem)

C) Create a new separate file with the primary key from the original file and 8 bytes with indexes as above. Existing applications run as always; you then run an index generator that populates the new file based on the original file, and the application that needs the fast access can use the new file - the huge drawback is that the index is not automatically updated and can be inaccurate if not repopulated before use
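
To sketch what the conversion in B) and the generator in C) would have to compute (an illustration under assumptions, not a tested conversion tool), the Python below expands one status byte into 8 bytes, one character per bit, and builds a side-file record from a primary key. The `'0'`/`'1'` encoding, the lowest-bit-first order, and the names `expand_bits` and `sidecar_record` are hypothetical.

```python
# Hypothetical sketch of options B and C: one byte per bit instead of one bit.

def expand_bits(status: int) -> bytes:
    """Expand a byte into 8 bytes of b'0'/b'1', lowest bit first."""
    return bytes(ord("1") if status & (1 << i) else ord("0") for i in range(8))


def sidecar_record(primary_key: bytes, status: int) -> bytes:
    """Build a record for the separate file in option C (assumed layout)."""
    return primary_key + expand_bits(status)


# The 5th bit (value 16) ends up in the 5th byte of the expanded field.
print(expand_bits(0x10))
print(sidecar_record(b"ACCT0001", 0x10))
```

Each of the 8 bytes then has its own byte offset, so any of them can be given an index.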
Arne
arne@vajhoej.dk
VMS user since 1986



Post by hein » Sun Apr 16, 2023 11:22 am

>> I want to skip all the accounts where the 5th bit is set to a certain value

Yeah, it seems we all made those stupid bit-saving design choices back in the day, and now we are stuck with them.

>> and was wondering if it's possible to create an alternate key on the specific bit in the field I'm trying to bypass.

NO can do.

>> The field is one character in length but I'm only focused on the 5th bit.

Are bits 6, 7 or 8 in use? If not, you are in luck, as you could use the byte value 16 as a cut-off point. If they are, you could consider multiple ranges, like 48 (32+16) through 63 when bit 6 is in use.
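
To make the range arithmetic concrete, here is a small Python sketch (not from the thread) that lists which byte values have a given bit set. It assumes the field is compared as an unsigned byte value and that bits are counted from 1 as above, so the 5th bit is 0-based index 4 with value 16. The function name `ranges_with_bit_set` and the `width` parameter (how many low bits are actually in use) are made up for illustration.

```python
# Hypothetical sketch: byte-value ranges in which a given bit is set.

def ranges_with_bit_set(bit_index: int, width: int = 8):
    """Return inclusive (low, high) ranges of values below 2**width
    that have the 0-based bit `bit_index` set."""
    size = 1 << bit_index
    return [(start, start + size - 1)
            for start in range(size, 1 << width, 2 * size)]


# Only bits 1-5 in use: one range, with 16 as the cut-off.
print(ranges_with_bit_set(4, width=5))
# Bit 6 also in use: a second range from 48 (32+16) through 63.
print(ranges_with_bit_set(4, width=6))
# All 8 bits in use: 8 separate ranges.
print(len(ranges_with_bit_set(4)))
```

With all 8 bits in use the single cut-off turns into 8 separate key ranges, which is why it matters whether the higher bits are used.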

But back to the main question: what is the goal? Faster processing? Fewer resources while processing? How many rows are we talking about? If it is not hundreds of thousands, then it's unlikely to be worth the effort, and if it is hundreds of thousands, then an alternate key with just two values is utterly useless unless combined with a 'null-key-value' for which NO index entries are made.

It's all about the distribution - in my experience, even with a 90 - 10 distribution it is better to just read everything and discard 90% than to read 10% by key. This is because reading BY PRIMARY KEY typically gets 10+, possibly 100, rows for a single IO and will likely prime read-ahead XFC/storage caches. Reading by alternate key is likely followed by a random access for each row.
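
A back-of-the-envelope version of that comparison, as a Python sketch with invented numbers: it assumes 50 rows per IO for the primary-key scan (somewhere between the 10 and 100 mentioned above) and one random IO per matching row via the alternate key, and it ignores index bucket reads and caching entirely. The function name `estimate_ios` and its parameters are hypothetical.

```python
# Hypothetical sketch: rough IO counts for a full scan vs. alternate-key reads.
import math


def estimate_ios(total_rows: int, match_percent: int, rows_per_io: int):
    """Return (ios_for_full_scan, ios_via_alternate_key).

    Assumes one IO fetches `rows_per_io` rows in primary-key order,
    and that each row found via the alternate key costs one random IO.
    """
    full_scan = math.ceil(total_rows / rows_per_io)
    via_alternate_key = total_rows * match_percent // 100
    return full_scan, via_alternate_key


# 1,000,000 rows with 10% matching, 50 rows per IO on the scan.
print(estimate_ios(1_000_000, 10, 50))
```

Under these assumptions the full scan needs 20,000 IOs against 100,000 for the keyed reads, so the key only starts to win when the matching fraction is very small.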

Now let's say that this is a tasks file and the bit signifies 'done'. In such a case there are likely thousands, if not millions, of tasks 'done' over time and maybe just tens or hundreds of tasks still needing to be done. In that case a key IS useful, but you'll have to convert the application structure and make it a BYTE, not a bit - unlikely to be an acceptable solution after 20? 30? 40? years in production.
If you do redesign, recompile, rebuild, convert, and deploy, then make sure that the byte value chosen for 'done' is declared as a NULL KEY value in RMS / FDL to avoid getting 'duplicate chains'. Those are disastrous for insert performance, often causing an additional IO for every 1000 rows already present - yes, I've seen cases where adding a row in a 10 million row file caused 50,000 IOs and took many seconds to find the place to insert that next 'done' row pointer at the end of the chain of all 'done' row pointers (7 bytes each).

Hope this helps some,
Hein.


Post by arne_v » Sun Apr 16, 2023 1:18 pm

The typical main application on a VMS system is old (main application as in the application that is the reason the VMS system is there - as opposed to the various secondary applications that have been added over time because the VMS system was there and needed various integrations/functionality).

My gut feeling is that the majority is from 1980-1995.

That was a different time: much slower CPUs, much less memory and much smaller disks. XX MHz with XX MB RAM and a bunch of XXX MB disks (as opposed to N cores at X GHz, XXX GB RAM and a bunch of X TB disks of today).

Choices were made to work in that environment.

When more powerful HW arrived, many of those choices should have been revisited and the code changed.

But getting funding to redo something that already works is not always easy.

So we have these cases.
Arne
arne@vajhoej.dk
VMS user since 1986
