Struggling with long index fields – HASH is the key

Every now and then you come across a limitation in ACL. This happened to me the other day when I needed to relate a couple of tables using a descriptive text field. Who would have thought that an INDEX field of more than 247 characters would be an issue? Well they are and, unfortunately, my text [key] fields were over 1,000 characters in length. No good.

HASH, Ian

My first attempt to deal with the problem was to create a computed field based on a SUBSTRING of the text field, setting the length of the computed field to 247 characters. This dealt with the initial problem, but unfortunately created a new problem – my key field was truncated and no longer unique so it didn’t work effectively to relate my two tables. Back to the drawing board.

Fast forward a few days and I’d just been brushing up on one or two of the newer features of ACL v11. One in particular caught my eye – the HASH function. It sounded pretty cool for those who needed to protect private or confidential information. There’s definitely going to be someone out there who needs it, but not me just yet.

The HASH function takes any type of input field and converts it into a 64 character coded hash value:

  • 10 characters will HASH into 64
  • 100 characters will HASH into 64
  • 1000 characters will HASH into 64 – ahah!

Then it hit me – MORE THAN 247 CHARACTERS + HASH = SOLUTION to my ‘long key’ problem.

To be clear, if I HASH my long text field, the result will be a character field with a length of 64. No matter how long (or short) my field is, HASH will give me 64 characters out the other end. Perfect!

So I created a new computed field in each of my tables:

  • k_Table1 = HASH(Text_Field1)
  • k_Table2 = HASH(Text_Field2)

and was able to relate them straightaway without any glitches. I didn’t even have to worry about ‘key harmonisation’ because both keys were created in the same format by the HASH function. Brilliant!

That wasn’t quite the end of the story, because there were one or two ‘trivialities’ that I had to consider:

  • Need to watch out for blanks or spaces – ie, when there is no matching key on the ‘child’ table in a relation or either the primary or secondary table in a join. Any blank or string of spaces will evaluate to a 64 character hash value so it may be necessary to set this to a variable in order that you can test for blank key values:

Command, Display HASH

  • HASH will ignore any leading or trailing spaces. It’s still good practice to use ALLTRIM to tidy up character strings, but you don’t have to do so explicitly before applying the HASH function
  • The output from the HASH function is always a string of 64 characters, regardless of whether the value being hashed is character, numeric, date or logical. This means that any pair of keys created using the HASH function will always be of matching datatype and length and therefore never need to be further harmonised.
  • The HASH-ed key field is really ‘for internal use only’ – the underlying data is effectively encrypted and therefore can’t be interpreted in any output except as a link or index to the underlying data record

It really is that simple. Next time you have an index field of more than 247 characters, don’t look for a work around that may or may not do the trick, simply HASH it!

Wednesday, June 1, 2016 In: Hot Topics Comments (None)

Contact us

3 Appleton Court, Calder Park
Wakefield, WF2 7AR

+44 (0) 1924 254 101

enquiries@dataconsulting.co.uk

Mailing List

Subscribe to our newsletter.