I once answered a question on Stack Overflow regarding surnames and regular expressions. I thought this might be worthy of a note here as well.
The questioner wanted to know how to write a regular expression to transform surnames with irregular capitalisations, i.e. names with prefixes such as ‘Mac’ or ‘O’’.
Quite simply, this is not possible, as there is no reliable rule that holds 100% of the time.
Consider the following names:
- Mrs Macey
- Mr Opal
- Mr Macdonald
They are all correct. Even Mr Macdonald who doesn’t capitalise his ‘D’s. Our regex would churn out:
- Mrs MacEy
- Mr O’pal
- Mr MacDonald
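To make the failure concrete, here is a minimal sketch of the kind of rule-based regex that produces the outputs listed above. The function name `naive_fix_surname` and both rules are hypothetical, written only to mirror those outputs; they are not taken from the original question or from any real system.

```python
import re


def naive_fix_surname(name: str) -> str:
    """Hypothetical 'half-baked' normaliser, for illustration only."""
    # Start from a plain capitalised form, e.g. "MACEY" -> "Macey".
    name = name.strip().capitalize()
    # Rule 1 (assumed): capitalise the letter that follows a leading "Mac" or "Mc".
    name = re.sub(
        r"^(Ma?c)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        name,
    )
    # Rule 2 (assumed): treat a leading "O" as an "O’" prefix.
    name = re.sub(r"^O([a-z])", lambda m: "O’" + m.group(1), name)
    return name


for surname in ("Macey", "Opal", "Macdonald"):
    print(surname, "->", naive_fix_surname(surname))
```

Each rule looks sensible in isolation, yet every one of the three correct names above comes out wrong, and no amount of extra rules can tell a Macdonald from a MacDonald.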
We have to be careful when dealing with surnames – these could be our customers after all. And there is little that is more insulting than having your own name churned up and spat out by some half-baked regex. Especially as this may be done by several such half-baked regexes at different companies. You may feel like you want to change your name just so they get it right!
It’s as bad as name mispronunciation. I feel for all the people named Cockburn (pronounced ‘Coeburn’) or McLeod (‘McCloud’).
Unfortunately, this is all too common. Some systems are programmed only to store uppercase characters, in which case you are scuppered, and you do have to rely on some magical but flawed algorithm.
Others seek to perform some sort of user-input validation or correction. In any such case, the validation system should allow the user to input what they intended and not tell them how to think.
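As a rough sketch of what "allow the user to input what they intended" could look like in code, the hypothetical function below (the name `accept_surname` is my own) tidies up whitespace and nothing else. It assumes the only thing a system can safely fix is stray spacing, and it leaves capitalisation, apostrophes and hyphens exactly as typed.

```python
def accept_surname(raw: str) -> str:
    """Hypothetical input handler: keep the surname as the user typed it."""
    # Collapse runs of whitespace and trim the ends; do not touch the casing.
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("surname must not be empty")
    return cleaned


print(accept_surname("  Macdonald "))   # stays "Macdonald", not "MacDonald"
print(accept_surname("van  der Berg"))  # lowercase "van" is preserved
```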
And always make it really, really easy in your systems and processes to make minor corrections to a surname. This is a human being after all!
I still get letters from Scottish Gas addressed to Mr G Wiseman, and yet they know my first name is James. I’ve tried to change it, but I just go through numerous levels of call centre and then get told that I need to provide it in writing, and that email is not good enough. Sigh!