Names that break databases

[Read the post]


because “null” is used to separate fields in databases themselves

But that’s not what the original article says. It says:

This is because the word “null” is often inserted into database fields to indicate that there is no data there

So there’s already some confusion here between the null character that is used to delimit strings in languages like C and the database NULL value.

I don’t deny that the people in this article have the problems described. But I find it hard to accept that this is down to deep-seated problem with computer systems. No, this is just down to individual programmers being either idiots or lazy. If your database access code can’t differentiate between the string “Null” and a database NULL, then you really need to employ some better programmers (or use better tools).


So the choice is to spend untold extra hours for maybe the rest of your life suffering because of your name, or changing your name. I know which one I would choose.


She should probably have stuck with her maiden name.


This article made more sense when it was posted to /. a few hours ago.

NULL doesn’t always break database applications. Only in a few badly engineered cases will it do that.


Article mentions a McKenzie who had trouble with databases in Japan. I’ve got him beat as my name on my passport requires 11+6+6 characters and due to having a middle name, every DB in Japan handles my name differently and most of them don’t match “my name”. It doesn’t “break” databases, its just something you learn to live with.

Its best to read Cory articles with a reduction filter, even though he frequently writes about IT, lots of details are lost on him.


Perhaps it was Undefined

One problem here is that if a variable in a programming language is null, your database connector may insert the word null in the field. Also there is the problem where the schema defines a field as NOT NULL and then someone needs to insert null values into it.
But I am not aware of ever having created a database which would not accept the word Null in a free text field like a name. In fact, you want it to do that for consistent sorting.

In my research into printers some years ago I discovered that in their SNMP structures some of them returned null for certain fields (no data), and some returned the word NULL or variations thereon. When large manufacturers can’t get it right, programmers need to be aware.
(Printer MIBs are full of bugs, including such things as some models encoding data as ASCII and others from the same manufacturer spelling out the same data coded as hex, e.g. ABCD versus 41:42:43:44. Trying to analyse this stuff is an interesting exercise in sanitising data.)


I’ve noticed Spanish names get very inconsistent treatment, because everybody’s name goes First-Middle-Last, right?


Part of my job is to perform user acceptance testing for health information systems. I have a long list of “problem” names that I use to see how well the system copes. These are not made-up names, but real ones I have collected over a period of time.
When I report to the vendors that their systems cannot cope with these (and, generally, at the beginning of testing they cannot) the typical vendor response is that these people will have to use a “simpler” variant of their names.
My response to the vendors is always the same; people have the right to be called anything they like. If your system cannot cope, then the problem is in your system, and I will not sign off on testing until you fix the defects in your system.


That sounds like a major bug in the database connector. It’s not a bug I’ve ever come across, but I can’t imagine a bug that serious remaining unfixed for very long.

That should lead to a conversation between the database designers and the programmers. If there’s an actual need for a null in a not-null field, then the database definition needs to change. If the programmer decides to work round this restriction using the string “Null” - well I refer to my previous comment about some programmers being idiots :slight_smile:


Good luck with interfacing with proprietary systems with that attitude. Unfortunately in the real world there is often no choice but to run with what they brung. If that means putting a consistent text value of “null” into a NOT NULL field rather than argue for the next three years over why you should get your way even if other things are broken - that’s life.
Distributed systems integrating the products of multiple vendors are like that.


In almost thirty years of writing systems like this, I’ve year to come across a situation like this.

Perhaps that’s another advantage of largely sticking with open source tools :slight_smile:


In order to process your name change request, please complete the following form:

ERROR: Required Field Last Name: Null


Dutch names - the van isn’t part of the last name and needs a separate field. I’ve been burned by the length of last names too - two long hyphenated names can get pretty long.


My name is a database-killing word!


Entering your name using lookalike unicode characters? Or maybe zero width spaces?

Looks like someone wrote a tool for this sort of thing:


Her maiden name was “FILENOTFOUND,” amirite?


Unfortunately many readily available systems - I’m looking at you, Mailchimp, for instance - come with predesigned field layouts with a firstname and lastname field, and people tend to assume this is the norm.
A number of my own family have names of the given name, given name, surname, surname format. How do you partition them? And then there’s Russian where there are three names but also a semi-official short form of the first name - e.g. Dmitri “Dima” Grigorevich Ulyanov. And Koreans with their preferred Western names, or Indians with westernised forms of their Indian names, e.g. Sunil who likes to be called Sunny in the office.
After years of dealing with this stuff on and off I think the only real solution is to ask people to fill in three boxes:
First box full name, Unicode
Second box how you prefer to be called when phoned or written to
Third box family, surname or name used by officials.
Third box is used for indexing.
Obviously there is a lot of overlap but when it comes to names, atomicity goes out of the window.

Rejoice in your good fortune rather than criticising people who have not been so lucky.

Getting itself to be a whole can of worms, Кпарра.


Amusingly relevant.

(for those who often dive right in without hitting the article like myself) :wink:



Microsoft Excel carried many killer bugs for 20 years. But they were rare and of 'limited" impact (i.e. either contained within a localised system, or easy to hide). Given that it’s the primary analytical tool of the majority of the financial community, you can start to see how significant it is that the bugs lasted that long.

I’m pretty sure they still endure. I’ve just learned to back off the areas where they crop up.