Names that break databases

Transliteration is not even needed to create problems: my last name contains a ß, and because I was not fast enough to correct him, the clerk in the Swiss bank opened my account with a name containing a B.

10 Likes

They did not. That isn’t the problem. The problem is delimiters and special characters, which is a very hard problem because ideally you want to be able to store every possible character in your database, so as soon as you have a special delimiter like ` you then have to have an escaped way of storing it. And data has to be conveniently entered by keyboard and there is only so much you can do with a keyboard.
In the XKCD example the issue is the failure of application code to separate data from SQL syntax. It is not a problem with SQL itself. Any experienced programmer sanitises the hell out of anything anybody uploads over an Internet connection, because the Internet user base can be taken as an intersection of the set of idiots and the set of complete bastards. With Java it helps a lot to use PreparedStatements and parameterised queries but it is still a good idea to sanitise the inputs.
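To make the "separate data from SQL syntax" point concrete, here is a minimal sketch using Python's standard-library sqlite3 module rather than Java's PreparedStatement. The table and function names are made up for illustration. The idea is the same: the name travels as a bound parameter and is never parsed as SQL.

```python
import sqlite3

def add_student(conn, name):
    # The ? placeholder binds the name as data; it is never parsed as SQL.
    conn.execute("INSERT INTO students (name) VALUES (?)", (name,))

def student_names(conn):
    # Return every stored name in insertion order.
    return [row[0] for row in conn.execute("SELECT name FROM students ORDER BY id")]

# Hypothetical schema, in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

add_student(conn, "Robert'); DROP TABLE students;--")
add_student(conn, "O'Brien")
print(student_names(conn))
```

Both names are stored exactly as typed, quote marks and all, and the table is still there afterwards.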
Incidentally, the Apache Derby database, which is pure Java, allows you to store Java classes in the database and extract, load and use them in your application code. With a little care in programming you can do this with perfect safety, though if you were stupid enough to allow a user to upload serialized classes the result could be an enormous fustercluck.

6 Likes

Ha. When I started work at one company we had a product which had expensive control panels, all made of screen printed aluminium. I wasn’t too popular when I pointed out that the German ones would all have to be scrapped because the cursor controls were labelled Fein and GroB. The department concerned had people fluent in English, French, Russian and Italian - but not German.

6 Likes

When I worked in a library, we had a book that was a list of German nouns and their gender. The title was Der, Die, Das?
By my reading, according to the rules the title should have been encoded as Die, Das? Der.

3 Likes

I used to work for a couple of direct mail companies. The standard software (Code1, anyone?) would always kick out two-letter names as errors (being fewer than 3 characters), causing us to pay for (and lose) all the Asian names like Li, Hu, Na, and so on. I can’t believe how long this had been allowed to go on before I made a workaround for it.
Probably still happens at other companies.
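For what it's worth, the kind of check that avoids this is tiny. A rough sketch, assuming the only thing worth rejecting is an empty field (the function name is hypothetical, and this is not how Code1 or any particular product works):

```python
def is_acceptable_name(name):
    # Reject only names that are empty once surrounding whitespace is removed.
    # No minimum length: "Li", "Hu" and "Na" are complete names.
    return bool(name.strip())

for candidate in ["Li", "Hu", "Na", "   "]:
    print(repr(candidate), is_acceptable_name(candidate))
```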

4 Likes

I know of one American living in Germany who couldn’t sign up for some US website because her address contained more than five consonants in a row.

1 Like

Is it really that hard to transliterate “underscore”?

2 Likes

Well, there is always a choice. And absolutely no database ever made prohibits the storing of the word null in a text field. This is about custom or semi-custom applications, not the database vendor itself.

[quote] If that means putting a consistent text value of “null” into a NOT NULL field rather than argue for the next three years over why you should get your way even if other things are broken - that’s life.
[/quote]

I don’t doubt this bug exists, but it is pretty uncommon (no database I have worked with has this limitation, and I have seen some stupid ones) and mind-numbingly stupid. This is not on the scale of ordinary “hack to get thing working.” Even in cases where you have to make a workaround for a bad database schema, you would pick a sentinel like “” that would at least be quite uncommon. The other thing is, when you make a “special value” like null or N/A or NFN (no first name, commonly used on legal forms for people who don’t have a first name) to get around a limitation of your database schema, that string is then a LEGAL value, which is the whole point of the exercise. To designate the name null to mean “name unavailable but our stupid schema doesn’t allow that”, and then not allow null to be inserted, is beyond stupid.
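A quick way to see that the database itself has no trouble here, sketched with Python's built-in sqlite3 (the table and helper names are invented for the example): the four-letter string is an ordinary text value, and only a genuinely missing value trips the NOT NULL constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT NOT NULL)")

# The string "Null" is just text, so a NOT NULL column accepts it.
conn.execute("INSERT INTO people (last_name) VALUES (?)", ("Null",))

def rejects_missing_value(conn):
    # Python's None is bound as SQL NULL, which the constraint refuses.
    try:
        conn.execute("INSERT INTO people (last_name) VALUES (?)", (None,))
    except sqlite3.IntegrityError:
        return True
    return False

print(rejects_missing_value(conn))
print(conn.execute("SELECT last_name FROM people").fetchall())
```

If an application confuses the two, that happens in the code sitting between the form and the database, not in the column.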

all mapped in Unicode code points

Um… like who? And what would it take to accommodate those people’s names? I mean, sorry, but… how would you even… as a software developer…? I would support them to get their alphabet included into Unicode, or whatever, if they were an indigenous people or something like that. But I couldn’t exactly rewrite a database server…? I remember when Prince renamed himself to that symbol thing. Mr squiggly symbol, Sir, I don’t question your right to name yourself whatever made-up symbol you like, but if you want to be in my database, you must give me some Unicode representation. quora.com: is Prince’s symbol a Unicode character? - (answer: no, and Unicode has a rule against such things).

1 Like
40. People have names.

At that point it’s not the programmer’s problem. Fuck you, your name is John Doe.

and not the one from the punk band with a similarly problematic name

5 Likes

I work in software in the U.S. and had a hell of a time trying to email one “Pankaj Kumar”. There were 8, out of our 7k employees.

2 Likes

Some people use names that contain garbage characters, so as not to be coopted into an illegal fraud upon the Constitution.

Case in Point:

:JUDGE: David-Wynn: Miller:

Apparently, he reads Boing Boing

Hi, David!

5 Likes

[BOING-BOING-READS LIKE: BABBLE,
BIOUS, LIES, MISLEADING-STATEMENTS, GUESSING, OPINIONS, MISINFORMATION
&: LAZY-MINDS THAT DONOT-KNOW-SYNTAX

I think he’s just pissed off about all the markdown.

6 Likes

Der, die, das, die? (masculine, feminine, neuter, plural?)

(I’m sure one of our German happy mutants will correct us if we’re wrong!)

1 Like

Is the autocorrect that shows up when I try to enter my email address something like this? I don’t know whose boneheaded decision it was to make the email field something that needs spellchecked, but it pisses me off every time.

Well, people can generally figure out plurals… But the rules for filing titles (which date back to Dewey and catalog cards) are that you ignore initial articles in titles. So the card for The Pokey Little Puppy is written as Pokey Little Puppy, The. This is so that the T’s aren’t inordinately huge, and so that people don’t have to look under both T and A if they can’t remember whether the title is The Girl Who Played With Fire or A Girl Who Played With Fire. So the MARC format* (which is what books are described in) has an _indicator_ before the title field telling the computer how many initial characters to ignore when alphabetizing the heading.

*MAchine Readable Cataloging https://www.loc.gov/marc/bibliographic/
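As a rough illustration of how that indicator gets used (in MARC 21 it is the second indicator of the 245 title field, "nonfiling characters"), here is a small Python sketch. The record layout and function name are simplified stand-ins, not a real MARC parser.

```python
def filing_key(title, nonfiling_chars):
    # Skip the leading article (and its trailing space), then compare case-insensitively.
    return title[nonfiling_chars:].casefold()

# (title, nonfiling character count) pairs, as a cataloger might record them.
records = [
    ("The Pokey Little Puppy", 4),         # skip "The "
    ("The Girl Who Played With Fire", 4),  # skip "The "
    ("A Girl Who Played With Fire", 2),    # skip "A "
]

sorted_titles = [title for title, count in sorted(records, key=lambda r: filing_key(*r))]
print(sorted_titles)
```

Both Girl titles file under G regardless of the article, and the puppy files under P. The count is recorded per title by a person, which is why a title like Der, Die, Das? depends on whether the cataloger decides the leading Der is an article or the subject of the book.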

5 Likes

Unfortunately, her maiden name can only be expressed in UTF-16, not UTF-8.


I had a co-worker who was asked for his name in an oral visa application, and he answered Shaishiv, and wondered why they never got around to asking him his family name. Then he got his passport as “Fnu, Shaishiv”, which meant his diploma and work IDs had to match. It annoyed him to no end, as it stood for “First Name Unknown.”

4 Likes

[BOING-BOING-READS LIKE: BABBLE, BIOUS, LIES, MISLEADING-STATEMENTS, GUESSING, OPINIONS, MISINFORMATION &: LAZY-MINDS THAT DONOT-KNOW-SYNTAX & FAILURE OF THE LEARNING-CORRECT-SYNTAX-GRAMMAR FOUND IN THE STYLES-MANUALS OF THE WORLD-TREATIES ON THE “SYNTAX-GRAMMAR”. FOR THE LABOR-YEARS~14-TO~55-PAID-ALL-STATE-FED-SS-TAXES.

You know, I think that he may be disappointed in Boing-Boing.

3 Likes

Brings to mind Francis Dec

http://www.bentoandstarchky.com/dec/intro.htm

Hashtag, notallprogrammers

2 Likes