This type of attack was a real problem with internationalized domain names until browsers started to display them unmasked in the address bar.
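For anyone curious what "unmasked" looks like, here is a minimal sketch using Java's standard `java.net.IDN` class. The class name `Unmask` and the sample host (a Cyrillic "а" followed by Latin "pple.com") are made up for illustration, and real browsers apply their own, more involved display rules.

```java
import java.net.IDN;

final class Unmask {
    // Converts a host name to its ASCII-compatible (Punycode) form,
    // so labels containing non-ASCII look-alike letters show up as "xn--...".
    static String unmask(String host) {
        return IDN.toASCII(host);
    }

    public static void main(String[] args) {
        // Hypothetical look-alike: U+0430 (Cyrillic small a) + "pple.com".
        String lookAlike = "\u0430pple.com";
        System.out.println(unmask(lookAlike));
        // A pure-ASCII host comes back unchanged.
        System.out.println(unmask("apple.com"));
    }
}
```

The two hosts look nearly identical on screen, but only the first one turns into an `xn--` label.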
I’ve hit so many control-character, newline, smart-case, and alternate-glyph issues that, when in doubt, I run the code through a regex and spit out the character codes. Why, oh why are line endings still contentious?
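The regex-and-character-codes trick could look roughly like this in Java. It is only a sketch: the class name `SuspectChars`, the choice of what counts as "plain" (printable ASCII, tab, and LF), and the report format are all assumptions, not anyone's actual tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SuspectChars {
    // Assumption: anything other than printable ASCII, tab, or LF is worth flagging.
    private static final Pattern SUSPECT = Pattern.compile("[^\\x20-\\x7E\\t\\n]");

    // Returns one entry per flagged character: its char offset and code point.
    static List<String> find(String source) {
        List<String> report = new ArrayList<>();
        Matcher m = SUSPECT.matcher(source);
        while (m.find()) {
            int codePoint = source.codePointAt(m.start());
            report.add(String.format("offset %d: U+%04X", m.start(), codePoint));
        }
        return report;
    }

    public static void main(String[] args) {
        // Sample with an en dash, a CR before the LF, and curly quotes.
        String sample = "int a = b \u2013 c;\r\nString s = \u201Chi\u201D;\n";
        find(sample).forEach(System.out::println);
    }
}
```

Carriage returns get flagged too, which is handy when the real culprit is a stray Windows line ending.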
Only then would I murder the sonofabitch that did that.
Some chat programs already implement this “feature”, ostensibly to prettify text, so if someone sends you a line or two of code via chat, you’ll end up with an em dash or en dash instead of a plain hyphen, or pretty quotation marks instead of the standard ones.
It’s a pain in the ass, but usually not super hard to track down, since the compiler will give you a line number and the clue that there’s an unrecognised symbol.
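Undoing the prettifying can be sketched like this. The class name `Unprettify` is made up, and the mapping is a guess at what a chat client did (em dash from a double hyphen, en dash from a single one); it will also rewrite curly quotes that were meant to be inside string literals, so treat it as a quick cleanup, not a safe transform.

```java
final class Unprettify {
    // Maps common "prettified" punctuation back to plain ASCII.
    // Assumption: the em dash started life as "--" and the en dash as "-".
    static String toAscii(String text) {
        return text
                .replace('\u201C', '"')   // left double quotation mark
                .replace('\u201D', '"')   // right double quotation mark
                .replace('\u2018', '\'')  // left single quotation mark
                .replace('\u2019', '\'')  // right single quotation mark
                .replace("\u2014", "--")  // em dash
                .replace('\u2013', '-');  // en dash
    }

    public static void main(String[] args) {
        String pasted = "String s = \u201Chi\u201D; int d = a \u2013 b;";
        System.out.println(toAscii(pasted));
    }
}
```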
“• Be fired, and then killed”
Always a crowd pleaser!
˙suɐƃıuɐuǝɥs ǝsǝɥʇ ʃʃɐ ɹoɟ ʃǝɐɥɔıɯɹǝɥʇo@ ǝɯɐʃq I
killed by firing!
It would give us SAS coders a huge headache, too!
Porting a file from mainframe to unix to windows (pick your combo) can be a real bear on line endings!
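A rough sketch of taming line endings after a port, in Java. The class and method names are hypothetical, and treating U+0085 (NEL) as a line break is an assumption about how EBCDIC newlines may arrive after transcoding; it will not match every mainframe setup.

```java
final class LineEndings {
    // Normalizes CRLF, lone CR, and NEL (U+0085) to a single LF.
    static String toUnix(String text) {
        return text
                .replace("\r\n", "\n")   // Windows
                .replace('\r', '\n')     // lone CR (classic Mac OS)
                .replace('\u0085', '\n'); // NEL, assumed to come from EBCDIC data
    }

    // Normalizes first, then expands every LF to CRLF.
    static String toWindows(String text) {
        return toUnix(text).replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        String mixed = "one\r\ntwo\rthree\u0085four\n";
        // Show the result with escapes so the endings are visible.
        System.out.println(toUnix(mixed).replace("\n", "\\n"));
        System.out.println(toWindows(mixed).replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Replacing CRLF before lone CR matters: doing it the other way round would turn each Windows line ending into two line breaks.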
I don’t understand. Why would a compiler accept Unicode to begin with? Are there programming languages that require Hiragana or Tamil now?
“Wait, what? A 34 bit Honeywell processor and the OS only supports EBCDIC, but not all of it?? …I need to smoke something”
String literals? Comments?
[quote=“Boundegar, post:10, topic:68052, full:true”]
I don’t understand. Why would a compiler accept Unicode to begin with? Are there programming languages that require Hiragana or Tamil now?
[/quote]
Why would a compiler not accept Unicode? It’s just a text format, and pretty much everything can read it by default. They’d have to specifically exclude it, and why would they do that?
Here’s why:
Why should this be a reason to forbid Unicode in programming languages? Because it would be more convenient for someone who doesn’t need characters outside the restricted ASCII table?
Well, my EE days were long, long ago, but I don’t recall any language requiring exotic characters, ever. Not even C, with its rat bastard ?: operator.
Again: string literals. Comments. These are things that generally contain text in human-readable languages, which includes all kinds of Unicode.
public class Unicödĕ {
    public static void main(String[] args) {
        String düdelü = "Gar gräßlicher Kram";
        System.out.println(düdelü);
    }
}
Where’s the problem? Most languages support Unicode (and not a moment too soon, imo; the standard is old). If a compiler did not accept Unicode at all, l10n and i18n would be impossible.
You obviously never used APL.
Something like dd if=/dev/urandom count=1 bs=10k is probably a working APL program, summoning Cthulhu or so.