This type of attack was a real problem with the internationalized domain names until the browsers started to display them unmasked in the address bar.
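For anyone who wants to see what "unmasked" looks like, here is a minimal sketch using the JDK's `java.net.IDN` class. The class name `IdnCheck` and both helper methods are made up for illustration, and this is only an assumption about the kind of check a browser might do, not how any particular browser implements it.

```java
import java.net.IDN;

public class IdnCheck {
    // Hypothetical helper: the ASCII (Punycode) form of a host name.
    public static String unmask(String host) {
        return IDN.toASCII(host);
    }

    // Hypothetical helper: true when the ASCII form differs from what was typed,
    // i.e. at least one label needed Punycode encoding.
    public static boolean isInternationalized(String host) {
        return !unmask(host).equalsIgnoreCase(host);
    }

    public static void main(String[] args) {
        // The first letter is CYRILLIC SMALL LETTER A, not the Latin "a".
        String lookalike = "\u0430pple.com";
        System.out.println(unmask(lookalike));
        System.out.println(isInternationalized(lookalike));
        System.out.println(isInternationalized("apple.com"));
    }
}
```

The lookalike host comes back with an `xn--` prefix, which is hard to mistake for the real name.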
I've hit so many control character, newline, smart case, and alternate glyph issues that when in doubt I run code through a regex and spit out the character codes. Why, oh why are line endings still contentious?
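A rough sketch of that kind of check, done as a plain loop over code points rather than a regex. `SuspectChars` and `findSuspects` are hypothetical names, and the rule for what counts as "suspect" (anything other than printable ASCII, tab, or line feed) is an assumption you would adjust for your own code base.

```java
import java.util.ArrayList;
import java.util.List;

public class SuspectChars {
    // Hypothetical helper: list every code point that is not printable ASCII,
    // a tab, or a line feed, as "line:column U+XXXX".
    public static List<String> findSuspects(String source) {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int col = 1;
        for (int i = 0; i < source.length(); ) {
            int cp = source.codePointAt(i);
            boolean plain = (cp >= 0x20 && cp <= 0x7E) || cp == '\n' || cp == '\t';
            if (!plain) {
                hits.add(line + ":" + col + " " + String.format("U+%04X", cp));
            }
            if (cp == '\n') {
                line++;
                col = 1;
            } else {
                col++;
            }
            // Step by one or two chars so surrogate pairs count as one character.
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Curly quotes, a stray carriage return, and an en dash, as pasted from chat.
        String pasted = "String s = \u201Chi\u201D;\r\nint x = a \u2013 b;";
        findSuspects(pasted).forEach(System.out::println);
    }
}
```

Carriage returns are flagged on purpose here, so a file with Windows line endings reports one hit per line.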
Only then would I murder the sonofabitch that did that.
Some chat programs already implement this "feature", ostensibly to prettify text, so if someone sends you a line or two of code via chat, you'll end up with m-dash instead of n-dash, or pretty quotation marks instead of the standard ones.
It's a pain in the ass, but usually not super hard to track down, as the compiler will give you a line number, and the clue that there's an unrecognised symbol.
"• Be fired, and then killed"
Always a crowd pleaser!
˙suɐƃıuɐuǝɥs ǝsǝɥʇ ʃʃɐ ɹoɟ ʃǝɐɥɔıɯɹǝɥʇo@ ǝɯɐʃq I
killed by firing!
It would give us SAS coders a huge headache, too!
Porting a file from mainframe to unix to windows (pick your combo) can be a real bear on line endings!
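For the line-ending part, a minimal sketch of normalizing everything to one convention before comparing or compiling. `LineEndings` and its methods are hypothetical names. Treating U+0085 (NEL) as a line ending is an assumption: files converted from EBCDIC sometimes carry it, but check what your transfer tool actually emits.

```java
public class LineEndings {
    // NEL (U+0085), which some EBCDIC-to-Unicode conversions produce for newline.
    private static final char NEL = (char) 0x85;

    // Convert CRLF, lone CR, and NEL to a single LF.
    public static String toUnix(String text) {
        return text.replace("\r\n", "\n")
                   .replace('\r', '\n')
                   .replace(NEL, '\n');
    }

    // Convert any supported line ending to CRLF by normalizing to LF first.
    public static String toWindows(String text) {
        return toUnix(text).replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        String mixed = "a\r\nb\rc\n";
        System.out.println(toUnix(mixed).equals("a\nb\nc\n"));
        System.out.println(toWindows(mixed).equals("a\r\nb\r\nc\r\n"));
    }
}
```

Replacing CRLF before lone CR matters; doing it the other way round would turn every Windows line ending into two line feeds.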
I don't understand. Why would a compiler accept Unicode to begin with? Are there programming languages that require Hiragana or Tamil now?
"Wait, what? A 34 bit Honeywell processor and the OS only supports EBCDIC, but not all of it?? …I need to smoke something"
String literals? Comments?
[quote="Boundegar, post:10, topic:68052, full:true"]
I don't understand. Why would a compiler accept Unicode to begin with? Are there programming languages that require Hiragana or Tamil now?
[/quote]
Why would a compiler not accept Unicode? It's just a text format, and pretty much everything can read it by default. They'd have to specifically exclude it, and why would they do that?
Here's why:
Why should this be a reason to forbid unicode in programming languages? Because it would be more convenient for someone not needing chars outside of the restricted ASCII table?
Well my EE days were long long ago, but I don't recall any language requiring exotic characters, ever. Not even C, with its rat bastard ?: operator.
Again: string literals. Comments. These are things that generally contain text in human-readable languages, which includes all kinds of unicode.
public class Unicödē {
    public static void main(String[] args) {
        String düdelü = "Gar gräßlicher Kram";
        System.out.println(düdelü);
    }
}
Where's the problem? Most languages support Unicode (and IMO not too early, the standard is old). If a compiler did not accept Unicode at all, l10n and i18n would be impossible.
You obviously never used APL
Something like dd if=/dev/urandom count=1 bs=10k is probably a working APL program, summoning Cthulhu or so.