Universal Unix tool AWK gets Unicode support (UPDATEDx2)
In Unix terms, this news is almost akin to Moses appearing and announcing an amendment to the 10 commandments.
AWK, a programming language for analyzing text files, is a core part of the Unix operating system, including Linux, all the BSDs and others. For an OS to be considered POSIX compliant, it must include AWK. AWK first appeared in 1977 and was included in Version 7 UNIX in 1979 – the last version of UNIX from Bell Labs, before AT&T turned it into a commercial product.
What is notable about the tool gaining Unicode support is not so much the feature itself, but who wrote it: Canadian computer scientist Brian Kernighan.
AWK's name is an acronym for its three original developers: Turing Award winner Alfred Aho, Peter Weinberger and Brian Kernighan. Professor Kernighan is also the "K" in "K&R C", as in the original, classic, 1978 book The C Programming Language, written by Professor Kernighan and the late, great Dennis Ritchie.
Indeed the book dictated and specified not only a version of the C language, now known as C78, but even an indentation style. Such is its influence that in old Unix hacker circles, the book is sometimes called "the old testament" and the indentation "the one true brace style".
[...]
It's important to remember that software such as Unix is not holy writ, handed down inviolable from historical times. Most of the people who designed, implemented and shaped it are still with us.
UPDATE:
-
The 80-Year-Old Computer Scientist Who Termed 'Unix' Adds Unicode Support to AWK Code
Brian Kernighan is popularly known for his work alongside the creators of Unix, Ken Thompson and Dennis Ritchie. He made significant contributions to the development of Unix.
Not just that, Brian Kernighan also suggested the name "Unix" and created "Hello, world" as a test phrase for programs.
You might also recognize him as a co-author of the book "The C Programming Language" along with Dennis Ritchie. So, it is safe to say he's an important part of everything you know about Unix, Linux, BSD, and the evolution of the C programming language.
And, as an 80-year-old (now), he seems to have invested some time to add a new feature to "AWK", a scripting language he co-created back in the 1970s.
That's wonderful, right? And it sounds like something to inspire us.
Another article and mention:
-
Brian Kernighan Updates Awk to Add Unicode Support
Brian Kernighan, co-creator of Awk (and the K in the name of the tool), has quietly submitted code for adding Unicode support to the scripting language, reports Kevin Purdy.
Awk is “a special-purpose language for extracting and manipulating language that was key to Unix's pipeline features and interoperability between systems,” Purdy explains.
-
Unix legend, who owes us nothing, keeps fixing foundational AWK code | Ars Technica
A Princeton professor, finding a little time for himself in the summer academic lull, emailed an old friend a couple months ago. Brian Kernighan said hello, asked how their US visit was going, and dropped off hundreds of lines of code that could add Unicode support for AWK, the text-parsing tool he helped create for Unix at Bell Labs in 1977.
"I have tested this a fair amount but clearly more tests are needed," Kernighan wrote in the email, posted in late May as a kind of pseudo-commit on the onetrueawk repo by longtime maintainer Arnold Robbins. "Once I figure out how ... I will try to submit a pull request. I wish I understood git better, but in spite of your help, I still don't have a proper understanding, so this may take a while."
Kernighan is the "K" in AWK, a special-purpose language for extracting and manipulating language that was key to Unix's pipeline features and interoperability between systems. A working awk utility (AWK is the language, awk the command to invoke it) is critical to both Single UNIX Specification and IEEE POSIX certification for interoperability. There are countless variants of awk, including modern derivations with support for Unicode, but "One True AWK," sometimes known as nawk, is a kind of canonical version based on the 1988 book The AWK Programming Language, which Kernighan co-wrote with Aho and Weinberger, and on his subsequent input.