High-level language computer architecture

A high-level language computer architecture (HLLCA) is a computer architecture designed to be targeted by a specific high-level programming language (HLL), rather than the architecture being dictated by hardware considerations. It is accordingly also termed language-directed computer design, a term coined in (McKeeman 1967) and used primarily in the 1960s and 1970s. HLLCAs were popular in those decades but largely disappeared in the 1980s. This followed the dramatic failure of the Intel 432 (1981), the emergence of optimizing compilers, of reduced instruction set computer (RISC) architectures, and of RISC-like complex instruction set computer (CISC) architectures, and the later development of just-in-time compilation (JIT) for HLLs. A detailed survey and critique can be found in (Ditzel & Patterson 1980).

HLLCAs date almost to the beginning of HLLs, in the Burroughs large systems (1961), which were designed for ALGOL 60 (1960), one of the first HLLs. The best known HLLCAs may be the Lisp machines of the 1970s and 1980s, for the language Lisp (1958). At present the most popular HLLCAs are Java processors, for the language Java (1995), and these are a qualified success, being used for certain applications. A recent architecture in this vein is the Heterogeneous System Architecture (2012), whose HSA Intermediate Layer (HSAIL) provides instruction set support for HLL features such as exceptions and virtual functions; this uses JIT to ensure performance.

Definition

A wide variety of systems falls under this heading. The most extreme example is a Directly Executed Language (DEL), where the instruction set architecture (ISA) of the computer equals the instructions of the HLL, and the source code is directly executable with minimal processing. In extreme cases, the only compiling needed is tokenizing the source code and feeding the tokens directly to the processor; this is found in stack-oriented programming languages running on a stack machine. For more conventional languages, the HLL statements are grouped into an instruction plus its arguments, and infix order is transformed to prefix or postfix order. DELs are typically only hypothetical, though they were advocated in the 1970s.^[1]
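The two cases described above can be illustrated with a minimal sketch. The four-operator "instruction set" and the function names `execute` and `infix_to_postfix` are assumptions made for illustration and do not correspond to any historical machine. The first function plays the role of a stack machine that consumes tokens directly; the second shows the infix-to-postfix reordering that a more conventional language would need first.

```python
import operator

# Hypothetical four-operation "instruction set"; not taken from any real machine.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.floordiv}
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def execute(tokens):
    """Run postfix tokens on a toy stack machine and return the top of stack."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))  # anything that is not an operator is an integer literal
    return stack.pop()


def infix_to_postfix(tokens):
    """Reorder infix tokens into postfix (shunting-yard, left-associative operators only)."""
    output, pending = [], []
    for token in tokens:
        if token in OPS:
            # Flush operators that bind at least as tightly as the incoming one.
            while pending and pending[-1] in OPS and PRECEDENCE[pending[-1]] >= PRECEDENCE[token]:
                output.append(pending.pop())
            pending.append(token)
        elif token == "(":
            pending.append(token)
        elif token == ")":
            while pending[-1] != "(":
                output.append(pending.pop())
            pending.pop()  # discard the "("
        else:
            output.append(token)
    while pending:
        output.append(pending.pop())
    return output


# Stack-oriented source: tokenizing is the only "compilation" step.
print(execute("2 3 4 * +".split()))

# Conventional infix source: reorder to postfix first, then execute.
postfix = infix_to_postfix("( 2 + 3 ) * 4".split())
print(postfix, execute(postfix))
```

In the first call the source text is split on whitespace and handed to the machine unchanged, which is the sense in which a stack-oriented language needs only tokenizing.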

In less extreme examples, the source code is first parsed to bytecode, which is then the machine code that is passed to the processor. In these cases, the system typically lacks an assembler, as the compiler is deemed sufficient, though in some cases (such as Java), assemblers are used to produce legal bytecode which would not be output by the compiler. This approach was found in the Pascal MicroEngine (1979), and is currently used by Java processors.
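As a rough sketch of this arrangement, the following compiles an arithmetic expression to a byte string and then interprets that byte string as if it were machine code. The opcode values and the names `compile_expr` and `run` are hypothetical, chosen for illustration; they are not the p-code or Java bytecode encodings. Python's standard `ast` module stands in for the parser.

```python
import ast

# Hypothetical opcodes for illustration only; not the p-code or JVM encoding.
PUSH, ADD, SUB, MUL = 0x01, 0x02, 0x03, 0x04
BINARY_OPCODES = {ast.Add: ADD, ast.Sub: SUB, ast.Mult: MUL}


def compile_expr(source):
    """Parse an arithmetic expression and emit bytecode in postfix order."""
    code = bytearray()

    def emit(node):
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPCODES:
            emit(node.left)
            emit(node.right)
            code.append(BINARY_OPCODES[type(node.op)])
        elif (isinstance(node, ast.Constant) and type(node.value) is int
              and 0 <= node.value < 256):
            code.extend((PUSH, node.value))  # one-byte immediate operand
        else:
            raise ValueError("unsupported construct in this sketch")

    emit(ast.parse(source, mode="eval").body)
    return bytes(code)


def run(code):
    """Stand-in for the processor: fetch, decode, and execute each opcode."""
    stack, pc = [], 0
    while pc < len(code):
        opcode = code[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(code[pc])
            pc += 1
        elif opcode in (ADD, SUB, MUL):
            right, left = stack.pop(), stack.pop()
            if opcode == ADD:
                stack.append(left + right)
            elif opcode == SUB:
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"illegal opcode {opcode:#x}")
    return stack.pop()


bytecode = compile_expr("(2 + 3) * 4")
print(bytecode.hex(" "), "->", run(bytecode))

# Bytecode can also be written by hand, as an assembler would produce it.
print(run(bytes([PUSH, 7, PUSH, 2, SUB])))
```

The hand-written byte string in the last line is accepted by `run` even though it did not come from `compile_expr`, which mirrors the role an assembler plays on such systems.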

More loosely, an HLLCA may simply be a general-purpose computer architecture with some features that specifically support a given HLL or several HLLs. This was found in Lisp machines from the 1970s onward, which augmented general-purpose processors with operations specifically designed to support Lisp.

Examples

The Burroughs Large Systems (1961) were the first HLLCAs, designed to support ALGOL 60 (1960), one of the earliest HLLs. This was referred to at the time as "language-directed design." The Burroughs Medium Systems (1966) were designed to support COBOL for business applications. The Burroughs Small Systems (mid-1970s, designed from the late 1960s) were designed to support multiple HLLs through a writable control store. These were all mainframes.

The Wang 2200 series (1973) was designed with a BASIC interpreter in microcode.

The Pascal MicroEngine (1979) was designed for the UCSD Pascal form of Pascal, and used p-code (Pascal compiler bytecode) as its machine code. This was influential on the later development of Java and Java machines.

Lisp machines (1970s and 1980s) were a well-known and influential group of HLLCAs.

The Intel iAPX 432 (1981) was designed to support Ada. This was Intel's first 32-bit processor design, and was intended to be Intel's main processor family for the 1980s, but it failed commercially.

Rekursiv (mid-1980s) was a minor system, designed to support object-oriented programming and the Lingo programming language in hardware, and supported recursion at the instruction set level, hence the name.

A number of processors and coprocessors intended to implement Prolog more directly were designed in the late 1980s and early 1990s, including the Berkeley VLSI-PLM, its successor (the PLUM), and a related microcode implementation. There were also a number of simulated designs that were not produced as hardware ("A VHDL-based methodology for designing a Prolog processor", "A Prolog coprocessor for superconductors"). Like Lisp, Prolog has a basic model of computation that is radically different from standard imperative designs, and computer scientists and electrical engineers were eager to escape the bottlenecks caused by emulating that model on conventional hardware.

Niklaus Wirth's Lilith project included a custom CPU geared toward the Modula-2 language.^[2]

The INMOS Transputer was designed to support concurrent programming, using occam.

The AT&T Hobbit processor, stemming from a design called CRISP (C-language Reduced Instruction Set Processor), was optimized to run C code.

In the late 1990s, there were plans by Sun Microsystems and other companies to build CPUs that directly (or closely) implemented the stack-based Java virtual machine. As a result, several Java processors have been built and used.

Ericsson developed ECOMP, a processor designed to run Erlang.^[3] It was never commercially produced.

The HSA Intermediate Layer (HSAIL) of the Heterogeneous System Architecture (2012) provides a virtual instruction set to abstract away from the underlying ISAs, has support for HLL features such as exceptions and virtual functions, and includes debugging support.

Implementation

HLLCAs are frequently implemented as a stack machine (as in the Burroughs Large Systems and Intel 432), with the HLL implemented via microcode in the processor (as in the Burroughs Small Systems and Pascal MicroEngine). Tagged architectures are frequently used to support types (as in the Burroughs Large Systems and Lisp machines). More radical examples use a non-von Neumann architecture, though these are typically only hypothetical proposals, not actual implementations.
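The idea of a tagged architecture can be sketched in software as follows. The tag set, the `Word` layout, and the `TagTrap` exception are assumptions made for illustration and do not reproduce the word format of the Burroughs or Lisp machines. The point is only that every word carries its type, and that each operation checks the tags before acting, where real hardware would trap.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    """Hypothetical tag set; real tagged machines defined their own."""
    INT = 0
    POINTER = 1
    CODE = 2


@dataclass(frozen=True)
class Word:
    tag: Tag   # type information stored alongside the value
    bits: int  # the payload


class TagTrap(Exception):
    """Raised where a tagged machine would trap on a type violation."""


def add(a: Word, b: Word) -> Word:
    """Integer add that refuses operands not tagged as integers."""
    if a.tag is not Tag.INT or b.tag is not Tag.INT:
        raise TagTrap(f"ADD on {a.tag.name} and {b.tag.name}")
    return Word(Tag.INT, a.bits + b.bits)


def load(memory: dict, address: Word) -> Word:
    """Memory read that refuses an address not tagged as a pointer."""
    if address.tag is not Tag.POINTER:
        raise TagTrap(f"LOAD through {address.tag.name}")
    return memory[address.bits]


memory = {0x10: Word(Tag.INT, 40)}
print(add(load(memory, Word(Tag.POINTER, 0x10)), Word(Tag.INT, 2)))

try:
    load(memory, Word(Tag.INT, 0x10))  # an integer used as an address
except TagTrap as trap:
    print("trap:", trap)
```

An untagged machine would carry out the second `load` without complaint, since it cannot tell an integer from an address.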

Application

Some HLLCAs have been particularly popular as developer machines (workstations), due to fast compiles and low-level control of the system with a high-level language. The Pascal MicroEngine and Lisp machines are good examples of this.

HLLCAs have often been advocated when an HLL has a radically different model of computation than imperative programming (which is a relatively good match for typical processors), notably for functional programming (Lisp) and logic programming (Prolog).

Motivation

A detailed list of putative advantages is given in (Ditzel & Patterson 1980).

HLLCAs are intuitively appealing, as the computer can in principle be customized for a language, allowing optimal support for the language and simplifying compiler writing. An HLLCA can further natively support multiple languages by simply changing the microcode. The key advantages accrue to developers: fast compilation and detailed symbolic debugging from the machine.

A further advantage is that a language implementation can be updated by updating the microcode (firmware), without requiring recompilation of an entire system. This is analogous to updating an interpreter for an interpreted language.

An advantage that has reappeared since 2000 is safety or security. Mainstream IT has largely moved to languages with type and/or memory safety for most applications.^[citation needed] The software those languages depend on, from the OS to virtual machines, relies on native code with no such protection, and many vulnerabilities have been found in that code. One solution is to use a processor custom-built to execute a safe high-level language, or at least to understand types. Protections at the processor word level make an attacker's job harder than on low-level machines, which see no distinction between scalar data, arrays, pointers, or code. Academics are also developing languages with similar properties that might integrate with high-level processors in the future. An example of both of these trends is the SAFE^[4] project. Compare language-based systems, where the software (especially the operating system) is based around a safe, high-level language, though the hardware need not be: the "trusted base" may still be in a lower-level language.

Disadvantages

A detailed critique is given in (Ditzel & Patterson 1980).

The simplest reason for the lack of success of HLLCAs is that from 1980 optimizing compilers resulted in much faster code and were easier to develop than implementing a language in microcode. Many compiler optimizations require complex analysis and rearrangement of the code, so the machine code is very different from the original source code. These optimizations are either impossible or impractical to implement in microcode, due to the complexity and the overhead. Analogous performance problems have a long history with interpreted languages (dating to Lisp (1958)), only being resolved adequately for practical use by just-in-time compilation, pioneered in Self and commercialized in the HotSpot Java virtual machine (1999).

The fundamental problem is that HLLCAs only simplify the code generation step of compilers, which is typically a relatively small part of compilation, and a questionable use of computing power (transistors and microcode). At the minimum tokenization is needed, and typically syntactic analysis and basic semantic checks (unbound variables) will still be performed – so there is no benefit to the front end – and optimization requires ahead-of-time analysis – so there is no benefit to the middle end.

A deeper problem, still an active area of development (as of 2014),^[5] is that providing HLL debugging information from machine code is quite difficult, partly because of the overhead of debugging information, and more subtly because compilation (particularly optimization) makes determining the original source for a machine instruction quite involved. Thus the debugging information provided as an essential part of HLLCAs either severely limits implementation or adds significant overhead in ordinary use.

Further, HLLCAs are typically optimized for one language, supporting other languages more poorly. Similar issues arise in multi-language virtual machines, notably the Java virtual machine (designed for Java) and the .NET Common Language Runtime (designed for C#), where other languages are second-class citizens, and often must hew closely to the main language in semantics. For this reason lower-level ISAs allow multiple languages to be well-supported, given compiler support. However, a similar issue arises even for many apparently language-neutral processors, which are well-supported by the language C, and where transpiling to C (rather than directly targeting the hardware) yields efficient programs and simple compilers.

The advantages of HLLCAs can alternatively be achieved in HLL computer systems (language-based systems), primarily via compilers or interpreters: the system is still written in an HLL, but there is a trusted base in software running on a lower-level architecture. This has been the approach followed since circa 1980: for example, a Java system where the runtime environment itself is written in C, but the operating system and applications are written in Java.

Alternatives

Since the 1980s, research and implementation in general-purpose computer architectures have focused primarily on RISC-like architectures: typically internally register-rich load–store architectures with rather stable, non-language-specific ISAs, featuring multiple registers, pipelining, and more recently multicore systems. Language support has focused on compilers and their runtimes, and on interpreters and their virtual machines (particularly JIT-compiling ones), with little direct hardware support. For example, the current Objective-C runtime for iOS implements tagged pointers, which it uses for type-checking and garbage collection, despite the hardware not being a tagged architecture.

In computer architecture, the RISC approach has instead proven very popular and successful. It is the opposite of the HLLCA approach, emphasizing a very simple instruction set architecture. However, the speed advantages of RISC computers in the 1980s were primarily due to early adoption of on-chip cache and room for large registers, rather than intrinsic advantages of RISC.^[citation needed]

References

↑ See Yaohan Chu references.
↑ "Pascal for Small Machines – History of Lilith". Pascal.hansotten.com. 28 September 2010. http://pascal.hansotten.com/index.php?page=history-of-lilith.
↑ "ECOMP - an Erlang Processor". http://www.erlang.se/euc/00/processor.ppt.
↑ "SAFE Project". http://www.crash-safe.org/.
↑ See LLVM and the Clang compiler.

McKeeman, William M. (November 14–16, 1967). "Language directed computer design". AFIPS '67 (Fall) Proceedings of the November 14–16, 1967, Fall Joint Computer Conference. 31. http://www.computer.org/csdl/proceedings/afips/1967/5070/00/50700413.pdf.
- Keirstead, Ralph E. (March 1968). "R68-8 Language Directed Computer Design". IEEE Transactions on Computers 17 (3): 298. doi:10.1109/TC.1968.229106. http://www.computer.org/csdl/trans/tc/1968/03/01687335.pdf. – review
Ditzel, David R.; Patterson, David A. (1980). "Retrospective on High-Level Language Computer Architecture". ISCA '80 Proceedings of the 7th annual symposium on Computer Architecture. ACM. pp. 97–104. doi:10.1145/800053.801914.
Martin, Grant; Leibson, Steve (early 2000s). "A Baker's Dozen: Fallacies and Pitfalls in Processor Design". Tensilica. Slides 6–9.