Research Achievements and Proto-Memoirs: David J Greaves

I am David Greaves. I am a Senior Lecturer at the University of Cambridge Computer Laboratory and Fellow of Corpus Christi College.

I have worked on hundreds of technical projects. Here is a list of ten or so that can have some fair claim to be World Firsts.

1979: Personal Computer LANs

In 1979 I adapted the master/slave IEEE-488 bus to become a fully-connected LAN by designing and building a hub.
I was a pioneer in the field of Local-Area Networks. In the first year of my PhD I designed and built all of the first generation Cambridge Fast Ring LAN boards. This LAN had been designed by my PhD supervisor, Andy Hopper. But five years earlier I built a previous LAN. In 1979/80, I designed switching hubs that converted the GPIB master/slave bus into a multi-access LAN. No personal computers (except high-end UNIX workstations) at that time had LAN ports: IBM Token Ring was not available until 1984.
WHAT I DID
At my sixth-form college a number of so-called 'GPIB Combiners' were built to my design. These allowed multiple IEEE-488 bus masters to share a single set of peripherals, so the printers and floppy-disk units became fileservers for six or so Commodore PET computers. I also built a GPIB interface card for the SWTC 6800-based systems so that those machines could join this early LAN.

MODERN IMPACT
All personal computers today are connected to LANs.
Title and masthead from Wireless World article, 1984.
   
Typical LAN setup from a decade later.

My core design was published in Wireless World in April 1984 and, at that time, two commercial companies had started making these devices, including a company started by my physics teachers. (PROJECT : www.bigbrownbus.com/koo.corpus.cam.ac.old/projects/gpibcombiner/gpibcombiner.html)


1983: Digital Keyboard Synthesiser

In 1983 I designed and built one of the world's first fully-digital electronic synthesisers.
Unlike previous-generation digital sample players, such as the Fairlight, this synthesiser used algorithms to synthesise the output waveforms in real time. This project was completed the same year that Yamaha released the DX7, which was the world's first commercial instrument of this kind. My design was published in Wireless World over five months. It was also submitted as part of my final-year undergraduate degree project, which also included a non-linear music composition application.
WHAT I DID
I used an 8088 microprocessor (which was just out and being used in the first IBM PCs in the same year) to scan the keyboard and front-panel controls. I made a high-performance (5 MIPS) processor from discrete TTL to generate the waveforms. For 8-note polyphony, only ten or so instructions were possible per rendered sample of an instrument. This unit also had MIDI input and output, which was brand new at the time.

MODERN IMPACT
Nearly all keyboard synthesisers currently in production are digital. Today's devices support a mixture of sample playing and algorithmic synthesis. Many of the remainder have digitally controlled pitch oscillators followed by analogue tone shaping.
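The instruction budget quoted for the TTL processor follows from simple arithmetic. The sketch below is illustrative only: the 5 MIPS and 8-voice figures come from the text, but the sample rates are assumptions (the text does not state the rate used), and `instructions_per_voice_sample` is a hypothetical helper name.

```python
def instructions_per_voice_sample(mips, voices, sample_rate_hz):
    """Instructions available to render one output sample of one voice."""
    instructions_per_second = mips * 1_000_000
    return instructions_per_second / (voices * sample_rate_hz)


if __name__ == "__main__":
    # 5 MIPS and 8-note polyphony are from the text; the rates are assumed.
    for rate in (31_250, 44_100, 62_500):
        budget = instructions_per_voice_sample(5, 8, rate)
        print(f"{rate} Hz: {budget:.1f} instructions per voice per sample")
```

Under these assumptions, a budget of ten instructions per voice corresponds to an output rate of about 62.5 kHz, and halving the rate doubles the budget.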
A photo of the synthesiser and the block diagram of the TTL processor from the Wireless World article.
   
The Yamaha DX7 Synthesiser

(PROJECT : www.bigbrownbus.com/koo.corpus.cam.ac.old/projects/digipoly/index.html)


1984: Drive Electronics for the World's First Pixel-Addressable LCD Display

The first generation of liquid crystal displays, such as those used in digital clocks and some pioneering calculators produced by Sharp, required an electrical connection to every display element. This was infeasible for large display panels with thousands of pixels. The first multiplexed LCD panels were made at RSRE (the UK's Royal Signals and Radar Establishment) and I designed and built drive electronics for one such display.
The LED conducts electricity in only one direction: it is a diode. Unlike an LED, the LCD pixel requires an AC voltage to alter its state and this raised very significant design challenges for scan-multiplexed display panels. An approach was needed that exercised all of the X and Y electrodes all of the time and which delivered waveforms that were highly anti-correlated at pixels which should be activated but which were sufficiently un-correlated at all other pixels to keep them clear.
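The correlation idea can be illustrated numerically. The sketch below is not the drive scheme actually used: it assumes ideal +1/-1 waveforms, uses Hadamard rows as a stand-in for the pseudo-random sequences, and activates only one pixel in a column. The function names are hypothetical. A pixel sees the difference between its row and column waveforms, so its mean-square voltage is 2 minus twice the correlation of the two waveforms.

```python
def hadamard(order):
    """Sylvester construction: rows are mutually orthogonal +1/-1 sequences."""
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def mean_square_voltage(row_wave, col_wave):
    """Mean-square voltage across a pixel driven by the difference of two waveforms."""
    diffs = [r - c for r, c in zip(row_wave, col_wave)]
    return sum(d * d for d in diffs) / len(diffs)


def pixel_drive(row_waves, selected_row):
    """Drive one column with the inverse of the selected row's waveform.

    Returns the mean-square voltage seen by each pixel in that column.
    """
    col_wave = [-x for x in row_waves[selected_row]]
    return [mean_square_voltage(rw, col_wave) for rw in row_waves]


if __name__ == "__main__":
    # Skip row 0 (all ones): it has a DC component, which an LCD must avoid.
    rows = hadamard(4)[1:]
    print(pixel_drive(rows, selected_row=1))
```

With these waveforms the selected pixel sees a mean-square voltage of 4.0 and the others 2.0, an RMS ratio of about 1.41. Every electrode is exercised all of the time and no waveform has a DC component.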
WHAT I DID
From Prof P Raynes at RSRE, I received an LCD panel. He also sent some elastomeric strips, which are pieces of rubber with densely-packed, parallel conductive needles. The panel was two sheets of thin glass that had been glued together with a small spacer between them. Close inspection revealed the nearly transparent conductive electrodes painted on the inner surfaces. Together with my assistant Keith Barraclough (we were both interns at Plessey), I designed the PCBs to mount the delicate glass panels and associated drive electronics. I wrote real-time software for a Z80 processor that generated the electrode waveforms. It was then relatively trivial to get one chosen pixel in every column or every row activated. This is all that is needed for graph plotting of a single line. For more advanced plots, I generated ROM tables with various pre-computed pseudo-random sequences with known correlation patterns. Then, with appropriate selection of waveforms, it was possible to darken more than one pixel in a row or column at once without phantoms at the transposed cross points (based on the non-transitivity of correlation). My display had an area of about one inch by three.

MODERN IMPACT
As everybody knows, LCD displays became a very important technology, especially for laptops and televisions. The LCD multiplexing problem was never really solved. Instead, modern display panels have transistors at every pixel and so the old scan multiplexing approach can be applied.
A mockup of the LCD display I used from an online image at RSRE. And also a schematic drawing of the panel structure.
   
A modern LCD oscilloscope.

As with quite a few technological developments conducted under the blanket of the Official Secrets Act, the UK technology was never properly exploited. Contrary to their contracts of employment, the good staff at RSRE did not get significant royalties from their inventions and I was told that they ultimately sued their employer (the UK government) for failing to maximise their benefit. I certainly did not get any money. Meanwhile, the Japanese companies Epson and Sharp took the lead.

Two or three years after this project, I saw the prototype LCD on display in the London Science Museum.


1986: Constructed Digital IF Strip for Communications Receiver

At Plessey Roke Manor, in 1986, I constructed a fully-digital demodulator and IF strip for a radio receiver. This was the world's first Software Defined Radio, as it is now called.
The IF strip design in the original radio used a large number of very expensive crystal filters of differing bandwidths. Front panel controls selected one filter bandwidth and one demodulator circuit, so only two filters were ever in use at any one time, these being for the I and Q channels. The matching of an I/Q pair was also comparatively poor, which degraded rejection performance. In 1986, Roke Manor was the preeminent research laboratory for radio communications and so this was pretty much a world first.
WHAT I DID
I used three TMS32010 digital-signal processing CPUs (5 MIPS each) with switchable firmware (selected by the high-order address bits of the EPROM) for different demodulation modes (AM, FM, USB, CW). I designed and constructed the new circuit boards and was in charge of system integration. I used TTL logic to extract I and Q channels from a single ADC and feed these to two DSP processors that replaced the filters. The third DSP combined the filter outputs and demodulated the output to an audio DAC. I wrote the code for the automatic frequency control (AFC). Others wrote most of the DSP software. We published a paper at an IEE Colloquium on Digital Radio. The digital filter implementations for the two channels are now perfectly matched. Even for our first prototype, the digital logic had lower cost of goods than the analogue circuitry it replaced. It had better performance in many respects and, most surprisingly, the out-of-band rejection was hardly any worse.

MODERN IMPACT
Today, 99.9 percent of all radio receivers are made with digital IF strips and demodulation.
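The text does not say how the TTL logic derived I and Q from the single ADC. One common technique, assumed here purely for illustration, is to sample at four times the IF so that the local-oscillator values reduce to 1, 0, -1, 0 and the mixers become pass, zero and negate operations. The function names are hypothetical, and the plain average stands in for the DSP filter stages.

```python
import math


def extract_iq(samples):
    """Mix real samples taken at 4x the IF down to baseband I and Q."""
    cos_lo = (1, 0, -1, 0)        # cos(pi*n/2) for n = 0, 1, 2, 3
    neg_sin_lo = (0, -1, 0, 1)    # -sin(pi*n/2) for n = 0, 1, 2, 3
    i_chan = [x * cos_lo[n % 4] for n, x in enumerate(samples)]
    q_chan = [x * neg_sin_lo[n % 4] for n, x in enumerate(samples)]
    return i_chan, q_chan


def average(values):
    """Crude low-pass filter standing in for the per-channel DSP filters."""
    return sum(values) / len(values)


def recovered_phase(samples):
    """Phase of a carrier sitting exactly at the IF, from filtered I and Q."""
    i_chan, q_chan = extract_iq(samples)
    return math.atan2(average(q_chan), average(i_chan))


if __name__ == "__main__":
    phase = 0.7  # radians, arbitrary test value
    carrier = [math.cos(math.pi * n / 2 + phase) for n in range(64)]
    print(round(recovered_phase(carrier), 6))
```

Because both channels come from the same sample stream and are processed by identical arithmetic, they are matched by construction, which is the property the analogue crystal-filter pairs lacked.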
A view of the Plessey PRS 2280 HF communications receiver.
   
Extract from David Greaves Lecture Notes, University of Cambridge.



1991: A Version of Linux

In 1991 I implemented unix from scratch. I ran it on 68030 hardware of my own construction. This was so that I could use unix at home. The system was stable and worked well.
The lack of a powerful computer at home, combined with lack of access to the computers at work via anything other than a 9600 baud JNT PAD, motivated me to build a personal unix system from scratch. It turned out to be fairly straightforward. A great help was a copy of the book by Maurice J Bach, given to me by an office mate, Bhaskar Harita.
WHAT I DID
I constructed a CPU card from a processor and DRAM and connected various peripherals, including floppy and hard disks. I wrote a self-hosting C compiler and a 68030 assembler and linker. I then implemented the unix kernel from scratch. At work I discovered source code for Ultrix, the DEC version of unix, so I could then compile many of the standard shell utilities and run them on my version of unix.

MODERN IMPACT
Compared with Linus Torvalds, I did twice as much work since I created both the hardware and the software. PC hardware of the time was expensive and commonly lacked an MMU. But Linus' version, targeting PC hardware, was ultimately far more successful since many people did have PC hardware and nobody else had my machine.
View of my home-made 68030 card with 16 MByte of DRAM
   
Linux developers did not create their own hardware: they waited until the PC specification evolved to include a sensible processor with an MMU.

When I had more-or-less finished, a student asked me to explain some of the Minix code at the back of Tanenbaum's book. I had already met Tanenbaum and I had no problem understanding the code. But I wished I had seen the book earlier and simply copied Minix instead of starting from scratch. I used the resulting computer for about five years, before I switched to Linux. My first personal Linux computer had 4 MByte of RAM, which was less than the computer I had made, but it had richer software resources and I never looked back. But it was a while before the Linux sequencer implemented kernel timestamps on real-time MIDI traffic, just as I had implemented myself. (PROJECT : www.bigbrownbus.com/koo.corpus.cam.ac.old/projects/elencomputer/index.html)


1994: Verilog RTL Programming of FPGAs

I was a user of the first commercial FPGA, the Xilinx XC2064. In 1994 I wrote a Verilog compiler that targeted such FPGAs. This greatly increased design efficiency and enabled circuit boards containing FPGAs to be simulated as a whole.
Schematic capture and manual netlist entry were the only design capture mechanisms supported by the FPGA vendors before 1994. Dr Ian Leslie of the Computer Laboratory wrote the first interface between the Cambridge HDL design system and the Xilinx tools. But Cambridge HDL only supported structural/hierarchic descriptions. David Milway of the Computer Laboratory had extended Cambridge HDL to include Functional Definition Language (FDL), which was a form of RTL. I had written a PAL compiler that accepted complex logic equations. So I had access to the main pieces I would need.
WHAT I DID
I refactored all of the local EDA technology. I replaced the Cambridge HDL front end with a new Verilog parser (I learned about Verilog at Silicon Graphics where it was being used for TCP accelerator ASICs). The resulting tool was called CSYN. It took Verilog RTL as input and rendered Xilinx Netlist Format (XNF) and Cambridge HDL as output, for input to the Xilinx FPGA tools and the local simulators respectively.

I also wrote a back end that output code for PALs such as the 16R8 and 22V10 that we widely used at that time. These could then be programmed by my widely used Unix-based PAL compiler (commercial tools were all hosted on MS DOS) and simulated in a common Verilog environment.

The benefits of the tool were rapidly and widely appreciated. The CSYN tool was used in a number of Cambridge companies (including Acorn) and also licensed to Motorola for use in their FPGA toolflow. I introduced Verilog to the Computer Science curriculum at Cambridge using my tool along with a companion simulator.

MODERN IMPACT
RTL is today, without question, the primary input mechanism for FPGAs from all vendors.

Schematic flow diagram of the inputs and outputs of the CSYN compiler.
   
Image of a modern RTL FPGA design flow environment

This was the first demonstration of such an RTL flow for FPGA. The work was published at the very first FPL conference the next year in Oxford. (PUBLICATION : www.cl.cam.ac.uk/~djg11/edadoc/cv2man/csyn-verilog-compiler.pdf)


1994: Video-On-Demand

In 1994 I was the chief network architect and designer of the line cards for the Cambridge iTV Video-on-Demand and broadband trial. Together with David Milway and some chaps from SJ Research (sorry forgot your names), I also designed some of the kerbside switching equipment.
The trial connected 50 homes and schools providing on-demand television, and home shopping.
WHAT I DID
With Brian Knight of Virata and Andrew Bird of Acorn, and at the prompting of Hermann Hauser, I had a meeting with Hugo Davenport of the local CATV company. He was managing director of Cambridge Cable (later NTL and then Virgin Media) and he gave us permission to install plant in their kerbside boxes throughout the town. We used dark fibres for head-end connection to the kerbside and diplexers to separate the 25.6 Mbps duplex final-drop data from the analogue CATV service, which could operate at the same time. I participated in standards developments for data over cable TV networks (DOCSIS) and later I worked on ADSL.

MODERN IMPACT
In the UK today, these two techniques are the mainstay of residential broadband to the home. Broadband has become an important technology worldwide and is now considered a necessity.

Video-on-demand has also recently taken off greatly: the last UK Blockbuster video rental shop closed in 2018. Today, young people often only watch television over the Internet, especially in the UK where the television licence fee is imposed on broadcast services.

DJ Greaves standing next to a modern (2016) kerb side box while holding one of the prototype residential ADSL modems he designed at Virata.
   
Netflix is the predominant international Video-on-Demand service today, but we also have BBC iPlayer and the like from other broadcasters.

This project was way ahead of its time. About 20 years ahead, to be more precise. (PROJECT : www.bigbrownbus.com/koo.corpus.cam.ac.old/projects/itv) (PUBLICATION : www.aes.org/e-lib/browse.cfm?elib=7091)


1998: Multi-room Audio on Multimedia LAN

From 1996 to 2006 I had a home-made multimedia network in my house. This streamed audio between source and sink components so that various CD, Radio or TV channels could be heard in all rooms. Various other household items, such as the doorbell and heating controller were ultimately connected (see IoT entry below).
I built a large number of different media adaptors to connect devices to the home LAN which was an ATM Warren network. The Chief Scientist of Bell Labs (Sandy Fraser) visited my house to have a look. The system was replicated at BT Laboratories Martlesham and presented to cabinet members.

In 1996 I also presented packet-switched HiFi audio to the Audio Engineering Society Conference. The delegates had accepted digital technology, but the ideas relating to timing recovery from jittered packets were new to most of the engineers.

WHAT I DID
The Warren project at the Computer Laboratory designed and built a portfolio of some ten or so module designs. These included the LAN switches themselves and adaptors for Infra Red, Audio, Video and control. Even the doorbell was connected to demonstrate remote door unlocking.

MODERN IMPACT
The Sonos corporation today makes a variety of audio LAN devices. Apple and others make them as well. There is a W3C standard protocol for streaming audio but different manufacturers' implementations are strangely incompatible. I wonder why!?

Video doorbells and remote unlocking of the outer porch door are becoming everyday practice.

Schematic of the Audio LAN installed at the home of DJ Greaves in 1997.
   
A typical WiFi-connected audio component made by Sonos.

Although a large number of commercial approaches were made, in both directions, to license this technology, these ultimately failed since ATM LANs were seen as less attractive than Ethernet. Of course, if the ATM technology had been adopted, with its guaranteed quality-of-service, then all of today's terrible audio gaps and glitches would not be annoying everybody. (PROJECT : www.cl.cam.ac.uk/research/srg/han/Warren) (PUBLICATION : ieeexplore.ieee.org/document/660006)


1999: Packet-Switched Digital Telephone Instrument

I designed and built the world's first telephone handsets to be directly connected to a packet-switched network. (Roy Want's Island project at the Computer Laboratory had first done this for the Cambridge Ring, but those phones had a 19-inch rack under the table containing the equivalent circuitry. Hence the world-first here perhaps relies on the words handset/instrument.)
I purchased off-the-shelf POTS telephone handsets made by Geemark. This model was selected since it had a speaker-phone mode and an LCD display. The phone was adapted so that it connected to a LAN instead of a telephone land line or PBX.
WHAT I DID
I carefully measured the internal PCB dimensions and made my own drop-in replacement PCB. The new PCB had ADC and DAC components and packet framers to convey the audio over the LAN. To bring the system to life, my RA Daniel Gordon wrote a telephone exchange program that was hosted elsewhere on the network. The directors of Geemark flew to the UK to take a look and to discuss future prospects. But this project was ahead of its time.

MODERN IMPACT
The Cisco Ethernet phone on your desk today is the direct modern equivalent.
Internal view of one of the phones with the new PCB installed.
   
An Ethernet-connected telephone set made by Cisco.

The phone was presented to the ATM Forum, Residential and Broadband working group, of which I was a founder. About 20 prototype instruments were made and some went to UCL and others to BT Martlesham. There's one in the central display cabinet on the top floor of the William Gates building. (PROJECT : www.bigbrownbus.com/koo.corpus.cam.ac.old/projects/warren)


1999: C-to-Gates Compilation (HLS)

In 1999 I demonstrated compiling C programs to register-transfer level (RTL) circuit descriptions.
At that time it was recognised that using higher-level languages to generate hardware could accelerate the design process. It could also enable software engineers to participate in hardware design. A software program would be mapped to a circuit that could be implemented either in custom silicon or on an FPGA (field-programmable gate array).
WHAT I DID
I implemented CTOV, a design tool that took statically-allocated C code (no heap and bounded stack depth) and generated a circuit of equivalent functionality. Arrays were mapped to RAMs using automatic heuristics in the first instance, but the report file from one run could be manually edited and read in to a further run so that the engineer could tune this important aspect. The tool also supported various constructs from the SystemC library, such as thread starts and custom-precision integers. CTOV was licensed to industry and is currently fully owned by market leader Synopsys.

MODERN IMPACT
Today there are many commercial and research compilers of this nature. These compilation flows are becoming widely used.

A new execution platform has emerged with all the major cloud providers now adding FPGA chips to their server blades. Low energy execution has become the major driver for this technology. Performance increases are also commonly seen.

A fragment of C code that was very easy to compile to RTL.
   
Amazon promotional material for the FPGA-in-the-cloud elastic computing.

C-to-Gates is now called High-Level Synthesis (HLS). Contemporary high-performance CPU cores use orders of magnitude more energy working out what to compute next than they use performing the computation. Because of this, the Von Neumann architecture is perhaps on its way out for big data. Analysis today, by Andre Dehon and others, predicts that hardware/FPGA execution will increasingly replace Von Neumann designs owing to reduced energy of execution. Certainly, the replacement of the fetch-execute cycle with circuit-switched wiring on FPGA has often been shown to reduce computational energy use by two orders of magnitude (memory system energy use remains largely unchanged).

Scientific Acceleration on FPGA is the current major research direction for DJ Greaves, exemplified by the Kiwi project. (PROJECT : www.cl.cam.ac.uk/~djg11/cvtovpage-real) (PUBLICATION : ieeexplore.ieee.org/document/5970505)


1999: ASIC Virtual Platforms

In 1999 I wrote VTOC, a program that could convert very large chip designs into a C/C++ program that reflected their functionality. This enabled embedded software developers to test and develop their firmware before their ASIC (application-specific silicon integrated circuit or chip) was taped out.
Companies such as QuickTurn, Aptix, Zycad and IKOS sold million-dollar hardware boxes to chip-developing companies as hardware emulators. The companies would compile the RTL of their design to target the hundreds of FPGAs inside these platforms. The resulting system would perhaps clock at one tenth the target clock frequency of the chip being designed. Scheduling use of the emulator was a logistics problem.

A purely software alternative was sorely needed. If it ran at one thousandth of the target design speed on a platform that cost one thousandth of the emulator, such as a Linux server, each software/firmware developer could have their own instance: a ten-to-one increase in cycles became available for the same capital and the logistics were solved. Also, less specialised engineering is needed to set up the platform and the servers can be used for many other purposes too.
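The ten-to-one figure can be checked with a few lines of arithmetic. This is only a sketch using the round numbers quoted in the text, normalised to the cost of one emulator and to the target clock rate; `aggregate_throughput` is a hypothetical helper name.

```python
from fractions import Fraction


def aggregate_throughput(units, speed_fraction):
    """Total simulated cycles per target-clock cycle across all units."""
    return units * speed_fraction


# One emulator, clocking at about one tenth of the target frequency.
emulator = aggregate_throughput(units=1, speed_fraction=Fraction(1, 10))

# The same capital buys 1000 servers, each at one thousandth of target speed.
servers_for_same_capital = 1000
software = aggregate_throughput(servers_for_same_capital, Fraction(1, 1000))

gain = software / emulator
print(f"Aggregate cycle gain for the same capital: {gain}x")
```

The gain is in aggregate cycles across many developers, not in the speed seen by any one of them: each individual server still runs a hundred times slower than the emulator.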

The VTOC compiler was complementary to my earlier tool (mentioned above) that converted in the other direction.

WHAT I DID
The VTOC program compiled synthesisable RTL into C++ code. It had an advanced, incremental compilation approach that carefully reflected the data dependencies between ports and hence could scale to model an entire chip as a single C++ module, albeit with millions of lines of code. A paper regarding VTOC was published at the Rapid Systems Prototyping conference in 2000. Subsequently a company was formed, Tenison EDA, that sold licenses to nearly all the major chip makers of the time.

MODERN IMPACT
The VTOC approach is now commonly used throughout the IC industry where it is known as a 'virtual platform'. The use of FPGA hardware emulators has dropped off, these now being used only for hard-real-time applications like telecoms modems.
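To give a flavour of what a software model of clocked RTL looks like, here is a minimal hand-written sketch. It is not VTOC output (VTOC generated C++ and handled whole chips); the design, the class name `SwapAndCount` and the Verilog fragment in the comment are all invented for illustration. The key point it shows is the two-phase update that preserves non-blocking assignment semantics.

```python
class SwapAndCount:
    """Software model of a small clocked design, one method call per clock edge.

    Illustrative RTL being modelled:
        always @(posedge clk) begin
            a <= b;
            b <= a;
            if (enable) count <= count + 1;   // count is a 4-bit register
        end
    """

    def __init__(self):
        self.a, self.b, self.count = 1, 2, 0

    def clock(self, enable):
        # Phase 1: compute every next-state value from the current state only.
        next_a, next_b = self.b, self.a
        next_count = (self.count + 1) & 0xF if enable else self.count
        # Phase 2: commit all registers together, as a clock edge would.
        self.a, self.b, self.count = next_a, next_b, next_count


if __name__ == "__main__":
    model = SwapAndCount()
    for _ in range(17):
        model.clock(enable=True)
    print(model.a, model.b, model.count)
```

Firmware under development can call such a model once per simulated clock cycle, which is what lets each developer run the design on an ordinary server.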
A single blade from a Zycad hardware emulation server.
   
A splash screen for a third-party toolset that integrated the VTOC compiler under the hood and attributes Tenison EDA.

Our startup received a large amount of coverage in Electronics Times, Electronics Weekly, Chip Design Magazine and other trade journals (that do not count towards academic citation indices). Articles were written both by technical journalists and by staff at Tenison EDA. As is common for companies in the EDA space, Tenison was finally acquired by one of the three large EDA vendors (Mentor, Synopsys and Cadence). (PUBLICATION : ieeexplore.ieee.org/abstract/document/855208)


2006: Internet-Connected, Safety-Critical Central Heating Timer

In the AutoHan project, circa 2006, I designed and built a central heating timer for my house that could be controlled over the in-home LAN and also remotely on the WAN. (I also helped design the original WAN.) This was used from 2006 to 2016.
WHAT I DID
This heating controller uses a novel model checking system called Pushlogic that still has no modern equivalent and is worthy of wider dissemination. Under Pushlogic, both the running applications and the interactions between sensors and actuators are modelled. When someone attempts to reconfigure the system, such as by entering the home with a given app running on their phone, on-the-fly formal techniques prevent devices interfering with each other. Each time a new application was introduced into a 'domain-of-participation', automated model checking was activated to selectively bar parts of its behaviour and to ensure it could be controlled (e.g. speaker mute) under relevant conditions.

MODERN IMPACT
This heating controller would now be called an 'Internet Of Things' device. A variety of companies now make such devices. These sometimes have advanced applications bundled with them. For instance, after I installed a Nest device in my daughter's house and her husband programmed it, she discovered the heating always automatically turned off when he left home! Accidental or deliberate? And for a door entry system, a deployment embodying Pushlogic could ensure deadlock avoidance: it would ensure that you cannot lock all your entry keys inside the house when you are outside!
The heating controller and its block diagram.
   
Nest and Hive are two companies with IoT heating controllers. Note: they do not model check against adverse interactions with other home agents.

(PROJECT : www.cl.cam.ac.uk/research/srg/han/pebbles/datasheets/heatingcontroller)


2009: Genome Alignment Processing Using Multi-threaded HLS

The genome alignment problem is well known to be highly amenable to acceleration on FPGA. A systolic array structure is normally used. The ground-breaking aspect of this work was the FPGA being programmed by multi-threaded, object-oriented high-level code with exactly the same program being run on the workstation as on the FPGA.
Started in 2007, the Kiwi project developed a compiler for high-level synthesis of multi-threaded C# code.
WHAT I DID
Once the first version of the KiwiC compiler was working, we (S Singh and I) used it to generate a Smith-Waterman accelerator on FPGA. A processing cell was defined that was easily parameterisable in terms of how much of the search string it handled, and an outer framework was defined that instantiated sufficient cells for the whole of the search string. The DNA to be searched was then streamed through these cells from the file server. Each component was a separate C# object.

We presented a paper at the Many-Core and Reconfigurable Supercomputing Conference in Berlin. Our results for the HLS design were about one third the performance of a hand-crafted design on the same FPGA. This was because the Kiwi tool chain was immature.

MODERN IMPACT
Papers are still being published on accelerating genome sequencing on FPGA. Kiwi has also implemented Burrows-Wheeler searches. A number of commercial FPGA-based accelerators for genomics processing have been sold. Today the hardware boxes are being replaced with AWS Elastic Computing: hiring FPGA can be cheaper than owning.
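For readers unfamiliar with the algorithm, the cell-based structure of the Smith-Waterman accelerator can be sketched in software. This is a plain sequential Python sketch of the standard algorithm with a linear gap penalty, not the Kiwi C# source; the function names and the default scoring values are assumptions chosen for illustration.

```python
def cell(diag, up, left, a, b, match=2, mismatch=-1, gap=-1):
    """One Smith-Waterman cell update: best local score ending at symbols (a, b)."""
    substitution = match if a == b else mismatch
    return max(0, diag + substitution, up + gap, left + gap)


def smith_waterman_score(query, database, **scores):
    """Best local alignment score of query against database."""
    previous = [0] * (len(query) + 1)   # cell outputs for the previous database symbol
    best = 0
    for symbol in database:             # database symbols stream past the cells
        current = [0]
        for j, q in enumerate(query, start=1):
            value = cell(previous[j - 1], previous[j], current[j - 1],
                         q, symbol, **scores)
            current.append(value)
            best = max(best, value)
        previous = current
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACT", "ACGT"))
```

With the default scores, `smith_waterman_score("ACT", "ACGT")` returns 5 (three matches and one gap). In a systolic array, each position of the query has its own cell and all cells on an anti-diagonal update in the same clock cycle, whereas this sketch visits them one at a time.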
Basic structure of the Smith Waterman alignment algorithm.
   
Dragen workflow from a company that performs alignment and compression on FPGA

(PUBLICATION : www.semanticscholar.org/paper/Synthesis-of-a-Parallel-Smith-Waterman-Sequence-Greaves/9a9b4dc5d0921c4a4a6ffcaf40fb3e465c63e5d1)


2013: Cake ML Processor

The Cake ML project has developed (possibly?) the world's first formally-verified compiler and processor composition. In 2013 I implemented two versions of the first Cake ML ISA (instruction set architecture) on FPGA.
I have long been a keen user of functional programming in FSharp and other ML dialects. Jonathan Kimmitt compiled OCaml into FPGA as part of his PhD research for which I was the supervisor. Also VTOC was written in ML and as part of that project I had implemented an ML interpreter and compiler. Using functional expression is acknowledged to result in designs that are easier to reason about.

I collaborated with the main Cake ML design team over the detailed design of the first processor. The idea was to synthesise the processor from the instruction set architecture formal specification. Or indeed a family of such processors of varying performance and complexity.

WHAT I DID
The first CPU design was hand-written in RTL.

The second design was again hand-written, but in Bluespec RTL. I also wrote the Bluespec compiler in FSharp.

Both designs were loaded onto FPGA and demonstrated in my office to Mike Gordon and the rest of the Cake ML team.

MODERN IMPACT
There is a massive demand for provably secure systems. A fresh start, like those of Kimmitt and Cake ML, is a good approach, especially when performance penalties are tolerable. A recent example is home gateway firewalls. With broadband speeds in the tens of Mbps, the processing requirements for the firewall are modest and a relatively poor CPU implementation on an FPGA can serve well. But it must be completely secure. Engineers are turning to synthesised soft cores for this reason.
Cake ML CPU Initial Design
   
The Cake ML project is going from strength to strength.

I am currently working on techniques for advanced processor synthesis from ISA specifications. I will return to the current Cake ML specifications as a test case. (PROJECT : ieeexplore.ieee.org/abstract/document/7281847) (PUBLICATION : technotes-djg.blogspot.com/2013)


END