CONTRIB.9FRONT.ORG NO REFUNDS

Advanced Namespace Tools blog

09 January 2019

Time Warp, Hypercubes, and ANTS future

This is part three in a series on hypercubic computing, as used by the nCUBE series of 80s and 90s supercomputers. We are changing direction slightly in this post to examine Time Warp, an algorithm for distributed simulations used in the Time Warp OS on the Jet Propulsion Laboratory Mark II and Mark III hypercubes, and later in many other non-hypercubic simulation systems such as Georgia Tech Time Warp, WARPED simulation kernel, and ROSS.

Life in the 80s: Space-Based Time Warping Hypercubic Robot Weapons Control

The year is 1987. Eddie Murphy fighting crime in Beverly Hills Cop II is #1 at the box office, and Reagan’s Strategic Defense Initiative to create a space-based missile defense is still receiving a lot of military funding. This is the context you need to understand the paper “Virtual-Time Operating-System Functions for Robotics Applications on a Hypercube” (Einstein Bahren Jefferson 1987 Hypercube conference), which envisions a future in which the safety of the USA depends on robotic missile defense weapons in space, controlled by an nCUBE computer running a Time Warp-based Virtual Time control shell. The motivation is that communication and coordination between the robots will be subject to time delays, hence the need for a “virtual time (VT) shell for hypercubic operating systems, intended to facilitate real-time responses to unpredictable events in a robotics application” such as an autonomous military space platform. It seems this project was never finished, possibly because Time Warp is better suited to simulations than to applications with hard real-time requirements, and this was not completely clear during those early years.

Time Warp: insert obligatory Rocky Horror reference here

Time Warp is a system created in the early 1980s by David Jefferson and Henry Sowizral, and continuously developed since by Jefferson and others, for managing parallel discrete event simulations. Its notable characteristic is the idea of ‘optimism with rollback’ to deal with non-deterministic message ordering between the objects being simulated, implemented with a brilliantly clever system of ‘antimessages’ to handle the complex causal interdependencies of message-passing objects changing each other’s state. At each time tick, the state of each object is saved. Whenever an object sends a message to another object, it saves a copy of that message with an opposite ‘sign bit’ as an antimessage. When messages arrive out of time order, objects roll back to a previously saved state and replay their actions, and send antimessages to cancel actions taken as a result of the mis-sequence. Antimessages ‘annihilate’ their corresponding positive message when they are enqueued together.
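The annihilation rule can be made concrete with a few lines of Python. This is only a minimal sketch under stated assumptions: the `Message` fields and the `enqueue` helper are hypothetical names chosen for illustration, not taken from any Time Warp implementation, and a real system would use a more efficient queue than a sorted list.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    send_time: int   # virtual time at which the sender emitted the message
    recv_time: int   # virtual time at which the receiver should process it
    sender: str
    receiver: str
    payload: str
    sign: int = +1   # +1 = ordinary message, -1 = antimessage

    def anti(self) -> "Message":
        """Return a copy that is identical except for the opposite sign."""
        return replace(self, sign=-self.sign)


def enqueue(queue: list, msg: Message) -> bool:
    """Add msg to queue unless its opposite-signed twin is already there.

    Returns True if the pair annihilated, False if msg was enqueued.
    """
    twin = msg.anti()
    if twin in queue:
        queue.remove(twin)   # message and antimessage cancel; neither remains
        return True
    queue.append(msg)
    queue.sort(key=lambda m: m.recv_time)   # keep the queue in timestamp order
    return False
```

Note that the rule is symmetric: in this sketch an antimessage that arrives before its positive twin simply waits in the queue and cancels the positive message when it shows up.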

The soundness and beauty of this system can take some time and thought and study to appreciate. There are also many further details such as a Global Virtual Time (GVT) which is the time at which actions become irreversible and old states may be discarded, and numerous options for controlling scheduling to optimize for particular types of model simulations. The Time Warp system has received a lot of use and study within the discrete event simulation community. Despite its status within that field, it does not seem to be widely known outside of it, but I think the mechanism is of such generality and power that many useful systems can be built using it, provided the necessary preconditions - reversibility of operations - are met.
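As a rough illustration of what GVT buys you, the sketch below computes it from a global snapshot and then discards saved states that can never be needed again. It assumes all local clocks and in-flight message timestamps can be seen at once, which a real distributed implementation cannot do directly; the function names are hypothetical.

```python
def global_virtual_time(local_clocks, in_flight_timestamps):
    """GVT estimate: the earliest virtual time any object could still be
    rolled back to, assuming a consistent global snapshot is available."""
    return min(list(local_clocks) + list(in_flight_timestamps))


def fossil_collect(saved_states, gvt):
    """Drop snapshots that no rollback can ever reach.

    saved_states maps virtual time -> state snapshot. The latest snapshot
    at or before gvt is kept as the earliest possible rollback target,
    along with everything after it.
    """
    at_or_before = [t for t in saved_states if t <= gvt]
    if not at_or_before:
        return dict(saved_states)
    keep_from = max(at_or_before)
    return {t: s for t, s in saved_states.items() if t >= keep_from}
```

The more state you can afford to keep between GVT advances, the further ahead objects can speculate before memory becomes the limit.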

Time Warp pseudocode

A Time Warp simulation is a set of message passing objects which all go through a state transition every tick of their local virtual clock, and can send messages to each other. Pseudocode which is run on each node once per tick:

   advance local virtual time one tick
   add incoming messages to the inqueue according to timestamps
   check if lowest timestamp is below current virtual time
      if so, roll back to that time point
   annihilate message/antimessage pairs in inqueue
   process messages for current virtual time
   update local state
   add outgoing messages to the outqueue
   annihilate message/antimessage pairs in outqueue
   save corresponding antimessages from remaining outgoing messages
   send outbound messages, including antimessages from previous attempts at this time tick
   save a copy of state at this tick
   check GVT (time of oldest unhandled message in simulation)
   commit/flush actions older than GVT

This protocol allows simulation objects to speculatively execute into the future under the assumption that they have received all relevant messages, while remaining able to revert to an earlier state if needed and to cancel all side-effecting actions through the use of antimessages.
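To see optimism with rollback in action, here is a toy single-object sketch in Python. It is an assumption-laden simplification of the per-tick loop, not a rendering of TWOS or any real kernel: the class and method names are hypothetical, the state is just a running total of message values, ticks start at 1, cancelled outputs are collected in a list rather than sent as antimessages, and GVT and fossil collection are left out.

```python
class TimeWarpObject:
    """Toy simulation object whose state is a running total of message values."""

    def __init__(self):
        self.lvt = 0              # local virtual time
        self.state = 0
        self.snapshots = {0: 0}   # virtual time -> state saved after that tick
        self.inputs = {}          # virtual time -> values received for that tick
        self.sent = []            # (tick, output) pairs sent optimistically
        self.cancelled = []       # outputs that would be chased by antimessages

    def receive(self, tick, value):
        """Deliver a message stamped `tick` (>= 1); roll back if it is a straggler."""
        self.inputs.setdefault(tick, []).append(value)
        if tick <= self.lvt:
            self.rollback(tick)

    def rollback(self, tick):
        """Restore the state saved just before `tick` and cancel later outputs."""
        restore = max(t for t in self.snapshots if t < tick)
        self.state = self.snapshots[restore]
        self.snapshots = {t: s for t, s in self.snapshots.items() if t <= restore}
        # Every output sent at or after the rollback point must be cancelled.
        self.cancelled += [m for m in self.sent if m[0] >= tick]
        self.sent = [m for m in self.sent if m[0] < tick]
        self.lvt = restore

    def step(self):
        """Advance one tick, process that tick's messages, then save state."""
        self.lvt += 1
        values = self.inputs.get(self.lvt, [])
        if values:
            self.state += sum(values)
            self.sent.append((self.lvt, self.state))   # announce the new total
        self.snapshots[self.lvt] = self.state

    def run_until(self, end):
        while self.lvt < end:
            self.step()


obj = TimeWarpObject()
obj.receive(2, 10)
obj.receive(5, 1)
obj.run_until(6)       # speculates ahead: state is 11
obj.receive(3, 100)    # straggler from the "past" forces a rollback to tick 2
obj.run_until(6)       # replay with the late message included: state is 111
```

After the replay, the output originally sent at tick 5 with total 11 sits in `cancelled`, and a corrected tick 5 output with total 111 has been sent in its place. This sketch cancels eagerly on rollback; comparing old and new outputs before cancelling is a design alternative it does not attempt.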

Time Warp OS

The first mature implementation of the Time Warp system was on the JPL Mark III Hypercube in the Time Warp OS, which was later ported to the BBN 1000 Butterfly and also SUN 3 and 4. The Time Warp OS was an ambitious (and at least partially successful) attempt to make the Time Warp simulation environment as close to a general purpose operating system as possible. The fundamental precondition of Time Warp is reversibility, which is often difficult to achieve given the large number of state-changing I/O operations that operating systems perform, and given how much state must be tracked relative to the resources available to 80s computers. The Time Warp OS eventually acquired sophisticated facilities such as load-balancing migration of simulation processes, before being discontinued as an active project sometime in the early to mid 90s. The core technology of Time Warp, however, migrated to platforms focused purely on simulation performance (GTW, ROSS), with less attempt to provide general purpose OS-like facilities.

Dreams of Lost Futures

At the same time as the Time Warp OS was nearing its final form, the nCUBE corporation was working on their third generation supercomputer, the nCUBE 3. I have come to believe that this system may never have been released into the wild. The earlier nCUBE 2 existed at several research institutions. The later nCUBE mediacube 4 was also sold as a commercial product. But the nCUBE 3, which was to be the largest and most powerful of all nCUBE supercomputers and to run the Plan 9 based Transit OS, is another story: weeks of dedicated web searching and paper reading have turned up many references to this system as an upcoming product, but nothing that talks about actually using or interacting with a real running system. Hypercubes were no longer the hot thing in massively parallel architectures, and the switch from the entirely internally developed Axis/Vertex and nCX operating systems to Plan 9 based Transit probably reflects a troubled development process. I really wonder if any nCUBE 3 systems made it out the door. I am still fascinated by this history, and correspondence on nCUBE, Time Warp OS, or related topics is welcome at mycroftiv AT sphericalharmony.com.

What I have learned of this era of supercomputing (which I lived through and knew nothing about as a young adult) fills me with nostalgia for a past future that never quite was - one where 65,536-node nCUBE systems with a Plan 9 based OS ran Time Warping control shells to control space robots. The expansion of humanity through space was managed by elite “cube jockeys” who expertly navigated higher dimensional information spaces and issued precisely crafted rollback messages to account for changing conditions and communication latency. A lot of very smart people thought something like this made sense, and worked hard to build the systems and software for it. The future didn’t come out quite like that, but maybe it could have. Can we roll back to a future we missed?

Where the wave finally broke and rolled back

I think it all still makes sense. The tidal wave of Moore’s law changed computing so fast in the past few decades that many, many good ideas got lost under the surface and pulled into the depths of obscurity or specialized academic roles. Time Warp OS was forced into many compromises because the condition of reversibility means that you need to save a lot of state, and bytes were expensive back in those days. We can afford to spend those bytes to save a lot more state now, so we can do a lot more to buffer I/O and allow the user to interact with the simulation in real time. Everything computers do is a simulation, in a sense - a simulation of an operating system is still an operating system. Time Warp makes the most sense when things are expressed in terms of distributed message passing - and thanks to 9P and its general design, Plan 9 is much more comfortable expressing itself in those terms than most operating systems.

Squinting at the history that I can see between the lines of all these blurry old 80s photocopied research papers is a reminder of the context of Plan 9’s creation. The late 80s massively parallel machines were often patterning their operating systems on unix. First generation nCUBE Axis/Vertex treated the hypercube as a unix style device file. In adapting unix to parallel distributed MIMD systems, a lot of pain points were found, and Plan 9 can be read as a response to a feeling in the mid to late 80s that parallel distributed architectures were the future and that traditional unix was not quite suited for the task. Perhaps this is unwarranted speculation, because the Plan 9 papers make no mention of supercomputing as an application. Perhaps the kind of high-speed interconnect between cpus used in hypercubes seemed inherently different from local and wide area networks.

Discovering not just nCUBE and Time Warp, but the many other unique hardware platforms and operating systems of that era, hypercubic and not, puts Rob Pike’s noted “Utah 2000” talk on the fading relevance of operating systems research in context. Just a few years before, the innovators in parallel systems and software felt they had all the momentum and were riding the crest of a high and beautiful wave. And they were - but the bigger wave of Moore’s Law in the 80s and 90s overtook them.

Hypercubic Time Travelling ANTS

For reasons that are more mystic than rational, based on strange convergences of names and narratives, I am utterly compelled to see the just-barely-missed union of Plan 9, nCUBE, and Time Warp as a parallel computing reality I need to reach. I don’t know the precise path. Plan 9 thankfully is alive and well. Time Warp OS exists as an artifact but it is probably better to build on the general Time Warp principle than to reference the specifics of TWOS. As for nCUBE Transit and the earlier nCUBE operating systems…it would be a wonderful miracle if that code was available, but I’m not holding my breath.

The goal is to build something like TWOS hosted within Plan 9, and use a distributed hypercubic network of ANTS container/VPS nodes rather than hardwired boards of CPUs for the platform. The very preliminary proof of concept explorations of hypercubic computing in the past two blog posts have shown me that 9front/ants can prototype hypercubic systems easily. I believe that hubfs can be adapted into a Time Warp engine component, because it already manages multiple message spindles of reads/writes between multiple clients.

At this point, I am not expecting this to produce anything more substantial or usable than a conceptually interesting toy. For entirely personal reasons, I need to run simulations of the orbits of lunar induction catapult launches, with a redundantly routed control layer whose topology is determined by the set of names used. The ability to ‘snoop’ speculatively into alternate futures is essential to Time Corps operations, and only a powerful computer can effectively scan the possibility space and roll back undesired outcomes. As fate would have it, Plan 9 hypercubic Time Warp simulation fits the bill perfectly. Consequently, I announce additional ANTS semantic re-binds:

ANTS stands for Alternate Ncubic Timeline System

ANTS stands for Adam “Network TANSTAAFL” Selene

References

“Fast Concurrent Simulation using the Time Warp Mechanism” Jefferson Sowizral 1982 https://www.rand.org/content/dam/rand/pubs/notes/2007/N1906.pdf

“Virtual Time” Jefferson 1985 http://cobweb.cs.uga.edu/~maria/pads/papers/p404-jefferson.pdf

“Virtual Time and Time Warp on the JPL Hypercube” Jefferson Beckman 1986 https://books.google.com/books?id=QN8HNVwZEecC&pg=PA121&lpg=PA121&dq=plan+9+hypercube&source=bl&ots=QJzmtsJDgM&sig=BQ2w5MtIfPMiT-mw90CdVTpopH4&hl=en&sa=X&ved=2ahUKEwjUkN6xkt3fAhViyoMKHRNiCas4ChDoATADegQIBxAB#v=onepage&q=plan%209%20hypercube&f=false

“Virtual-Time Operating-System Functions for Robotics Applications on a Hypercube” Einstein Bahren Jefferson 1987 https://books.google.com/books?id=fEbjEWonG0UC&pg=PA106&lpg=PA106&dq=time+warp+os+source+code&source=bl&ots=fB5mnAPHK9&sig=bxQNEG6cmq-PYg7J55Vhw-7g9PE&hl=en&sa=X&ved=2ahUKEwi6qaGz2d3fAhUq6oMKHQNECM04ChDoATAEegQIBRAB#v=onepage&q=time%20warp%20os%20source%20code&f=false

“Distributed simulation and the Time Warp Operating System” Jefferson et al. 1987 https://cs.nyu.edu/srg/docs/p77-jefferson.pdf

Time Warp Operating System v2.7 Internals manual JPL D-9516 1992 https://apps.dtic.mil/dtic/tr/fulltext/u2/a271489.pdf

“A network version of the time warp operating system” Bellenot 1993 https://www.researchgate.net/publication/2250407_A_Network_Version_of_the_Time_Warp_Operating_System

“Warp speed: Executing time warp on 1966080 cores” Barnes Carruthers Jefferson Laprey 2013 https://www.researchgate.net/publication/262253402_Warp_speed_Executing_time_warp_on_1966080_cores

https://github.com/ROSS-org/ROSS

“Systems Software Research is Irrelevant” Pike 2000 http://doc.cat-v.org/bell_labs/utah2000/utah2000.html

“The Moon is a Harsh Mistress” Heinlein 1966

Also?

Did this thing actually get sent out into space?

https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19940027927.pdf