#8. Hopeless Hidden Diamonds
How latent research helped create ChatGPT, design stealth aeroplanes, and win the Cold War.
I am currently reading Ben Rich’s memoir. I’m only about 150 pages in, but so far I would highly recommend it. I will write more about it at a later date, but I found one snippet particularly interesting because it relates to an article I wrote previously, in which I advocated for researchers to openly publish their research questions.
Here is a story of research that was published, received little engagement, yet was developed elsewhere and significantly altered the course of history.
Skunk Works
Lockheed’s advanced development division is a famed top secret organisation known for creating some of the most innovative aircraft the world has ever seen. Nicknamed the Skunk Works due to its resemblance to a similar factory in a comic strip, it is known for designing such aircraft as the F-117, U-2 and SR-71. Skunk Works is also the title of Ben Rich’s memoir, from which this story is taken.
Today, Rich is known as the “Father of Stealth”, but stealth was hugely unpopular at the time. The Air Force had no real interest in it, and the conventional Pentagon view was that radar had advanced so much that it was pointless to try to thwart it with stealth.
That was until one April afternoon, when 36-year-old Denys Overholser dropped by Rich’s office and presented him with what Rich would describe as the “Rosetta Stone” of breakthroughs. What Overholser had discovered would enable the US to build planes that were nearly invisible to the most advanced radar systems yet invented, and set it on course to build aeroplanes such as the F-117.
“Denys had discovered this nugget deep inside a long dense technical paper on radar written by one of Russia’s leading experts and published in Moscow nine years earlier. The paper was a sleeper in more ways than one: called “Method of Edge Waves in the Physical Theory of Diffraction,” it had only recently been translated by the Air Force Foreign Technology Division from the original Russian language. The author was Pyotr Ufimtsev, chief scientist at the Moscow Institute of Radio Engineering. As Denys admitted, the paper was so obtuse and impenetrable that only a nerd’s nerd would have waded through it all - underlining yet! The nuggets Denys unearthed were found near the end of its forty pages. As he explained it, Ufimtsev had revisited a century-old set of formulas derived by Scottish physicist James Clerk Maxwell and later refined by the German electromagnetic expert Arnold Johannes Sommerfeld. These calculations predicted the manner in which a given geometric configuration would reflect electromagnetic radiation. Ufimtsev had taken this early work a step further.
“Ben, this guy has shown us how to accurately calculate radar cross sections across the surface of the wing and at the edge of the wing and put together these two calculations for an accurate total.”
Denys saw my blank stare. Radar cross section calculations were a branch of medieval alchemy as far as the noninitiated were concerned. Making big objects appear tiny on a radar screen was probably the most complicated, frustrating, and difficult part of modern warplane designing. A radar beam is an electromagnetic field, and the amount of energy reflected back from the target determines its visibility on radar. For example, our B-52, the mainstay long-range bomber of the Strategic Air Command for more than a generation, was the equivalent of a flying dairy barn when viewed from the side on radar. Our F-15 tactical fighter was as big as a two-story Cape Cod house with a carport. It was questionable whether the F-15 or the newer B-70 bomber would be able to survive the ever-improving Soviet defensive net. The F-111 tactical fighter-bomber, using terrain-following radar to fly close to the deck and "hide" in ground clutter, wouldn't survive either. Operating mostly at night, the airplane's radar kept it from hitting mountains, but as we discovered in Vietnam, it also acted like a four-alarm siren to enemy defenses that picked up the F-111 radar from two hundred miles away. We desperately needed new answers, and Ufimtsev had provided us with an "industrial-strength" theory that now made it possible to accurately calculate the lowest possible radar cross section and achieve levels of stealthiness never before imagined.
“Ufimtsev has shown us how to create computer software to accurately calculate the radar cross section of a given configuration, as long as it's in two dimensions," Denys told me. "We can break down an airplane into thousands of flat triangular shapes, add up their individual radar signatures, and get a precise total of the radar cross section."
Why only two dimensions and why only flat plates?
Simply because, as Denys later noted, it was 1975 and computers weren't yet sufficiently powerful in storage and memory capacity to allow for three-dimensional designs, or rounded shapes, which demanded enormous numbers of additional calculations. The new generation of super-computers, which can compute a billion bits of information in a second, is the reason why the B-2 bomber, with its rounded surfaces, was designed entirely by computer computations.
Denys's idea was to compute the radar cross section of an airplane by dividing it into a series of flat triangles. Each triangle had three separate points and required individual calculations for each point by utilizing Ufimtsev's calculations. The result we called "faceting" - creating a three-dimensional airplane design out of a collection of flat sheets or panels, similar to cutting a diamond into sharp-edged slices.
As his boss, I had to show Denys Overholser that I was at least as intellectual and theoretical as Ufimtsev, so I strummed on my desk importantly and said, "If I understand you, the shape of the airplane would not be too different from the airplane gliders we folded from looseleaf paper and sailed around the classroom behind the teacher's back."
Denys awarded me a "C+" for that try.
The Skunk Works would be the first to try to design an airplane composed entirely of flat, angular surfaces. I tried not to anticipate what some of our crusty old aerodynamicists might say. Denys thought he would need six months to create his computer software based on Ufimtsev's formula. I gave him three months. We code-named the program Echo I. Denys and his old mentor, Bill Schroeder, who had come out of retirement in his eighties to help him after serving as our peerless mathematician and radar specialist for many years, delivered the goods in only five weeks. The game plan was for Denys to design the optimum low observable shape on his computer, then we'd build the model he designed and test his calculations on a radar range.
[…]
Denys Overholser reported back to me on May 5, 1975, on his attempts to design the stealthiest shape for the competition. He was wearing a confident smile as he sat down on the couch in my office with a preliminary designer named Dick Scherrer, who had helped him sketch out the ultimate stealth shape that would result in the lowest radar observability from every angle. What emerged was a diamond beveled in four directions, creating in essence four triangles. Viewed from above the design closely resembled an Indian arrowhead.
Denys was a hearty outdoorsman, a cross-country ski addict and avid mountain biker, a terrific fellow generally, but inexplicably fascinated by radomes and radar. That was his specialty, designing radomes - the jet’s nose cone made out of noninterfering composites, housing its radar tracking system. It was an obscure, arcane specialty, and Denys was the best there was. He loved solving radar problems the way that some people love crossword puzzles.
"Boss," he said, handing me the diamond-shaped sketch, "Meet the Hopeless Diamond."
"How good are your radar-cross-section numbers on this one?" I asked.
"Pretty good." Denys grinned impishly. "Ask me, 'How good?'"
I asked him and he told me. "This shape is one thousand times less visible than the least visible shape previously produced at the Skunk Works."
"Whoa!" I exclaimed. "Are you telling me that this shape is a thousand times less visible than the D-21 drone?"1
“You've got it!" Denys exclaimed.
“If we made this shape into a full-size tactical fighter, what would be its equivalent radar signature... as big as what— a Piper Cub, a T-38 trainer...what?"
Denys shook his head vigorously. "Ben, understand, we are talking about a major, major, big-time revolution here. We are talking infinitesimal."
“Well,” I persisted, "what does that mean? On a radar screen it would appear as a...what? As big as a condor, an eagle, an owl, a what?"
"Ben," he replied with a loud guffaw, "try as big as an eagle's eyeball."2
“When the transformer paper came out, I don’t think anyone at Google realised what it meant.” - Sam Altman
"Senior Soviet designers were absolutely uninterested in my theories" - Pyotr Ufimtsev
It’s important to stress here that the man who originally came up with the theory, Pyotr Ufimtsev, was a Soviet Russian: one of the very people America was up against in the Cold War. Ufimtsev didn’t hide his research. Nor did he collude with the Americans. Instead, its importance was simply never realised by those with the power to act. It was a hidden gem (diamond!) that the Americans managed to find.
It’s remarkable how similar this is to the story of transformers (the AI architecture used to build Large Language Models), the theory behind which was originally published by Google employees. And yet it was Sam Altman, as OpenAI CEO, who pounced on the idea, having been alerted to its promise by Ilya Sutskever.3 Similarly, it was engineers at Nokia, not Apple, who first proposed an internet-ready touchscreen phone and an online app store.4 But again, the idea was dismissed by management.
How different would history be if it had been the Soviets who understood the importance of Ufimtsev’s research? Could we all be walking around with Nokias in our pockets today if their project had been realised? How different would the future be if it hadn’t been Sam Altman who understood the transformer paper, but someone at a different organisation?
How many more hidden gems (Hopeless Diamonds) are there?5 Who will find them?6 How will they be used? And can we try to mine more of them? If so, how?7
Further Reading
Safi Bahcall, Loonshots: How to Nurture the Crazy Ideas that Win Wars, Cure Diseases, and Transform Industries.
Ben Rich and Leo Janos, Skunk Works: A Personal Memoir of My Years at Lockheed.
8 Google Employees Invented Modern AI. Here’s the Inside Story, WIRED.
Kelly Johnson and Ben Rich had a wager on the difference in stealthiness between the D-21 and the Hopeless Diamond.
Ben Rich, having backed the Diamond, won a quarter from Johnson.
This is taken verbatim from pages 18-21 and 26-27 of Rich’s book.
“As [Noam] Shazeer recalls it, he was walking down a corridor in Building 1965 and passing [Łukasz] Kaiser’s workspace. He found himself listening to a spirited conversation. “I remember Ashish [Vaswani] was talking about the idea of using self-attention, and Niki [Parmar] was very excited about it. I’m like, wow, that sounds like a great idea. This looks like a fun, smart group of people doing something promising.” Shazeer found the existing recurrent neural networks “irritating” and thought: “Let’s go replace them!”
In the higher echelons of Google, however, the work was seen as just another interesting AI project. I asked several of the transformers folks whether their bosses ever summoned them for updates on the project. Not so much. But “we understood that this was potentially quite a big deal,” says [Jakob] Uszkoreit. “And it caused us to actually obsess over one of the sentences in the paper toward the end, where we comment on future work.”
Google, as almost all tech companies do, quickly filed provisional patents on the work. The reason was not to block others from using the ideas but to build up its patent portfolio for defensive purposes. (The company has a philosophy of “if technology advances, Google will reap the benefits.”)
The picture internally is more complicated. “It was pretty evident to us that transformers could do really magical things,” says Uszkoreit. “Now, you may ask the question, why wasn’t there ChatGPT by Google back in 2018? Realistically, we could have had GPT-3 or even 3.5 probably in 2019, maybe 2020. The big question isn’t, did they see it? The question is, why didn’t we do anything with the fact that we had seen it? The answer is tricky.”
Many tech critics point to Google’s transition from an innovation-centered playground to a bottom-line-focused bureaucracy. As [Aidan] Gomez told the Financial Times, “They weren’t modernizing. They weren’t adopting this tech.” But that would have taken a lot of daring for a giant company whose technology led the industry and reaped huge profits for decades. Google did begin to integrate transformers into products in 2018, starting with its translation tool. Also that year, it introduced a new transformer-based language model called BERT, which it started to apply to search the year after.
But these under-the-hood changes seem timid compared to OpenAI’s quantum leap and Microsoft’s bold integration of transformer-based systems into its product line. When I asked CEO Sundar Pichai last year why his company wasn’t first to launch a large language model like ChatGPT, he argued that in this case Google found it advantageous to let others lead. “It’s not fully clear to me that it might have worked out as well. The fact is, we can do more after people had seen how it works,” he said.
[…]
As OpenAI CEO Sam Altman told me last year, “When the transformer paper came out, I don’t think anyone at Google realized what it meant.”
Source: 8 Google Employees Invented Modern AI. Here’s the Inside Story
“In 2004, a handful of excited Nokia engineers created a new kind of phone: internet-ready, with a big color touchscreen display and a high-resolution camera. They proposed another crazy idea to go along with the phone: an online app store. The leadership team—the same widely admired, cover-story leadership team—shot down both projects. Three years later, the engineers saw their crazy ideas materialize on a stage in San Francisco. Steve Jobs unveiled the iPhone. Five years later, Nokia was irrelevant. It sold its mobile business in 2013. Between its mobile peak and exit, Nokia’s value dropped by roughly a quarter trillion dollars.” - Source: Loonshots, p. 10.
In the paper Undiscovered Public Knowledge (1986), Don R. Swanson argues that significant latent knowledge exists which, if properly retrieved and combined, could lead to new scientific discoveries.
“After laying this philosophical groundwork, Swanson transitioned to proving, through a series of articles in the late 1980s, that…undiscovered public knowledge does indeed exist…Swanson postulated that the existence of articles linking two concepts A ↔ B and another collection linking B ↔ C indicates novel hypotheses may be found by investigating the yet-undiscovered link A ↔ C. His early implementations of this idea were successful. Noting that the literature on fish oil described outcomes related to improved blood circulation and that the literature on Raynaud’s syndrome described a blood circulation disorder, Swanson hypothesized in 1986 that fish oil could be consumed to ameliorate the symptoms of Raynaud’s syndrome (Swanson 1986). Noting that magnesium produces physiological changes related to those associated to migraine susceptibility, Swanson proposed in 1988 that magnesium had potential as a supplement for alleviating migraines (Swanson, 1988). In 2006, Swanson proposed that endurance athletes could be at increased risk of suffering atrial fibrillation, hypothesizing inflammation as the mediating link (Swanson, 2006). Each of these literature-driven hypotheses were later validated in subsequent clinical trials (DiGiacomo, Kremer, & Shah, 1989; Gordon & Lindsay, 1996; Gallai et al., 1992; Mont, Elosua, & Brugada, 2009).”
It’s important to stress here that the sheer breadth of science means numerous false-positive links are likely to emerge. The above article makes the point that Swanson’s linking is best understood not as an objective literature-based discovery method, but as a method for generating post-hoc evidence to support subjective speculation. This isn’t to downplay its importance, but rather to be clear about how it should be presented.
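The A ↔ B, B ↔ C linking that Swanson describes can be sketched in a few lines of Python. This is a minimal illustration, not Swanson's actual procedure: the function name `abc_candidates`, the toy link lists, and the bridging term "vascular tone" for magnesium and migraine are all assumptions made for the example (the quoted passage does not name that intermediate concept).

```python
from collections import defaultdict


def abc_candidates(links_ab, links_bc, known_ac=frozenset()):
    """Propose A-C pairs that share at least one bridging concept B.

    links_ab: iterable of (a, b) pairs found in one body of literature.
    links_bc: iterable of (b, c) pairs found in another body of literature.
    known_ac: (a, c) pairs already linked directly, which are skipped.
    Returns a dict mapping (a, c) to the sorted list of bridging B terms.
    """
    # Index the A terms by the B concept they are linked to.
    b_to_a = defaultdict(set)
    for a, b in links_ab:
        b_to_a[b].add(a)

    # Every B shared by both lists yields a candidate A-C hypothesis.
    candidates = defaultdict(set)
    for b, c in links_bc:
        for a in b_to_a.get(b, ()):
            if a != c and (a, c) not in known_ac:
                candidates[(a, c)].add(b)

    return {pair: sorted(bridges) for pair, bridges in candidates.items()}


# Toy data (illustrative only). "vascular tone" is a hypothetical bridge.
LINKS_AB = [
    ("fish oil", "blood circulation"),
    ("magnesium", "vascular tone"),
]
LINKS_BC = [
    ("blood circulation", "Raynaud's syndrome"),
    ("vascular tone", "migraine"),
]

if __name__ == "__main__":
    for (a, c), bridges in abc_candidates(LINKS_AB, LINKS_BC).items():
        print(f"{a} -> {c} via {', '.join(bridges)}")
```

Run on these two toy lists, the function proposes fish oil for Raynaud's syndrome and magnesium for migraine. It proposes a pair for every shared B term, whether or not the link is meaningful, which is one way to see why false positives multiply as the literature grows.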