# ملاحظات

#### مقدمة

(1)

For these and more indicators of the global
progress achieved through the ideas of the
Enlightenment, see Pinker 2018.
#### الفصل الأول: ما هي الخوارزمية؟

(1)

“The Algorithmic Age” was aired on
February 8, 2018, on *Radio Open Source*.

(2)

For an account of algorithms in ancient
Babylon, see Knuth 1972.
(3)

The algorithm for distributing a number
of pulses in timing slots in the SNS was given by
Eric Bjorklund (1999). Godfried Toussaint (2005)
noticed the parallel with rhythms, and his work is
the basis for our exposition. For a more extensive
discussion, see Demaine et al. 2009. For a
book-length treatment of algorithms and music, see
Toussaint 2013.
(4)

The criteria come from Donald Knuth
(1997, sec. 1), who also starts his exposition with
Euclid’s algorithm.
(5)

For a discussion of the enumeration of
the paths on the grid, see Knuth 2011, 253–255; it
is the source for the example and path images. For
the algorithm that gives the number of possible
paths, see Iwashita et al.
2013.
(6)

For these number descriptions, see
Tyson, Strauss, and Gott 2016, 18–20. In Dave
Eggers’s novel *The Circle*, a thinly disguised technology company calculates the number of grains of sand in the Sahara Desert.

(7)

To fold paper
times, the paper must be large enough. If
you fold it always along the same dimension, you
will need a long sheet of paper. The length is given
by the formula
, where
is the paper’s thickness and
is the number of folds. If you fold a
square sheet of paper in alternate directions, then
the width of the square must be
. The reason why the
formulas are more complicated than simple powers of
two is that every time you fold the paper, you lose
some part of it as it curves along the edge of the
fold; it’s from calculating these curves that π
enters the picture in these formulas. The formulas
were found in 2002 by Britney Crystal Gallivan, then
a junior in high school. She went on to demonstrate
that a 1,200 meters–long sheet of toilet paper could
be folded in half 12 times. For a nice introduction
to the power of powers (including this example), see
Strogatz 2012, chapter 11.
(8)

“Transistor Count,” Wikipedia,
https://en.wikipedia.org/wiki/Transistor_count.
(9)

That is because to compare
items between them, you need to take one
of them and compare it to all the other
items, then you take another one and
compare it to the other
items (you have already compared it to
the first item you used), and so on. That gives
comparisons. Then you get
, because according to
the definition of
, if your algorithm runs in
time
, it will certainly run
in time
.
#### الفصل الثاني: التمثيلات البيانية

(1)

Image retrieved from the Wikipedia Commons
at
https://commons.wikimedia.org/wiki/File:Konigsberg_Bridge.png.
The image is in the public.
(2)

The paper (Eulerho 1736) is available from
the Euler Archive
(http://eulerarchive.maa.org),
maintained by the Mathematical Association of America.
For an English translation, see Biggs, Lloyd, and Wilson
1986.
(3)

The literature on graphs is vast, as is the
subject itself. For a good starting point, see Benjamin,
Chartrand, and Zhang 2015.
(4)

Image from the original publication
(Eulerho 1736) retrieved from the Wikipedia Commons at
https://commons.wikimedia.org/wiki/File:Solutio_problematis_ad_geometriam_situs_pertinentis,_Fig._1.png.
The image is in the public
domain.
(5)

Image from Kekulé 1872, retrieved from
the Wikipedia at
https://en.wikipedia.org/wiki/Benzene#/media/File:Historic_Benzene_Formulae_Kekul%C3%A9_(original).png.
The image is in the public
domain.
(6)

For the original publication in German
see Hierholzer 1873.
(7)

For more details on Hierholzer’s
algorithm and other algorithms for Eulerian paths,
see Fleischner 1991. For the use of graphs in genome
assembly, see Pevzner, Tang, and Waterman 2001;
Compeau, Pevzner, and Tesler
2011.
(8)

For an analysis of the optimality of
the greedy algorithm for online edge coloring, as
well as the example of the starlike graph to show
the worst case, see Bar-Noy, Motwani, and Naor
1992.
(9)

In the original fable, the two
characters are an ant and cicada. These two
characters also feature in Latin translations of the
original ancient Greek and Jean de La Fontaine’s
retelling of the fable in
French.
(10)

The invention episode is recounted by
Dijkstra in his interview in Misa and Frana
2010.
#### الفصل الثالث: البحث

(1)

For the first description of the
Matthew effect, see Merton 1968. For overviews of
the range of phenomena manifesting unequal
distributions, see Barabási and Márton 2016; West
2017. For the stadium height and wealth disparity,
see Taleb 2007.
(2)

John McCabe (1965) presented a
self-organized search. For analyses of the
performance of the move-to-front and transposition
methods, see Rivest 1976; Bachrach, El-Yaniv, and
Reinstädtler 2002.
(3)

The secretary problem appeared in
Martin Gardner’s column in February 1960 in
*Scientific American*. A solution was given in the March 1960 issue. For its history, see Ferguson 1989. J. Neil Bearden (2006) provided the solution for the not all-or-nothing variant. Matt Parker (2014, chapter 11) presents the problem, along with several other mathematical ideas and an introduction to computers.

(4)

Binary search goes back to the dawn of
the computer age (Knuth 1998). John Mauchly, one of
the designers of the ENIAC, the first
general-purpose electronic digital computer,
described it in 1946. For the checkered history of
binary search, see Bentley 2000; Pattis 1988; Bloch
2006.
#### الفصل الرابع: الترتيب

(1)

Hollerith 1894.
(2)

Selection and insertion sort have been
with us since the dawn of computers; they were
included in a survey of sorting published in the
1950s (Friend 1956).
(3)

According to Knuth (1998, 170), the
idea behind radix sort that we have seen here seems
to have been around at least since the
1920s.
(4)

Flipping the coin 226 times follows
from 1/52! ≈ (1/2)^{226}. The example of picking an atom from the earth is from David Hand (2014), according to whom probabilities less than one in 10

^{50}are negligible on the cosmic scale.

(5)

See Hoare 1961a, 1961b,
1961c.
(6)

For more on randomized algorithms, see
Mitzenmacher and Upfal 2017.
(7)

For an account of von Neumann’s life
and the environment around the origins of digital
computers, see Dyson 2012. For a presentation of von
Neumann’s merge sort program, see Knuth
1970.
#### الفصل الخامس: خوارزمية بيج رانك

(1)

The original PageRank algorithm was
published by Brin and Page (1998). We glossed over
the mathematics used by the algorithm. For a more
in-depth treatment, see Bryan and Leise 2006. For an
introduction to search engines and PageRank, see
Langville and Meyer 2006; Berry and Browne 2005.
Apart from PageRank, another important algorithm
used for ranking is Hypertext Induced Topic Search,
or HITS (Kleinberg 1998, 1999), developed before
PageRank. Similar ideas had been developed in other
fields (sociometry, the quantitative study of social
relationships, and econometrics, the quantitative
study of economic principles) much earlier, going
back to the 1940s (Franceschet
2011).
#### الفصل السادس: التعلُّم العميق

(1)

Although today we can use technology to
see neurons in much greater detail, Ramón y Cajal
was a pioneer, and his drawings rank among the most
elegant illustrations in the history of science. You
can find neuron images aplenty on the web, but this
image is enough for us, and a simple web search
should convince you of the beauty and enduring power
of Ramón y Cajal’s illustrations. The image is in
the public domain, retrieved from
https://commons.wikimedia.org/wiki/File:PurkinjeCell.jpg.
(2)

To be accurate, sigmoid would refer to
the Greek letter sigma, which is Σ, yet its
appearance is closer to the Latin
S.
(3)

The tangent of an angle is defined as
the ratio of the opposite side to the adjacent side
in a straight triangle, or equivalently, by the sine
of the angle divided by the cosine of the angle in
the unit circle. The hyperbolic tangent is defined
as the ratio of the hyperbolic sine by the
hyperbolic cosine of an angle on a
hyperbola.
(4)

Warren McCulloch and Walter Pitts
(1943) proposed the first artificial neuron. Frank
Rosenblatt (1957) described the Perceptron. If they
are more than half a century old, how come neural
networks have become all the rage recently? Marvin
Minsky and Seymour Papert (1969) struck a major blow
to Perceptrons in their famous book of the same
name, which showed that a single Perceptron had
fundamental computing limitations. This, coupled
with the hardware limitations of the time, ushered
in a so-called winter in neural computation, which
lasted well until the 1980s, when researchers found
how to build and train complex neural networks.
Interest in the field then revived, but still a lot
more work was required to advance neural networks to
the media-grabbing results that we have been seeing
in the last few years.
(5)

One of the challenges in neural
networks is that the notation can be off-putting and
hence the material seems approachable only to the
initiated. In fact, it is not that complicated once
you know what it is about. You often see
derivatives; the derivative of a function
with respect to
is written
. The partial derivative
of a function
of many variables, say,
, is written
. The gradient is
written
.
(6)

The backpropagation algorithm came onto
the scene in the mid-1980s (Rumelhart, Hinton, and
Williams 1986), although various derivations of it
had appeared back in the
1960s.
(7)

This image is from the Fashion-MNIST
data (Xiao, Rasul, and Vollgraf 2017), which was
developed as a benchmark data set for machine
learning. This section was inspired by the basic
classification Tensor Flow tutorial at
https://www.tensorflow.org/tutorials/keras/basic_classification.
(8)

For a description of the first system
to beat the Go human champion, see Silver et al.
2016. For an improved system that does not require
human knowledge in the form of previously played
games, see Silver et al.
2017.
(9)

The literature on deep learning is
vast. For a comprehensive introduction to the topic,
see Goodfellow, Bengio, and Courville 2016. For a shorter and more approachable treatment,
see
Charniak 2018. For a concise overview, see LeCun,
Bengio, and Hinton 2015. For deep and machine
learning, see Alpaydin 2016. For a survey of
automated neural architecture search methods, see
Elsken, Hendrik Metzen, and Hutter
2018.
#### الخاتمة

(1)

Besides Turing, other names on the short
list were Mary Anning, Paul Dirac, Rosalind Franklin,
William Herschel and Caroline Herschel, Dorothy Hodgkin,
Ada Lovelace and Charles Babbage, Stephen Hawking, James
Clerk Maxwell, Srinivasa Ramanujan, Ernest Rutherford,
and Frederick Sanger. Babbage, Lovelace, and Turing were
all computer pioneers. Babbage (1791–1871) invented the
first mechanical computer and developed the essential
ideas of modern computers. Lovelace (1815–1852), the
daughter of Lord Byron, worked with Babbage, recognized
the potential of his invention, and was the first to
develop an algorithm that would run on such a machine.
She is now considered to have been the first computer
programmer. For the £50 design, see the official
announcement at
https://www.bankofengland.co.uk/news/2019/july/50-pound-banknote-character-announcement.
(2)

See the excellent biography by Andrew
Hodges (1983). Turing’s role in breaking the German
Enigma cryptographic machine were dramatized in the 2014
film *The Imitation Game*.

(3)

For a description of the machine, see
Turing 1937, 1938.
(4)

The Turing machine example is adapted from
John Hopcroft, Rajeev Motwani, and Jeffrey Ullman (2001,
chapter 8). The figure is based on Sebastian Sardina’s
example at
http://www.texample.net/tikz/examples/turing-machine-2/.
(5)

For more on the Church-Turing thesis, see
Lewis and Papadimitriou 1998, chapter 5. For a
discussion of the history of the Church-Turing thesis
and various variants, see Copeland and Shagrir
2019.