这里是王垠发表的历史文章的一个私人备份。所有文章都属于王垠版权所有，放在这里仅仅是为了自己查看方便，因为原始文章会lost in future。
写了一篇英文短文叫《What is aProgram?》希望开始一系列的对程序语言语义，解释器，编译器，优化，静态分析，类型推导，并行计算，逻辑，定理证明的阐述和探索。我希望这些东西到最后能被整理成一本浅显易懂的程序语言入门读物，一方面帮助别人，另一方面也加深自己对这些概念的理解。虽然我尽量让讲解接近直觉，其中有些内容恐怕会很难解释（比如我觉得这篇里面的图没怎么画好），所以可能要多努力一下。如果看不懂或者发现错误的话，欢迎给我指出。
In this article I hope to explore possible answers to the question"What is a program?" from a programming language's perspective.This is in constrast to a usual "machine perspective". While themachine perspective may be broader, it is often more complicated (think of Turing machines). A programming language's perspectivemay be simpler, but it may be overly simplistic to the degree thatit doesn't capture everything that the machine perspective has tooffer. Later revisions may provide more insights into this matter.
Are programs necessary forcomputation?
Programs are usually what we use forcomputation, but are they necessary? I think the answer is No. Tounderstand this, first let us answer the question "What iscomputation?" Simply put, computation is the many ways wesimulate the world. This simulation could happen in thebrain, in a crystal ball, in an electronic circuit, in a chemicalreactor, or more. Originally， the purpose of this simulation is topredict the future (or at least the near future) for our advantagesor survival. This goal is evident in the use of computation inmeteorology， particle physics, biology etc. We want to know how thereal world changes with given conditions and time, so wemodel the world with data structures in a computer, andrun simulations on those "models". To be practical, we have toignore many apsects of the real word, so the models we have areonlyabstractions of it. The models must reflectaccurately the aspects that we care about, withoutover-assumptions, that is, constraints or conditions that doesn'treally exist in the real world. Errors may be tolerable, but shouldbe well-controlled. This is why computation is more and moreimportant in sciences, because it lets us predict the future. Intoday's sciences, it also lets us experiment with our "guesses" ofhow the universe was "built". There are many other aspects ofcomputation, for example as an information processing tool.Information is also part of the world, so we are actually using acomputer for the same purpose: manipulating the world's model. Nowlet's come back to our first question, do we need programs forcomputation? I said that it was not necessary. The reason is that a"computer" can be a special-purpose computerwith only onespecific goal, for example calculating the result of 2+3. Suchsimple circuit may be built using a few logic gates with fixedinputs on the pins. We will get the output as electric signals fromthe output pins. So you see, a program is not really necessary forcomputation. We can build any kind of circuit we want, be itelectrical, chemical or biological, as long as they reflect thereal world, and then we can utilize the physical charateristics ofthe circuit to do our "computation".
Programs are dynamiccircuits
Why do most computers run programsthen? This is because building circuits is time-consuming andusually very expensive, so we want to build a circuit once and useit for many purposes. We call those circuits which can serve morethan one purpose general-purpose computers. For example,in addition to calculating 2+3, they can also calculate 3+4, 2-15,124*45, etc and can do more complicated things. So you see, thosegeneral-purpose computers have the functionalities of many， even awhole class of, special-purpose computers. In order to achievethis, a general-purpose computer simulatesthespecial-purpose computers. That is, it "pretends" to be one of themat a given time. Now the question is, how does the general-purposecomputer know which special-purpose computer to simulate? Theanswer is to use a data structure -- a description of aspecial-purpose computer which the general-purpose computer canmanipulate. This data structure is normally called aprogram. The general-purpose computer is usually calledthe interpreter of the program. Once the interpreter getsa program as input, it behaves like the program. Since aspecial-purpose computer is a circuit, a program is then adescription of a circuit. This is evident from thefollowing example Scheme program which computes the factorial ofthe input number.
[(= n 0) 1]
[else (* n (fact (- n 1)))])))
As we can see, there is aninputn, and the output is either 1 or (* n (fact (- n1)))， depending on whether n is 0. Notice that "fact" refers to thefunction itself. If we build a circuit for computing this function,it would look like:
This circuit can be physicallyimplemented using electronic devices. I haven't drawn all thedetails. You must think of those four n's as connected together bywires, and have the same values at any time. The triangular shapeis a demultiplexer (demux)， it tests whether n is equal to one, and then decides whichoutput to activate. The demux is common to all programminglanguages, sometimes they appear as conditional statements (if,cond, ...)， sometimes they appear as pattern matching, butin essence they are all the same thing. The reason of using them issimply because different values of input can come in, and we needto distinguish them -- we look at them and do different thingsaccording to their values. But notice that this is not a usualcircuit. It is a "dynamic circuit". The botton part denoted by (fact (- n 1)) is not an endpoint. It is aninstantiationof the same circuit by passing (- n 1) as input. So the circuit isgrowing as we compute. For example, when we pass in the initialvalue 5, we end up with a circuit like the following after oneexpansion of the endpoint:
Interpreters are circuitsimulators
So now you may wonder whetherinterpreters are circuit simulators. In fact they are. If you lookat a simple interpreter, you will find that it is using some tricksto simulate the signal-flow in a circuit. Usually the interpretersare recursive, they traverse the sub-components and in themeanwhile carrying signal values in what we call anenvironmentor symbol table, so that they mayknow the values of each wire that may be used. Sometimes the wiresin both a component and its sub-component can be labeled with thesame name, if the wires in the component don't also go into thesub-component. This is called "shadowing". But we should rememberat all times that the wires that are labeled by the same name arein fact distinct wires and have absolutely nothing to do with eachother. We use the same names just for our notational convenience.If we don't reuse names, we may run out of unique names very soonin large circuits.
Lexical Scoping v.s. DynamicScoping
We can also explain why the notionof lexical scoping (or static scoping) is morereasonable than dynamic scoping. This because when wewrite the definition of a function that is later used as a value,the free variables inside the function refer to the names in anouter scope of the definition site. When we pass the functionaround as a value, those free variables (as electronic pins) shouldstill be "connected" to their original wires. If we use dynamicscoping, those variables may be reconnected to a wire labeled withthe same name but at the call site. This is not correct, becausethose two wires with the same name are different wires, and thushave completely different meanings. Ourintention isprobably to use the wire at the definition site because that iswhat we see when we write the function. We probably have nointention of connecting to a random wire with the same namesomewhere else. This why programming languages should not usedynamic scoping. This intuition of a program as a circuit is veryuseful. It can guide us to become aware of the most import aspectsof programming languages and help us become better programmers. Iwill talk about compilers, parallel computing, type inference,logics and theorem provers in later posts using this generalframework of thinking.