Software is a set of computer programs and associated documentation and data.[1] This is in contrast to hardware, from which the system is built and which actually performs the work.
At the lowest programming level, executable code consists of machine language instructions supported by an individual processor—typically a central processing unit (CPU) or a graphics processing unit (GPU). Machine language consists of groups of binary values signifying processor instructions that change the state of the computer from its preceding state. For example, an instruction may change the value stored in a particular storage location in the computer—an effect that is not directly observable to the user. An instruction may also invoke one of many input or output operations, for example, displaying some text on a computer screen, causing state changes that should be visible to the user. The processor executes the instructions in the order they are provided, unless it is instructed to "jump" to a different instruction or is interrupted by the operating system. As of 2023, most personal computers, smartphones, and servers have processors with multiple execution units, or multiple processors performing computation together, so computing has become a much more concurrent activity than in the past.
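The sketch below is a purely illustrative toy in Python, not real machine code: it models a handful of made-up instructions that change the state of a small "memory" and a jump that alters the order in which instructions are executed.

```python
# Illustrative toy only: a made-up instruction set, not a real processor's.
# Each instruction either changes the machine's state (its storage cells),
# performs output, or jumps to another instruction.
def run(program, memory):
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        if op == "store":            # write a constant into a storage location
            memory[args[0]] = args[1]
        elif op == "add":            # add one cell's value into another
            memory[args[0]] += memory[args[1]]
        elif op == "jump_if_less":   # jump: continue at a different instruction
            if memory[args[0]] < args[1]:
                pc = args[2]
                continue
        elif op == "print":          # an output operation visible to the user
            print(memory[args[0]])
        pc += 1                      # otherwise, execute instructions in order

# Count from 0 to 4 by repeatedly adding and jumping back.
run([("store", "x", 0), ("store", "one", 1), ("print", "x"),
     ("add", "x", "one"), ("jump_if_less", "x", 5, 2)],
    memory={})
```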
The majority of software is written in high-level programming languages. They are easier and more efficient for programmers because they are closer to natural languages than machine languages.[2] High-level languages are translated into machine language using a compiler, an interpreter, or a combination of the two. Software may also be written in a low-level assembly language that has a strong correspondence to the computer's machine language instructions and is translated into machine language using an assembler.
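As a small, concrete illustration of this translation, CPython (one implementation of Python) compiles source code into bytecode instructions that its virtual machine then interprets; the standard-library dis module can display those instructions. The exact output depends on the Python version.

```python
import dis

def add_one(x):
    return x + 1

# Show the lower-level instructions CPython generates for this function.
dis.dis(add_one)
```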
An algorithm for what would have been the first piece of software was written by Ada Lovelace in the 19th century, for the planned Analytical Engine.[3] She created proofs to show how the engine would calculate Bernoulli numbers.[3] Because of the proofs and the algorithm, she is considered the first computer programmer.[4][5]
The first theory about software, prior to the creation of computers as we know them today, was proposed by Alan Turing in his 1936 essay, On Computable Numbers, with an Application to the Entscheidungsproblem (decision problem).[6] This eventually led to the creation of the academic fields of computer science and software engineering; both fields study software and its creation.[citation needed] Computer science is the theoretical study of computers and software (Turing's essay is an example of computer science), whereas software engineering is the application of engineering principles to the development of software.[7]
In 2000, Fred Shapiro, a librarian at the Yale Law School, published a letter revealing that John Wilder Tukey's 1958 paper "The Teaching of Concrete Mathematics"[8][9] contained the earliest known usage of the term "software" found in a search of JSTOR's electronic archives, predating the Oxford English Dictionary's citation by two years.[10] This led many to credit Tukey with coining the term, particularly in obituaries published that same year,[11] although Tukey never claimed credit for any such coinage. In 1995, Paul Niquette claimed he had originally coined the term in October 1953, although he could not find any documents supporting his claim.[12] The earliest known publication of the term "software" in an engineering context was in August 1953 by Richard R. Carhart, in a Rand Corporation Research Memorandum.[13]
On virtually all computer platforms, software can be grouped into a few broad categories.
Based on the goal, computer software can be divided into application software, which uses the computer system to perform special functions beyond the basic operation of the computer itself, and system software (such as operating systems, device drivers, and utilities), which manages hardware behaviour and provides a platform for applications to run.
Programming tools are also software in the form of programs or applications that developers use to create, debug, maintain, or otherwise support software.[17][better source needed]
Software is written in one or more programming languages; there are many programming languages in existence, and each has at least one implementation, each of which consists of its own set of programming tools. These tools may be relatively self-contained programs such as compilers, debuggers, interpreters, linkers, and text editors that can be combined to accomplish a task; or they may form an integrated development environment (IDE), which combines much or all of the functionality of such self-contained tools.[citation needed] IDEs may do this either by invoking the relevant individual tools or by re-implementing their functionality in a new way.[citation needed] An IDE can make it easier to do specific tasks, such as searching in files in a particular project.[citation needed] Many programming language implementations provide the option of using either individual tools or an IDE.[citation needed]
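As a minimal sketch of chaining self-contained tools, the Python script below invokes a C compiler and then runs the resulting program, roughly what an IDE's "build and run" command does behind the scenes. It assumes a compiler named cc is installed; hello.c and hello are hypothetical file names used only for illustration.

```python
import subprocess

# Invoke the compiler/linker as a separate tool, then run the result.
subprocess.run(["cc", "hello.c", "-o", "hello"], check=True)  # compile and link
subprocess.run(["./hello"], check=True)                       # execute the program
```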
People who use modern general-purpose computers (as opposed to embedded systems, analog computers, and supercomputers) usually see three layers of software performing a variety of tasks: platform, application, and user software.[citation needed]
Computer software has to be "loaded" into the computer's storage (such as the hard drive or memory). Once the software has loaded, the computer is able to execute the software. This involves passing instructions from the application software, through the system software, to the hardware which ultimately receives the instruction as machine code. Each instruction causes the computer to carry out an operation—moving data, carrying out a computation, or altering the control flow of instructions.[citation needed]
Data movement is typically from one place in memory to another. Sometimes it involves moving data between memory and registers, which enable high-speed data access in the CPU. Moving data, especially large amounts of it, can be costly; this is sometimes avoided by using "pointers" to data instead.[citation needed] Computations include simple operations such as incrementing the value of a variable data element. More complex computations may involve many operations and data elements together.[citation needed]
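The Python sketch below illustrates the same ideas. Python exposes references to objects rather than raw pointers, but the cost contrast between copying data and merely referring to it is analogous.

```python
data = list(range(1_000_000))   # a large block of data

alias = data             # no data is moved: 'alias' refers ("points") to the same list
duplicate = data.copy()  # copies every element: far more expensive than the reference

counter = 0
counter += 1             # a simple computation: incrementing a variable
```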
Software quality is very important, especially for commercial and system software. If software is faulty, it can delete a person's work, crash the computer, and do other unexpected things. Faults and errors are called "bugs", which are often discovered during alpha and beta testing.[citation needed] Software is often also a victim of what is known as software aging, the progressive performance degradation resulting from a combination of unseen bugs.[citation needed]
Many bugs are discovered and fixed through software testing. However, software testing rarely—if ever—eliminates every bug; some programmers say that "every program has at least one more bug" (Lubarsky's Law).[18] In the waterfall method of software development, separate testing teams are typically employed, but in newer approaches, collectively termed agile software development, developers often do all their own testing and demonstrate the software to users/clients regularly to obtain feedback.[citation needed] Software can be tested through unit testing, regression testing, and other methods, which are done manually or, most commonly, automatically, since the amount of code to be tested can be large.[citation needed] Programs containing command software allow hardware engineering and system operations to function together much more easily.[19]
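A minimal sketch of automated unit testing, using Python's standard unittest module; clamp is a hypothetical function written only for this example, and rerunning the suite after later changes doubles as a simple regression test.

```python
import unittest

def clamp(value, low, high):
    """Keep value within the range [low, high]."""
    return max(low, min(value, high))

class ClampTests(unittest.TestCase):
    # Each test exercises one behaviour of the function under test.
    def test_within_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_below_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_above_range(self):
        self.assertEqual(clamp(42, 0, 10), 10)

if __name__ == "__main__":
    unittest.main()
```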
The software's license gives the user the right to use the software in the licensed environment, and in the case of free software licenses, also grants other rights such as the right to make copies.[20]
Proprietary software can be divided into two types: freeware, which can be used without payment (sometimes only for a limited trial period or with limited functionality), and software available for a fee, which can only be legally used on purchase of a license.
Open-source software comes with a free software license, granting the recipient the rights to modify and redistribute the software.[23]
Software patents, like other types of patents, are theoretically supposed to give an inventor an exclusive, time-limited right to a detailed idea (e.g. an algorithm) on how to implement a piece of software, or a component of a piece of software. Ideas for useful things that software could do, and user requirements, are not supposed to be patentable, and concrete implementations (i.e. the actual software packages implementing the patent) are not supposed to be patentable either—the latter are already covered by copyright, generally automatically. So software patents are supposed to cover the middle area, between requirements and concrete implementation. In some countries, a requirement for the claimed invention to have an effect on the physical world may also be part of the requirements for a software patent to be held valid—although since all useful software has effects on the physical world, this requirement may be open to debate. Meanwhile, American copyright law has been applied to various aspects of the writing of software code.[24]
Software patents are controversial in the software industry, with many people holding different views about them. One source of controversy is that the aforementioned split between initial ideas and patents does not seem to be honored in practice by patent lawyers—for example, the patent for aspect-oriented programming (AOP) purported to claim rights over any programming tool implementing the idea of AOP, howsoever implemented.[citation needed] Another source of controversy is the effect on innovation, with many distinguished experts and companies arguing that software is such a fast-moving field that software patents merely create vast additional litigation costs and risks, and actually retard innovation.[citation needed] In the case of debates about software patents outside the United States, the argument has been made that large American corporations and patent lawyers are likely to be the primary beneficiaries of allowing or continuing to allow software patents.[citation needed]
Design and implementation of software vary depending on the complexity of the software. For instance, the design and creation of Microsoft Word took much more time than designing and developing Microsoft Notepad because the latter has much more basic functionality.[citation needed]
Software is usually developed in integrated development environments (IDEs) like Eclipse, IntelliJ, and Microsoft Visual Studio that can simplify the process and compile the software.[citation needed] As noted in a different section, software is usually created on top of existing software and the application programming interfaces (APIs) that the underlying software provides, like GTK+, JavaBeans, or Swing.[citation needed] Libraries (APIs) can be categorized by their purpose. For instance, the Spring Framework is used for implementing enterprise applications, the Windows Forms library is used for designing graphical user interface (GUI) applications like Microsoft Word, and Windows Communication Foundation is used for designing web services.[citation needed] When a program is designed, it relies upon the API. For instance, a Microsoft Windows desktop application might call API functions in the .NET Windows Forms library like Form1.Close() and Form1.Show()[25] to close or open the application. Without these APIs, the programmer would need to write this functionality entirely themselves. Companies like Oracle and Microsoft provide their own APIs, so many applications are written using their software libraries, which usually contain numerous APIs.[citation needed]
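The sketch below makes the same point with Python's standard tkinter library rather than the Windows Forms API mentioned above: the library's API supplies the window-management functionality, so the programmer does not have to implement it from scratch.

```python
import tkinter as tk

window = tk.Tk()                    # create and show a window via the library API
window.title("Hello")
window.after(2000, window.destroy)  # ask the library to close the window after 2 seconds
window.mainloop()                   # hand control to the library's event loop
```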
Data structures such as hash tables, arrays, and binary trees, and algorithms such as quicksort, can be useful for creating software.
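For illustration, a compact (not in-place) quicksort and a hash-table lookup in Python:

```python
# A compact quicksort: a classic divide-and-conquer sorting algorithm.
def quicksort(items):
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([5, 3, 8, 1, 9, 2]))   # [1, 2, 3, 5, 8, 9]

# A hash table (Python's built-in dict) gives fast lookup by key.
ages = {"Ada": 36, "Alan": 41}
print(ages["Ada"])                     # 36
```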
Computer software has special economic characteristics that make its design, creation, and distribution different from most other economic goods.[specify][26][27]
A person who creates software is called a programmer, software engineer, or software developer, terms that all have a similar meaning. More informal terms for programmer also exist, such as "coder" and "hacker" – although use of the latter word may cause confusion, because it is more often used to mean someone who illegally breaks into computer systems.