Linux and Unix are very popular with programmers, not just due to the overwhelming array of tools and environments available but also because the system is exceptionally well documented and transparent. On a Linux machine, you don’t have to be a programmer to take advantage of development tools, but when working with the system, you should know something about programming tools because they play a larger role in managing Unix systems than in other operating systems. At the very least, you should be able to identify development utilities and have some idea of how to run them.
Linux和Unix在程序员中非常受欢迎,不仅因为提供了丰富的工具和环境,还因为系统的文档和透明度异常出色。
在Linux机器上,即使不是程序员,也可以利用开发工具,但是在使用系统时,你应该了解一些关于编程工具的知识,因为它们在管理Unix系统中起着比其他操作系统更重要的作用。
至少,你应该能够识别开发工具,并且对如何运行它们有一些了解。
This chapter packs a lot of information into a small space, but you don’t need to master everything here. You can easily skim the material and come back later. The discussion of shared libraries is likely the most important thing that you need to know. But to understand where shared libraries come from, you first need some background on how to build programs.
本章节在一个小空间内提供了大量的信息,但你不需要完全掌握这里的所有内容。
你可以简单地浏览材料,然后稍后再回来。关于共享库的讨论可能是你需要了解的最重要的内容。
但是要理解共享库的来源,你首先需要了解如何构建程序的一些背景知识。
Knowing how to run the C programming language compiler can give you a great deal of insight into the origin of the programs that you see on your Linux system. The source code for most Linux utilities, and for many applications on Linux systems, is written in C or C++. We’ll primarily use examples in C for this chapter, but you’ll be able to carry the information over to C++.
了解如何运行C编程语言编译器可以让您对在Linux系统上看到的程序的起源有很大的了解。
大多数Linux实用程序和许多Linux系统上的应用程序的源代码都是用C或C++编写的。
本章我们将主要使用C的示例,但您可以将这些信息应用到C++上。
C programs follow a traditional development process: You write programs, you compile them, and they run. That is, when you write a C program and want to run it, you must compile the source code that you wrote into a binary low-level form that the computer understands. You can compare this to the scripting languages that we’ll discuss later, where you don’t need to compile anything.
C程序遵循传统的开发过程:您编写程序,编译它们,然后运行它们。
也就是说,当您编写一个C程序并希望运行它时,您必须将您编写的源代码编译成计算机理解的二进制低级形式。
您可以将此与我们稍后将讨论的脚本语言进行比较,在那里您不需要编译任何东西。
NOTE By default, most distributions do not include the tools necessary to compile C code because these tools occupy a fairly large amount of space. If you can’t find some of the tools described here, you can install the build-essential package for Debian/Ubuntu or the “Development Tools” group (yum groupinstall "Development Tools") for Fedora/CentOS. Failing that, try a package search for “C compiler.” 注意: 默认情况下,大多数发行版不包含编译C代码所需的工具,因为这些工具占用了相当大的空间。 如果您找不到这里描述的某些工具,可以为Debian/Ubuntu安装build-essential软件包,或者为Fedora/CentOS安装“Development Tools”软件包组(yum groupinstall "Development Tools")。 如果这样做不行,请尝试搜索"C编译器"。
The C compiler executable on most Unix systems is the GNU C compiler, gcc, though the newer clang compiler from the LLVM project is gaining popularity. C source code files end with .c. Take a look at the single, self-contained C source code file called hello.c, which you can find in The C Programming Language, 2nd edition, by Brian W. Kernighan and Dennis M. Ritchie (Prentice Hall, 1988):
大多数Unix系统上的C编译器可执行文件是GNU C编译器gcc,尽管来自LLVM项目的较新的clang编译器正在变得越来越受欢迎。
C源代码文件以.c结尾。
请查看名为hello.c的单个、自包含的C源代码文件,您可以在Brian W. Kernighan和Dennis M. Ritchie的
《C程序设计语言》 第2版(Prentice Hall,1988年)中找到。
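#include <stdio.h>

/* The book's version declares main() with no return type;
   current compilers expect an explicit int. */
int main(void) {
    printf("Hello, World.\n");
    return 0;
}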
Put this source code in a file called hello.c and then run this command:
将这段源代码放入一个名为hello.c的文件中,然后运行以下命令:
$ cc hello.c
The result is an executable named a.out, which you can run like any other executable on the system. However, you should probably give the executable another name (such as hello). To do this, use the compiler’s -o option:
结果是一个名为a.out的可执行文件,您可以像系统上的其他可执行文件一样运行它。
然而,您可能应该给可执行文件取另一个名字(比如hello)。
为此,请使用编译器的-o选项:
$ cc -o hello hello.c
For small programs, there isn’t much more to compiling than that. You might need to add an extra include directory or library (see 15.1.2 Header (Include) Files and Directories and 15.1.3 Linking with Libraries), but let’s look at slightly larger programs before getting into those topics.
对于小型程序来说,编译的内容也就这么多了。
您可能需要添加额外的包含目录或库(参见15.1.2 头文件和目录和15.1.3 链接库),但在进入这些主题之前,让我们先看一下稍微大一点的程序。
Most C programs are too large to reasonably fit inside a single source code file. Mammoth files become too disorganized for the programmer, and compilers sometimes even have trouble parsing large files. Therefore, developers group components of the source code together, giving each piece its own file.
大多数C程序太大了,无法合理地放在一个单独的源代码文件中。
庞大的文件会使程序员难以组织,而且编译器有时甚至会在解析大文件时出现问题。
因此,开发人员将源代码的组件分组在一起,每个组件都有自己的文件。
When compiling most .c files, you don’t create an executable right away. Instead, use the compiler’s -c option on each file to create object files. To see how this works, let’s say you have two files, main.c and aux.c. The following two compiler commands do most of the work of building the program:
在编译大多数.c文件时,你不会立即创建一个可执行文件。相反,可以在每个文件上使用编译器的-c选项来创建目标文件。
为了看清楚这是如何工作的,假设你有两个文件,main.c和aux.c。
下面的两个编译器命令完成了构建程序的大部分工作:
$ cc -c main.c
$ cc -c aux.c
The preceding two commands compile the two source files into the two object files main.o and aux.o.
上述两个命令将这两个源文件编译为两个目标文件main.o和aux.o。
An object file is a binary file that a processor can almost understand, except that there are still a few loose ends. First, the operating system doesn’t know how to run an object file, and second, you likely need to combine several object files and some system libraries to make a complete program.
目标文件是处理器几乎可以理解的二进制文件,只是还有一些松散的部分。
首先,操作系统不知道如何运行目标文件,其次,你可能需要将多个目标文件和一些系统库组合起来,才能生成一个完整的程序。
To build a fully functioning executable from one or more object files, you must run the linker, the ld command in Unix. Programmers rarely use ld on the command line, because the C compiler knows how to run the linker program. So to create an executable called myprog from the two object files above, run this command to link them:
要从一个或多个目标文件构建一个完全运行的可执行文件,必须运行链接器,即Unix中的ld命令。
程序员很少在命令行上使用ld,因为C编译器知道如何运行链接器程序。
因此,要从上述两个目标文件中链接它们创建一个名为myprog的可执行文件,请运行以下命令:
$ cc -o myprog main.o aux.o
Although you can compile multiple source files by hand, as the preceding example shows, it can be hard to keep track of them all during the compiling process when the number of source files multiplies. The make system described in 15.2 make is the traditional Unix standard for managing compiles. This system is especially important in managing the files described in the next two sections.
虽然你可以手动编译多个源文件,就像上面的例子所示,但是当源文件数量增多时,在编译过程中跟踪所有这些文件可能会很困难。
在第15.2节中描述的make系统是管理编译的传统Unix标准。这个系统在管理下面两节中描述的文件时尤为重要。
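To make the two-file example more concrete, here is a minimal sketch of what aux.c might contain. The function name aux_sum is a hypothetical choice for illustration and does not come from the original example. Note that the file has no main(), which is why it can be compiled into an object file but cannot run by itself.

```c
/* aux.c: helper code with no main() of its own (contents are hypothetical). */
#include <stddef.h>

/* Add up the first n elements of values. */
int aux_sum(const int *values, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}
```

Under this assumption, main.c would declare the function with a line such as int aux_sum(const int *values, size_t n); and call it from main(). Running cc -c aux.c produces aux.o, and the final cc -o myprog main.o aux.o step is where the linker matches the call in main.o with the definition in aux.o.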
C header files are additional source code files that usually contain type and library function declarations. For example, stdio.h is a header file (see the simple program in 15.1 The C Compiler).
C头文件是通常包含类型和库函数声明的附加源代码文件。例如,stdio.h是一个头文件(见第15.1节C编译器中的简单程序)。
Unfortunately, a great number of compiler problems crop up with header files. Most glitches occur when the compiler can’t find header files and libraries. There are even some cases where a programmer forgets to include a required header file, causing some of the source code to not compile.
不幸的是,使用头文件时经常出现许多编译器问题。大多数故障发生在编译器找不到头文件和库时。
甚至有些情况下,程序员忘记包含所需的头文件,导致部分源代码无法编译。
Tracking down the correct include files isn’t always easy. Sometimes there are several include files with the same names in different directories, and it’s not clear which is the correct one. When the compiler can’t find an include file, the error message looks like this:
找到正确的头文件并不总是容易的。
有时候在不同目录中有几个同名的头文件,不清楚哪一个是正确的。
当编译器找不到一个头文件时,错误信息会像这样:
badinclude.c:1:22: fatal error: notfound.h: No such file or directory
This message reports that the compiler can’t find the notfound.h header file that the badinclude.c file references. This specific error is a direct result of this directive on line 1 of badinclude.c:
这个错误报告了编译器找不到badinclude.c文件引用的notfound.h头文件。
这个具体的错误是由badinclude.c文件的第1行上的这个指令引起的:
#include <notfound.h>
The default include directory in Unix is /usr/include; the compiler always looks there unless you explicitly tell it not to. However, you can make the compiler look in other include directories (most paths that contain header files have include somewhere in the name).
在Unix中,默认的包含目录是/usr/include;除非你明确告诉编译器不要去找,否则编译器总是在那里查找。
然而,你可以让编译器在其他包含目录中查找(大多数包含头文件的路径中都包含include这个关键字)。
NOTE You’ll learn more about how to find missing include files in Chapter 16. 注意 在第16章中,你将学习更多关于如何找到缺失的头文件。
For example, let’s say that you find notfound.h in /usr/junk/include. You can make the compiler see this directory with the -I option:
例如,假设你在/usr/junk/include中找到了notfound.h。
你可以使用-I选项让编译器看到这个目录:
$ cc -c -I/usr/junk/include badinclude.c
Now the compiler should no longer stumble on the line of code in badinclude.c that references the header file. You should also beware of includes that use double quotes (" ") instead of angle brackets (< >), like this:
现在,编译器不应该再在badinclude.c中引用头文件的那行代码上出现问题了。
你还应该注意,有些#include语句使用双引号(" ")而不是尖括号(< >),像这样:
#include "myheader.h"
Double quotes mean that the header file is not in a system include directory but that the compiler should otherwise search its include path. It often means that the include file is in the same directory as the source file. If you encounter a problem with double quotes, you’re probably trying to compile incomplete source code.
双引号意味着头文件不在系统的包含目录中,但编译器应该在其包含路径中搜索。
这通常意味着头文件与源文件位于同一个目录中。
如果你在双引号中遇到问题,你可能在尝试编译不完整的源代码。
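As a rough illustration of what a header such as myheader.h might contain, the sketch below shows a type and a function declaration wrapped in an include guard. All names here (MYHEADER_H, struct point, point_add) are hypothetical. The definition that would normally live in a separate .c file follows in the same block only so that the example compiles by itself.

```c
/* What myheader.h might contain (hypothetical names throughout). */
#ifndef MYHEADER_H
#define MYHEADER_H

/* A type declaration shared by every file that includes the header. */
struct point {
    int x;
    int y;
};

/* A function declaration: it tells the compiler the function's shape,
   but not where its code lives. */
struct point point_add(struct point a, struct point b);

#endif /* MYHEADER_H */

/* In a real project this definition would sit in its own .c file,
   which would start with #include "myheader.h". */
struct point point_add(struct point a, struct point b) {
    struct point sum = { a.x + b.x, a.y + b.y };
    return sum;
}
```

The include guard (#ifndef, #define, #endif) keeps the declarations from being processed twice if the header ends up included more than once in the same source file.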
It turns out that the C compiler does not actually do the work of looking for all of these include files. That task falls to the C preprocessor, a program that the compiler runs on your source code before parsing the actual program. The preprocessor rewrites source code into a form that the compiler understands; it’s a tool for making source code easier to read (and for providing shortcuts).
事实上,C编译器并不实际负责查找所有这些头文件。这项任务由C预处理器完成,它是编译器在解析实际程序之前在源代码上运行的程序。
预处理器将源代码重写成编译器能理解的形式;它是一种使源代码更易读(并提供快捷方式)的工具。
Preprocessor commands in the source code are called directives, and they start with the # character. There are three basic types of directives:
源代码中的预处理器命令称为指令,它们以#字符开头。有三种基本类型的指令:
o Include files. An #include directive instructs the preprocessor to include an entire file. Note that the compiler’s -I flag is actually an option that causes the preprocessor to search a specified directory for include files, as you saw in the previous section.
o Macro definitions. A line such as #define BLAH something tells the preprocessor to substitute something for all occurrences of BLAH in the source code. Convention dictates that macros appear in all uppercase, but it should come as no shock that programmers sometimes use macros whose names look like functions and variables. (Every now and then, this causes a world of headaches. Many programmers make a sport out of abusing the preprocessor.)
o Conditionals. You can mark out certain pieces of code with #ifdef, #if, and #endif. The #ifdef MACRO directive checks to see whether the preprocessor macro MACRO is defined, and #if condition tests to see whether condition is nonzero. For both directives, if the condition following the “if statement” is false, the preprocessor does not pass any of the program text between the #if and the next #endif to the compiler. If you plan to look at any C code, you’d better get used to this.
NOTE Instead of defining macros within your source code, you can also define macros by passing parameters to the compiler: -DBLAH=something works like the directive above. 注意 你可以通过向编译器传递参数来定义宏,而不是在源代码中定义宏:-DBLAH=something的效果类似于上面的指令。
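A common pattern that combines a macro definition, a conditional, and the -D option is a default value that can be overridden at compile time. The following is a minimal sketch; the macro name GREETING and the function greeting() are made up for this example, and #ifndef is simply the negated form of #ifdef.

```c
/* If the compiler was not given -DGREETING=..., fall back to a default. */
#ifndef GREETING
#define GREETING "Hello"
#endif

/* The preprocessor replaces GREETING with the string before the
   compiler proper ever sees this function. */
const char *greeting(void) {
    return GREETING;
}
```

Assuming the code is saved as greeting.c (a hypothetical name), compiling it with cc -c -DGREETING='"Howdy"' greeting.c would make greeting() return the other string without any change to the source. The single quotes keep the shell from stripping the double quotes.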
An example of a conditional directive follows. When the preprocessor sees the following code, it checks to see whether the macro DEBUG is defined and, if so, passes the line containing fprintf() on to the compiler. Otherwise, the preprocessor skips this line and continues to process the file after the #endif:
以下是一个条件指令的示例。当预处理器看到下面的代码时,它会检查宏DEBUG是否已定义,如果是,则将包含fprintf()的那行传递给编译器。
否则,预处理器跳过这行代码,继续处理#endif之后的文件:
#ifdef DEBUG
fprintf(stderr, "This is a debugging message.\n");
#endif
NOTE The C preprocessor doesn’t know anything about C syntax, variables, functions, and other elements. It understands only its own macros and directives. 注意 C预处理器对C语法、变量、函数和其他元素一无所知。它只理解自己的宏和指令。
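Because the preprocessor only substitutes text, a macro that looks like a function can behave in surprising ways. The sketch below uses two hypothetical macros, BAD_SQUARE and SQUARE, to show why macro bodies and arguments are usually wrapped in parentheses.

```c
/* Expands textually: BAD_SQUARE(1 + 2) becomes 1 + 2 * 1 + 2. */
#define BAD_SQUARE(x) x * x

/* Parentheses keep the argument and the result grouped:
   SQUARE(1 + 2) becomes ((1 + 2) * (1 + 2)). */
#define SQUARE(x) ((x) * (x))

/* Small wrappers so the two expansions can be compared at run time. */
int bad_square_of_sum(int a, int b) {
    return BAD_SQUARE(a + b);   /* a + b * a + b */
}

int square_of_sum(int a, int b) {
    return SQUARE(a + b);       /* (a + b) * (a + b) */
}
```

With a = 1 and b = 2, the first function returns 5 and the second returns 9, because multiplication binds more tightly than addition in the unparenthesized expansion.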
On Unix, the C preprocessor’s name is cpp, but you can also run it with gcc -E. However, you’ll rarely need to run the preprocessor by itself.
在Unix上,C预处理器的名称是cpp,但你也可以使用gcc -E来运行它。然而,你很少需要单独运行预处理器。
The C compiler doesn’t know enough about your system to create a useful program all by itself. You need libraries to build complete programs. A C library is a collection of common precompiled functions that you can build into your program. For example, many executables use the math library because it provides trigonometric functions and the like.
C编译器本身对于您的系统并不了解,无法单独创建一个有用的程序。
您需要使用库来构建完整的程序。C库是一组常见的预编译函数,您可以将其构建到程序中。
例如,许多可执行文件使用数学库,因为它提供三角函数等功能。
Libraries come into play primarily at link time, when the linker program creates an executable from object files. For example, if you have a program that uses the gobject library but you forget to tell the compiler to link against that library, you’ll see linker errors like this:
库主要在链接时起作用,链接程序将目标文件创建为可执行文件。
例如,如果您有一个使用gobject库的程序,但忘记告诉编译器链接该库,您将看到如下的链接错误:
badobject.o(.text+0x28): undefined reference to 'g_object_new'
The most important parts of this error message are the object file name (badobject.o) and the name of the missing function (g_object_new). When the linker program examined the badobject.o object file, it couldn’t find the g_object_new function, and as a consequence, it couldn’t create the executable. In this particular case, you might suspect that you forgot the gobject library because the missing function is g_object_new().
这条错误消息中最重要的部分是目标文件名(badobject.o)和缺失函数的名称(g_object_new)。
当链接程序检查badobject.o目标文件时,它找不到g_object_new函数,因此无法创建可执行文件。
在这种特殊情况下,您可能怀疑忘记了gobject库,因为缺少的函数是g_object_new()。
NOTE Undefined references do not always mean that you’re missing a library. One of the program’s object files could be missing in the link command. It’s usually easy to differentiate between library functions and functions in your object files, though. 注意:未定义的引用并不总是意味着缺少库。 链接命令中可能缺少程序的某个目标文件。 不过,通常很容易区分库函数和目标文件中的函数。
To fix this problem, you must first find the gobject library and then use the compiler’s -l option to link against the library. As with include files, libraries are scattered throughout the system (/usr/lib is the system default location), though most libraries reside in a subdirectory named lib. For the preceding example, the basic gobject library file is libgobject.a, so the library name is gobject. Putting it all together, you would link the program like this:
要解决这个问题,首先必须找到gobject库,然后使用编译器的-l选项链接该库。
与包含文件一样,库分散在整个系统中(/usr/lib是系统默认位置),尽管大多数库位于名为lib的子目录中。
对于前面的示例,基本的gobject库文件是libgobject.a,因此库名为gobject。
将所有内容组合起来,您可以像这样链接程序:
$ cc -o badobject badobject.o -lgobject
You must tell the linker about nonstandard library locations; the parameter for this is -L. Let’s say that the badobject program requires libcrud.a in /usr/junk/lib. To compile and create the executable, use a command like this:
您必须告诉链接器非标准库的位置;用于此的参数是-L。
假设badobject程序需要/usr/junk/lib中的libcrud.a。要编译并创建可执行文件,请使用如下命令:
$ cc -o badobject badobject.o -lgobject -L/usr/junk/lib -lcrud
NOTE If you want to search a library for a particular function, use the nm command. Be prepared for a lot of output. For example, try this: nm libgobject.a. (You might need to use the locate command to find libgobject.a; many distributions now put libraries in architecture-specific subdirectories in /usr/lib.) 注意:如果要在库中搜索特定函数,请使用nm命令。准备好大量输出。 例如,尝试执行此命令:nm libgobject.a(您可能需要使用locate命令来找到libgobject.a;许多发行版现在将库放在/usr/lib的体系结构特定子目录中)。
A library file ending with .a (such as libgobject.a) is called a static library. When you link a program against a static library, the linker copies machine code from the library file into your executable. Therefore, the final executable does not need the original library file to run, and furthermore, the executable’s behavior never changes.
以 .a 结尾的库文件(例如 libgobject.a)被称为静态库。
当你将程序与静态库进行链接时,链接器会将库文件中的机器码复制到可执行文件中。
因此,最终的可执行文件不需要原始库文件来运行,而且可执行文件的行为也永远不会改变。
However, library sizes are always increasing, as is the number of libraries in use, and this makes static libraries wasteful in terms of disk space and memory. In addition, if a static library is later found to be inadequate or insecure, there’s no way to change any executable linked against it, short of recompiling the executable.
然而,库的大小一直在增加,使用的库的数量也在增加,这使得静态库在磁盘空间和内存方面是一种浪费。
此外,如果后来发现静态库不足或存在安全问题,除非重新编译可执行文件,否则无法更改任何与之链接的可执行文件。
Shared libraries counter these problems. When you run a program linked against one, the system loads the library’s code into the process memory space only when necessary. Many processes can share the same shared library code in memory. And if you need to slightly modify the library code, you can generally do so without recompiling any programs.
共享库解决了这些问题。
当你运行与共享库链接的程序时,系统只在必要时将库的代码加载到进程内存空间中。
许多进程可以在内存中共享相同的共享库代码。
如果需要稍微修改库代码,通常可以在不重新编译任何程序的情况下完成。
Shared libraries have their own costs: difficult management and a somewhat complicated linking procedure. However, you can bring shared libraries under control if you know four things:
共享库也有自己的成本:管理困难和相对复杂的链接过程。
但是,如果你了解以下四点,就可以控制共享库:
o How to list the shared libraries that an executable needs
o How an executable looks for shared libraries
o How to link a program against a shared library
o The common shared library pitfalls
The following sections tell you how to use and maintain your system’s shared libraries. If you’re interested in how shared libraries work or if you want to know about linkers in general, you can check out Linkers and Loaders by John R. Levine (Morgan Kaufmann, 1999), “The Inside Story on Shared Libraries and Dynamic Loading” by David M. Beazley, Brian D. Ward, and Ian R. Cooke (Computing in Science & Engineering, September/October 2001), or online resources such as the Program Library HOWTO (http://dwheeler.com/program-library/). The ld.so(8) manual page is also worth a read.
接下来的章节将告诉你如何使用和维护系统的共享库。
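如果你对共享库的工作原理感兴趣,或者想了解链接器的一般情况,可以查阅John R. Levine的《Linkers and Loaders》(Morgan Kaufmann,1999)、David M. Beazley、Brian D. Ward和Ian R. Cooke的“The Inside Story on Shared Libraries and Dynamic Loading”(Computing in Science & Engineering,2001年9/10月),或者Program Library HOWTO(http://dwheeler.com/program-library/)等在线资源。
另外,ld.so(8) 手册页也值得一读。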
Shared library files usually reside in the same places as static libraries. The two standard library directories on a Linux system are /lib and /usr/lib. The /lib directory should not contain static libraries.
共享库文件通常存放在与静态库相同的位置。
Linux系统上的两个标准库目录是/lib和/usr/lib。
/lib目录不应包含静态库。
A shared library has a suffix that contains .so (shared object), as in libc-2.15.so and libc.so.6. To see what shared libraries a program uses, run ldd prog, where prog is the executable name. Here’s an example for the shell:
共享库的后缀包含.so(共享对象),如libc-2.15.so和libc.so.6。
要看程序使用的共享库,运行ldd prog,其中prog是可执行文件名。
以下是一个关于shell的示例:
$ ldd /bin/bash
linux-gate.so.1 => (0xb7799000)
libtinfo.so.5 => /lib/i386-linux-gnu/libtinfo.so.5 (0xb7765000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xb7760000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75b5000)
/lib/ld-linux.so.2 (0xb779a000)
In the interest of optimal performance and flexibility, executables alone don’t usually know the locations of their shared libraries; they know only the names of the libraries, and perhaps a little hint about where to find them. A small program named ld.so (the runtime dynamic linker/loader) finds and loads shared libraries for a program at runtime. The preceding ldd output shows the library names on the left—that’s what the executable knows. The right side shows where ld.so finds the library.
为了实现最佳性能和灵活性,可执行文件通常不知道其共享库的位置;它们只知道库的名称,也许会有一点提示去找到它们。
一个名为ld.so(运行时动态链接器/加载器)的小型程序在运行时为程序找到并加载共享库。
前述ldd输出显示了左侧的库名称—这是可执行文件所知道的。右侧显示了ld.so找到库的位置。
The final line of output here shows the actual location of ld.so: ld-linux.so.2.
这里输出的最后一行显示了ld.so的实际位置:ld-linux.so.2。
One of the common trouble points for shared libraries is that the dynamic linker cannot find a library. The first place the dynamic linker should normally look for shared libraries is an executable’s preconfigured runtime library search path (rpath), if it exists. You’ll see how to create this path shortly.
共享库的常见问题之一是动态链接器找不到库。
动态链接器通常应该首先查找共享库的位置是可执行文件预配置的运行时库搜索路径(rpath),如果存在的话。您将在稍后看到如何创建此路径。
Next, the dynamic linker looks in a system cache, /etc/ld.so.cache, to see if the library is in a standard location. This is a fast cache of the names of library files found in directories listed in the cache configuration file /etc/ld.so.conf.
接下来,动态链接器会在系统缓存/etc/ld.so.cache中查找,以查看库是否位于标准位置。
这是一个快速缓存,其中包含在缓存配置文件/etc/ld.so.conf中列出的目录中找到的库文件的名称。
NOTE As is typical of many of the Linux configuration files that you’ve seen, ld.so.conf may include a number of files in a directory such as /etc/ld.so.conf.d. 注意 与您看到的许多Linux配置文件一样,ld.so.conf可能在目录(如/etc/ld.so.conf.d)中包含多个文件。
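Each line in ld.so.conf is a directory that you want to include in the cache. The list of directories is usually short, containing something like this:
ld.so.conf中的每一行都是您希望包含在缓存中的目录。目录列表通常很短,类似于以下内容:
/lib/i686-linux-gnu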
/usr/lib/i686-linux-gnu
The standard library directories /lib and /usr/lib are implicit, which means that you don’t need to include them in /etc/ld.so.conf.
标准库目录/lib和/usr/lib是隐式的,这意味着您不需要将它们包含在/etc/ld.so.conf中。
If you alter ld.so.conf or make a change to one of the shared library directories, you must rebuild the /etc/ld.so.cache file by hand with the following command:
如果您修改了ld.so.conf或对其中一个共享库目录进行更改,必须手动使用以下命令重新构建/etc/ld.so.cache文件:
# ldconfig -v
The -v option provides detailed information on libraries that ldconfig adds to the cache and any changes that it detects.
-v 选项提供了ldconfig添加到缓存中的库的详细信息以及它检测到的任何更改。
There is one more place that ld.so looks for shared libraries: the environment variable LD_LIBRARY_PATH. We’ll talk about this soon.
ld.so还会在另一个位置查找共享库:环境变量LD_LIBRARY_PATH。我们很快会讨论这个。
Don’t get into the habit of adding stuff to /etc/ld.so.conf. You should know what shared libraries are in the system cache, and if you put every bizarre little shared library directory into the cache, you risk conflicts and an extremely disorganized system. When you compile software that needs an obscure library path, give your executable a built-in runtime library search path. Let’s see how to do that.
不要养成向/etc/ld.so.conf添加内容的习惯。
您应该知道系统缓存中有哪些共享库,如果将每个奇怪的共享库目录都放入缓存中,会导致冲突和一个非常混乱的系统。
当您编译需要一个奇怪的库路径的软件时,请给您的可执行文件设置一个内置的运行时库搜索路径。让我们看看如何做到这一点。
Let’s say you have a shared library named libweird.so.1 in /opt/obscure/lib that you need to link myprog against. Link the program as follows:
假设你有一个名为libweird.so.1的共享库,位于/opt/obscure/lib目录下,你需要将myprog与其进行链接。
按照以下方式链接该程序:
$ cc -o myprog myprog.o -Wl,-rpath=/opt/obscure/lib -L/opt/obscure/lib -lweird
The -Wl,-rpath option tells the linker to include a following directory into the executable’s runtime library search path. However, even if you use -Wl,-rpath, you still need the -L flag.
-Wl,-rpath选项告诉链接器将一个后续目录包含到可执行文件的运行时库搜索路径中。但是,即使你使用了-Wl,-rpath,你仍然需要-L标志。
If you have a pre-existing binary, you can also use the patchelf program to insert a different runtime library search path, but it’s generally better to do this at compile time.
如果你有一个现有的二进制文件,你也可以使用patchelf程序来插入一个不同的运行时库搜索路径,但通常最好在编译时进行此操作。
Shared libraries provide remarkable flexibility, not to mention some really incredible hacks, but it’s also possible to abuse them to the point where your system becomes an utter and complete mess. Three particularly bad things can happen:
共享库提供了非凡的灵活性,更不用说一些令人难以置信的技巧了,但是滥用它们可能会导致系统变得一团糟。
有三个特别糟糕的问题可能会发生:
o Missing libraries
o Terrible performance
o Mismatched libraries
o 缺少库
o 性能糟糕
o 库不匹配
The number one cause of all shared library problems is the environment variable named LD_LIBRARY_PATH. Setting this variable to a colon-delimited set of directory names makes ld.so search the given directories before anything else when looking for a shared library. This is a cheap way to make programs work when you move a library around, if you don’t have the program’s source code and can’t use patchelf, or if you’re just too lazy to recompile the executables. Unfortunately, you get what you pay for.
所有共享库问题的头号原因是一个名为LD_LIBRARY_PATH的环境变量。
将此变量设置为以冒号分隔的目录名称集合,使得ld.so在查找共享库时首先搜索给定的目录。
当你移动库文件时,如果没有程序的源代码,无法使用patchelf,或者你懒得重新编译可执行文件,这是一种廉价的方法使程序工作。
不幸的是,一分钱一分货。
Never set LD_LIBRARY_PATH in shell startup files or when compiling software. When the dynamic runtime linker encounters this variable, it must often search through the entire contents of each specified directory more times than you’d care to know. This causes a big performance hit, but more importantly, you can get conflicts and mismatched libraries because the runtime linker looks in these directories for every program.
永远不要在shell启动文件或编译软件时设置LD_LIBRARY_PATH。
当动态运行时链接器遇到这个变量时,它通常需要多次搜索每个指定目录的全部内容,这会导致性能大幅下降,更重要的是,由于运行时链接器会在这些目录中搜索每个程序,可能会出现冲突和不匹配的库。
If you must use LD_LIBRARY_PATH to run some crummy program for which you don’t have the source (or an application that you’d rather not compile, like Mozilla or some other beast), use a wrapper script. Let’s say your executable is /opt/crummy/bin/crummy.bin and needs some shared libraries in /opt/crummy/lib. Write a wrapper script called crummy that looks like this:
如果你必须使用LD_LIBRARY_PATH来运行一些糟糕的程序,而你没有源代码(或者你宁愿不编译某些应用程序,比如Mozilla或其他一些庞然大物),可以使用一个包装脚本。
假设你的可执行文件是/opt/crummy/bin/crummy.bin,需要一些共享库位于/opt/crummy/lib。
编写一个名为crummy的包装脚本,内容如下:
#!/bin/sh
LD_LIBRARY_PATH=/opt/crummy/lib
export LD_LIBRARY_PATH
exec /opt/crummy/bin/crummy.bin "$@"
Avoiding LD_LIBRARY_PATH prevents most shared library problems. But one other significant problem that occasionally comes up with developers is that a library’s application programming interface (API) may change slightly from one minor version to another, breaking installed software. The best solutions here are preventive: Either use a consistent methodology to install shared libraries with -Wl,-rpath to create a runtime link path or simply use the static versions of obscure libraries.
避免使用LD_LIBRARY_PATH可以避免大多数共享库问题。 但是开发人员偶尔会遇到另一个重要的问题,即库的应用程序编程接口(API)可能会在一个次要版本与另一个次要版本之间稍有变化,从而破坏已安装的软件。
在这里,最好的解决方案是预防性的:要么使用一致的方法来使用-Wl,-rpath安装共享库以创建运行时链接路径,要么简单地使用这些库的静态版本。
A program with more than one source code file or that requires strange compiler options is too cumbersome to compile by hand. This problem has been around for years, and the traditional Unix compile management utility that eases these pains is called make. You should know a little about make if you’re running a Unix system, because system utilities sometimes rely on make to operate. However, this chapter is only the tip of the iceberg. There are entire books on make, such as Managing Projects with GNU Make by Robert Mecklenburg (O’Reilly, 2004). In addition, most Linux packages are built using an additional layer around make or a similar tool. There are many build systems out there; we’ll look at one named autotools in Chapter 16. make is a big system, but it’s not very difficult to get an idea of how it works. When you see a file named Makefile or makefile, you know that you’re dealing with make. (Try running make to see if you can build anything.)
一个有多个源代码文件或需要奇怪编译选项的程序太繁琐了,无法手动编译。
这个问题已经存在多年了,传统的Unix编译管理实用程序可以缓解这些痛苦,它被称为make。
如果你在运行Unix系统,你应该对make有一些了解,因为系统工具有时依赖于make来运行。
然而,本章只是冰山一角。有很多关于make的书籍,比如Robert Mecklenburg的《使用GNU Make管理项目》(O'Reilly,2004)。
此外,大多数Linux软件包都是使用make或类似工具的附加层构建的。
市面上有很多构建系统,我们将在第16章介绍一个名为autotools的构建系统。
make是一个庞大的系统,但是理解它的工作原理并不难。
当你看到一个名为Makefile或makefile的文件时,你就知道你正在处理make。
(尝试运行make看看是否可以构建任何东西。)
The basic idea behind make is the target, a goal that you want to achieve. A target can be a file (a .o file, an executable, and so on) or a label. In addition, some targets depend on other targets; for instance, you need a complete set of .o files before you can link your executable. These requirements are called dependencies.
make的基本思想是目标,即你想要实现的目标。一个目标可以是一个文件(.o文件、可执行文件等)或一个标签。
此外,一些目标依赖于其他目标;例如,在链接可执行文件之前,你需要一个完整的.o文件集。这些要求被称为依赖关系。
To build a target, make follows a rule, such as a rule for how to go from a .c source file to a .o object file. make already knows several rules, but you can customize these existing rules and create your own.
为了构建一个目标,make遵循一个规则,比如从一个.c源文件到一个.o目标文件的规则。
make已经知道了几个规则,但你可以自定义这些现有规则并创建自己的规则。
The following very simple Makefile builds a program called myprog from aux.c and main.c:
下面是一个非常简单的 Makefile,它通过 aux.c 和 main.c 生成一个名为 myprog 的程序:
# object files
OBJS=aux.o main.o

all: myprog

myprog: $(OBJS)
	$(CC) -o myprog $(OBJS)
The # in the first line of this Makefile denotes a comment.
这个Makefile的第一行中的#表示注释。
The next line is just a macro definition; it sets the OBJS variable to two object filenames. This will be important later. For now, take note of how you define the macro and also how you reference it later ($(OBJS)).
下一行只是一个宏定义;它将OBJS变量设置为两个对象文件的文件名。这在后面会很重要。现在,请注意如何定义宏以及如何在后面引用它($(OBJS))。
The next item in the Makefile contains its first target, all. The first target is always the default, the target that make wants to build when you run make by itself on the command line.
Makefile 中的下一项包含第一个目标,即 all。第一个目标总是默认的,也就是当你在命令行上运行 make 时,make 希望构建的目标。
The dependencies of a target come after the colon. For all, this Makefile says that you need to satisfy something called myprog. This is the first dependency in the file; all depends on myprog. Note that myprog can be an actual file or the target of another rule. In this case, it’s both: myprog is a file that make creates, and it is also the target of the next rule, whose dependencies are given by $(OBJS).
目标的依赖项写在冒号后面。
对于all来说,这个Makefile表示你需要满足一个叫做myprog的东西。
这是文件中的第一个依赖项;all依赖于myprog。
请注意,myprog可以是实际的文件,也可以是另一个规则的目标。
在这种情况下,两者都是:myprog是make要创建的文件,同时也是下一条规则的目标,该规则的依赖项由$(OBJS)给出。
To build myprog, this Makefile uses the macro $(OBJS) in the dependencies. The macro expands to aux.o and main.o, so myprog depends on these two files (they must be actual files, because there aren’t any targets with those names anywhere in the Makefile).
为了构建myprog,这个Makefile在依赖项中使用了宏$(OBJS)。宏展开为aux.o和main.o,所以myprog依赖于这两个文件(它们必须是实际的文件,因为在整个Makefile中没有这些名称的目标)。
This Makefile assumes that you have two C source files named aux.c and main.c in the same directory. Running make on the Makefile yields the following output, showing the commands that make is running:
这个Makefile假设你在同一个目录中有两个名为aux.c和main.c的C源文件。
在Makefile上运行make会产生以下输出,显示make正在运行的命令:
$ make
cc -c -o aux.o aux.c
cc -c -o main.o main.c
cc -o myprog aux.o main.o
Figure 15-1 shows a diagram of the dependencies.
图 15-1 是依赖关系的示意图。
Figure 15-1. Makefile dependencies
So how does make know how to go from aux.c to aux.o? After all, aux.c is not in the Makefile. The answer is that make follows its built-in rules. It knows to look for a .c file when you want a .o file, and furthermore, it knows how to run cc -c on that .c file to get to its goal of creating a .o file.
那么,make 是如何从 aux.c 到 aux.o 的呢?毕竟,aux.c 并不在 Makefile 中。
答案是 make 遵循其内置规则。它知道当你需要一个 .o 文件时,要查找一个 .c 文件,并且,它知道如何运行 cc -c 来编译该 .c 文件,以达到创建 .o 文件的目标。
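The final step in getting to myprog is a little tricky, but the idea is clear enough. After you have the two object files in $(OBJS), you can run the C compiler according to the following line (where $(CC) expands to the compiler name):
得到myprog的最后一步有点棘手,但思路足够清晰。在获得$(OBJS)中的两个目标文件之后,你可以按照下面这一行运行C编译器(其中$(CC)展开为编译器名称):
	$(CC) -o myprog $(OBJS)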
The whitespace before $(CC) is a tab. You must insert a tab before any real command, on its own line. Watch out for this:
$(CC) 前面的空格是一个制表符。在任何真正的命令之前,都必须在单独的行上插入一个制表符。
注意这个问题:
Makefile:7: *** missing separator. Stop.
An error like this means that the Makefile is broken. The tab is the separator, and if there is no separator or there’s some other interference, you’ll see this error.
这样的错误意味着 Makefile 有问题。制表符是分隔符,如果没有分隔符或有其他干扰,你会看到这个错误。
One last make fundamental is that targets should be up-to-date with their dependencies. If you type make twice in a row for the preceding example, the first command builds myprog, but the second yields this output:
最后一个 make 的基本原则是,目标应该与其依赖项保持更新。
如果你连续两次输入 make 命令来运行上述示例,第一次命令会构建 myprog,但第二次会输出以下内容:
make: Nothing to be done for 'all'.
This second time through, make looked at its rules and noticed that myprog already exists, so it didn’t build myprog again because none of the dependencies had changed since it was last built. To experiment with this, do the following:
第二次运行时,make 查看其规则并注意到 myprog 已经存在,所以它不会再次构建 myprog,因为自上次构建以来,没有任何依赖项发生变化。
为了进行实验,请执行以下操作:
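Run touch aux.c, and then run make again. This time, make determines that aux.c is newer than the aux.o already in the directory, so it compiles aux.o again. Because myprog depends on aux.o, and aux.o is now newer than the existing myprog, make must also create myprog again. This type of chain reaction is very typical.
运行 touch aux.c,然后再次运行 make。这一次,make 发现 aux.c 比目录中已有的 aux.o 更新,于是重新编译 aux.o。由于 myprog 依赖于 aux.o,而 aux.o 现在比已有的 myprog 更新,make 还必须重新创建 myprog。这种连锁反应很常见。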
You can get a great deal of mileage out of make if you know how its command-line arguments and options work.
如果你了解make的命令行参数和选项的工作原理,你可以从中获得很多好处。
One of the most useful options is to specify a single target on the command line. For the preceding Makefile, you can run make aux.o if you want only the aux.o file.
其中最有用的选项之一是在命令行上指定一个单独的目标。对于前面的Makefile,如果你只想要aux.o文件,可以运行make aux.o。
You can also define a macro on the command line. For example, to use the clang compiler, try
你还可以在命令行上定义一个宏。
例如,要使用clang编译器,可以尝试执行以下命令:
make CC=clang
Here, make uses your definition of CC instead of its default compiler, cc. Command-line macros come in handy when testing preprocessor definitions and libraries, especially with the CFLAGS and LDFLAGS macros that we’ll discuss shortly.
在这里,make使用你定义的CC而不是默认的编译器cc。
命令行宏在测试预处理器定义和库时非常方便,特别是在讨论稍后的CFLAGS和LDFLAGS宏时。
In fact, you don’t even need a Makefile to run make. If built-in make rules match a target, you can just ask make to try to create the target. For example, if you have the source to a very simple program called blah.c, try make blah. The make run proceeds like this:
事实上,你甚至不需要一个Makefile来运行make。
如果内置的make规则匹配一个目标,你只需让make尝试创建该目标即可。
例如,如果你有一个名为blah.c的非常简单的程序源代码,可以尝试运行make blah。make的运行过程如下:
$ make blah
cc blah.c -o blah
This use of make works only for the most elementary C programs; if your program needs a library or special include directory, you should probably write a Makefile. Running make without a Makefile is actually most useful when you’re dealing with something like Fortran, Lex, or Yacc and don’t know how the compiler or utility works. Why not let make try to figure it out for you? Even if make fails to create the target, it will probably still give you a pretty good hint as to how to use the tool.
这种使用make的方式只适用于最基本的C程序;如果你的程序需要一个库或特殊的包含目录,你应该编写一个Makefile。
在没有Makefile的情况下运行make实际上在处理Fortran、Lex或Yacc等情况时最有用,因为你可能不知道编译器或实用程序的工作原理。
为什么不让make试着为你找出来呢?即使make无法创建目标,它可能仍然会给你一个相当好的提示,告诉你如何使用该工具。
Two make options stand out from the rest:
有两个make选项与其他选项不同:
o -n Prints the commands necessary for a build but prevents make from actually running any commands
o -f file Tells make to read from file instead of Makefile or makefile
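The two options combine naturally with a command-line macro. The sketch below is only an illustration: build.mk is a hypothetical file name, the source files are empty placeholders, and clang does not need to be installed because -n keeps make from running anything.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
tab=$(printf '\t')            # recipe lines must begin with a real tab

touch aux.c main.c            # empty placeholders; nothing is compiled below

# build.mk is a hypothetical name, chosen to show that -f accepts any file.
cat > build.mk <<EOF
OBJS=aux.o main.o

all: myprog

myprog: \$(OBJS)
${tab}\$(CC) -o myprog \$(OBJS)
EOF

# -n prints the commands; CC=clang replaces the default compiler name.
dry_run=$(make -n -f build.mk CC=clang)
printf '%s\n' "$dry_run"

ls                            # still only aux.c, build.mk, and main.c
```

Because of -n, no object files or executable appear in the directory; the output is just the list of commands that a real build would run.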
make has many special macros and variables. It’s difficult to tell the difference between a macro and a variable, so we’ll use the term macro to mean something that usually doesn’t change after make starts building targets.
make有许多特殊的宏和变量。很难区分宏和变量的区别,所以我们将使用术语“宏”来表示在make开始构建目标后通常不会改变的东西。
As you saw earlier, you can set macros at the start of your Makefile. These are the most common macros:
正如你之前看到的,你可以在Makefile的开头设置宏。以下是最常见的宏:
o CFLAGS C compiler options. When creating object code from a .c file, make passes this as an argument to the compiler.
o LDFLAGS Like CFLAGS, but for the linker when creating an executable from object code.
o LDLIBS If you use LDFLAGS but do not want to combine the library name options with the search path, put the library name options in this macro.
o CC The C compiler. The default is cc.
o CPPFLAGS C preprocessor options. When make runs the C preprocessor in some way, it passes this macro’s expansion on as an argument.
o CXXFLAGS GNU make uses this for C++ compiler flags.
A make variable changes as you build targets. Because you never set make variables by hand, the following list includes the $.
一个make变量在构建目标时会发生变化。因为你从不手动设置make变量,所以下面的列表包括$符号。
o $@ When inside a rule, this expands to the current target.
o $* Expands to the basename of the current target. For example, if you’re building blah.o, this expands to blah.
The most comprehensive list of make variables on Linux is the make info manual.
Linux上关于make变量的最全面列表可以在make info手册中找到。
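To see these variables expand, a throwaway rule can simply echo them instead of compiling anything. The sketch below assumes GNU make (it uses a pattern rule, which is a GNU feature); blah.c is an empty placeholder, and the recipe never creates blah.o.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
tab=$(printf '\t')            # recipe lines must begin with a real tab

touch blah.c                  # the rule only needs the file to exist

# A pattern rule whose recipe just reports the two variables.
cat > Makefile <<EOF
%.o: %.c
${tab}@echo "target=\$@ basename=\$*"
EOF

report=$(make blah.o)
printf '%s\n' "$report"       # prints: target=blah.o basename=blah
```

The leading @ on the recipe line keeps make from echoing the command itself, so only the expanded values are printed.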
NOTE Keep in mind that GNU make has many extensions, built-in rules, and features that other variants do not have. This is fine as long as you’re running Linux, but if you step off onto a Solaris or BSD machine and expect the same stuff to work, you might be in for a surprise. However, that’s the problem that multi-platform build systems such as GNU autotools solve. 注意 请记住,GNU make具有许多其他变体没有的扩展、内置规则和功能。 只要你在运行Linux,这没有问题,但是如果你切换到Solaris或BSD机器并期望相同的东西能够工作,你可能会感到惊讶。 然而,这就是GNU autotools等多平台构建系统解决的问题。
Most Makefiles contain several standard targets that perform auxiliary tasks related to compiles.
大多数Makefile包含几个执行与编译相关的辅助任务的标准目标。
o clean The clean target is ubiquitous; a make clean usually instructs make to remove all of the object files and executables so that you can make a fresh start or pack up the software. Here’s an example rule for the myprog Makefile:
clean:
	rm -f $(OBJS) myprog
o distclean A Makefile created by way of the GNU autotools system always has a distclean target to remove everything that wasn’t part of the original distribution, including the Makefile. You’ll see more of this in Chapter 16. On very rare occasions, you might find that a developer opts not to remove the executable with this target, preferring something like realclean instead.
o install Copies files and compiled programs to what the Makefile thinks is the proper place on the system. This can be dangerous, so always run a make -n install first to see what will happen without actually running any commands.
o test or check Some developers provide test or check targets to make sure that everything works after performing a build.
o depend Creates dependencies by calling the compiler with -M to examine the source code. This is an unusual-looking target because it often changes the Makefile itself. This is no longer common practice, but if you come across some instructions telling you to use this rule, make sure to do so.
o all Often the first target in the Makefile. You’ll often see references to this target instead of an actual executable.
Even though there are many different Makefile styles, most programmers adhere to some general rules of thumb. For one, in the first part of the Makefile (inside the macro definitions), you should see libraries and includes grouped according to package:
尽管有许多不同的Makefile风格,大多数程序员都遵循一些通用的规则。
首先,在Makefile的第一部分(宏定义内部),你应该看到按照包进行分组的库和包含文件:
MYPACKAGE_INCLUDES=-I/usr/local/include/mypackage
MYPACKAGE_LIB=-L/usr/local/lib/mypackage -lmypackage
PNG_INCLUDES=-I/usr/local/include
PNG_LIB=-L/usr/local/lib -lpng
Each type of compiler and linker flag often gets a macro like this:
每种编译器和链接器标志通常都有一个类似这样的宏:
CFLAGS+=$(MYPACKAGE_INCLUDES) $(PNG_INCLUDES)
LDFLAGS+=$(MYPACKAGE_LIB) $(PNG_LIB)
Object files are usually grouped according to executables. For example, say you have a package that creates executables called boring and trite. Each has its own .c source file and requires the code in util.c. You might see something like this:
目标文件通常按照可执行文件进行分组。
例如,假设你有一个创建名为boring和trite的可执行文件的包。
每个可执行文件都有自己的.c源文件,并且需要util.c中的代码。你可能会看到类似这样的内容:
UTIL_OBJS=util.o
BORING_OBJS=$(UTIL_OBJS) boring.o
TRITE_OBJS=$(UTIL_OBJS) trite.o
PROGS=boring trite
The rest of the Makefile might look like this:
Makefile的其余部分可能如下所示:
all: $(PROGS)
boring: $(BORING_OBJS)
	$(CC) -o $@ $(BORING_OBJS) $(LDFLAGS)
trite: $(TRITE_OBJS)
	$(CC) -o $@ $(TRITE_OBJS) $(LDFLAGS)
You could combine the two executable targets into one rule, but it’s usually not a good idea to do so because you would not easily be able to move a rule to another Makefile, delete an executable, or group executables differently. Furthermore, the dependencies would be incorrect: If you had just one rule for boring and trite, trite would depend on boring.c, boring would depend on trite.c, and make would always try to rebuild both programs whenever you changed one of the two source files.
您可以将两个可执行文件目标合并为一条规则,但这样做通常不是个好主意,因为您将不容易把规则移到另一个 Makefile 中、删除某个可执行文件或以不同方式对可执行文件分组。
此外,依赖关系也会不正确:如果 boring 和 trite 只有一条规则,trite 会依赖 boring.c,boring 会依赖 trite.c,而且每当你修改这两个源文件中的任何一个时,make 都会尝试重建这两个程序。
NOTE If you need to define a special rule for an object file, put the rule for the object file just above the rule that builds the executable. If several executables use the same object file, put the object rule above all of the executable rules. 注意 如果需要为对象文件定义特殊规则,请将对象文件的规则放在构建可执行文件的规则的上方。 如果多个可执行文件使用同一对象文件,则应将对象规则置于所有可执行文件规则之上。
The standard debugger on Linux systems is gdb; user-friendly frontends such as the Eclipse IDE and Emacs systems are also available. To enable full debugging in your programs, run the compiler with -g to write a symbol table and other debugging information into the executable. To start gdb on an executable named program, run
Linux系统上的标准调试器是gdb;还可以使用诸如Eclipse IDE和Emacs等用户友好的前端。
为了在程序中启用完整的调试功能,可以使用-g选项运行编译器,将符号表和其他调试信息写入可执行文件中。
要在名为program的可执行文件上启动gdb,运行以下命令:
$ gdb program
You should get a (gdb) prompt. To run program with the command-line argument options, enter this at the (gdb) prompt:
您应该会得到一个(gdb)提示符。要使用命令行参数选项运行程序,请在(gdb)提示符处输入以下内容:
(gdb) run options
If the program works, it should start, run, and exit as normal. However, if there’s a problem, gdb stops, prints the failed source code, and throws you back to the (gdb) prompt. Because the source code fragment often hints at the problem, you’ll probably want to print the value of a particular variable that the trouble may be related to. (The print command also works for arrays and C structures.)
如果程序正常工作,它应该像平常一样启动、运行和退出。
然而,如果出现问题,gdb会停止运行,打印出错误的源代码,并将您带回(gdb)提示符。
由于源代码片段通常暗示了问题所在,您可能希望打印与问题可能相关的特定变量的值。
(print命令也适用于数组和C结构。)
(gdb) print variable
To make gdb stop the program at any point in the original source code, use the breakpoint feature. In the following command, file is a source code file, and line_num is the line number in that file where gdb should stop:
要让gdb在原始源代码的任意位置停止程序,可以使用断点功能。
在下面的命令中,file是源代码文件,line_num是gdb应该停止的文件中的行号:
(gdb) break file:line_num
To tell gdb to continue executing the program, use
要告诉gdb继续执行程序,请使用以下命令:
(gdb) continue
To clear a breakpoint, enter
要清除断点,请输入以下命令:
(gdb) clear file:line_num
This section has provided only the briefest introduction to gdb, which includes an extensive manual that you can read online, in print, or buy as Debugging with GDB, 10th edition, by Richard M. Stallman et al. (GNU Press, 2011). The Art of Debugging by Norman Matloff and Peter Jay Salzman (No Starch Press, 2008) is another guide to debugging.
本节只是对gdb进行了最简单的介绍,它还包括一份详尽的手册,您可以在线阅读,打印出来,或者购买Richard M. Stallman等人的《Debugging with GDB, 10th edition》(GNU Press,2011)。
Norman Matloff和Peter Jay Salzman的《The Art of Debugging》(No Starch Press,2008)也是一本关于调试的指南。
NOTE If you’re interested in rooting out memory problems and running profiling tests, try Valgrind ( http://valgrind.org/). 注意 如果您对查找内存问题和运行性能分析测试感兴趣,请尝试Valgrind(http://valgrind.org/)。
You might encounter Lex and Yacc when compiling programs that read configuration files or commands. These tools are building blocks for programming languages.
当编译读取配置文件或命令的程序时,你可能会遇到Lex和Yacc。这些工具是编程语言的构建模块。
o Lex is a tokenizer that transforms text into numbered tags with labels. The GNU/Linux version is named flex. You may need a -ll or -lfl linker flag in conjunction with Lex.
o Yacc is a parser that attempts to read tokens according to a grammar. The GNU parser is bison; to get Yacc compatibility, run bison -y. You may need the -ly linker flag.
A long time ago, the average Unix systems manager didn’t have to worry much about scripting languages other than the Bourne shell and awk. Shell scripts (discussed in Chapter 11) continue to be an important part of Unix, but awk has faded somewhat from the scripting arena. However, many powerful successors have arrived, and many systems programs have actually switched from C to scripting languages (such as the sensible version of the whois program). Let’s look at some scripting basics.
很久以前,普通的Unix系统管理员对于除了Bourne shell和awk之外的脚本语言并不需要太多担心。Shell脚本(在第11章讨论)仍然是Unix的重要组成部分,但awk在脚本领域中有些衰落。
然而,许多强大的继任者已经出现,许多系统程序实际上已经从C语言切换到脚本语言(例如whois程序的合理版本)。
让我们来看一些脚本的基础知识。
The first thing you need to know about any scripting language is that the first line of a script looks like the shebang of a Bourne shell script. For example, a Python script starts out like this:
关于任何脚本语言,你需要知道的第一件事是脚本的第一行看起来像Bourne shell脚本的shebang。
例如,Python脚本的开头是这样的:
#!/usr/bin/python
Or like this:
或者这样:
#!/usr/bin/env python
In Unix, any executable text file that starts with #! is a script. The pathname following this prefix is the scripting language interpreter executable. When Unix tries to run an executable file that starts with a #! shebang, it runs the program following the #! and passes it the script’s pathname as an argument, so the interpreter reads the file itself. Therefore, even this is a script:
在Unix中,任何以#!开头的可执行文本文件都被视为脚本。
在这个前缀之后的路径名是脚本语言解释器的可执行文件。当Unix尝试运行以#!开头的可执行文件时,它会运行#!之后的程序,并把脚本的路径名作为参数传给该程序,由解释器自己读取这个文件。
因此,即使是下面这样的文件也是一个脚本:
#!/usr/bin/tail -2
This program won't print this line,
but it will print this line...
and this line, too.
这个程序不会打印这一行, 但它会打印这一行... 还有这一行。
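The tail script can be tried out with a sketch like the one below. It looks up the location of tail with command -v rather than assuming /usr/bin/tail, and it assumes that the scratch directory allows executing files and that the local tail accepts the -2 shorthand.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

tail_path=$(command -v tail)  # look up tail instead of assuming /usr/bin/tail

# The first line names tail as the "interpreter" with one argument, -2.
cat > myscript <<EOF
#!${tail_path} -2
This program won't print this line,
but it will print this line...
and this line, too.
EOF
chmod +x myscript

script_output=$(./myscript)   # the system effectively runs: tail -2 ./myscript
printf '%s\n' "$script_output"
```

Only the last two lines of the file come out, because tail is handed the script's own pathname and prints its final two lines.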
The first line of a shell script often contains one of the most common basic script problems: an invalid path to the scripting language interpreter. For example, say you named the previous script myscript. What if tail were actually in /bin instead of /usr/bin on your system? In that case, running myscript would produce this error:
Shell脚本的第一行通常包含最常见的基本脚本问题之一:对脚本语言解释器的路径设置错误。例如,假设你将前面的脚本命名为myscript。如果tail实际上在你的系统上的/bin而不是/usr/bin中,那么运行myscript将产生以下错误:
bash: ./myscript: /usr/bin/tail: bad interpreter: No such file or directory
Don’t expect more than one argument in the script’s first line to work. That is, the -2 in the preceding example might work, but if you add another argument, the system could decide to treat the -2 and the new argument as one big argument, spaces and all. This can vary from system to system; don’t test your patience on something as insignificant as this.
不要期望在脚本的第一行中使用多个参数能够正常工作。
也就是说,前面的例子中的-2可能有效,但如果你添加另一个参数,系统可能会将-2和新的参数视为一个大参数,包括空格在内。这可能因系统而异;不要在这种微不足道的事情上浪费耐心。
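One way to watch the arguments being merged is to use an interpreter that just lists what it receives. The sketch below assumes a Linux kernel, which passes everything after the interpreter path as a single argument and also allows the interpreter to be a script itself; showargs and twoargs are hypothetical names.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# showargs is a hypothetical helper that reports each argument it receives.
cat > showargs <<'EOF'
#!/bin/sh
echo "argument count: $#"
for arg in "$@"; do
    echo "[$arg]"
done
EOF
chmod +x showargs

# A script whose first line tries to pass two options to its interpreter.
printf '#!%s/showargs -a -b\n' "$workdir" > twoargs
chmod +x twoargs

args_seen=$(./twoargs)
printf '%s\n' "$args_seen"
```

On Linux, showargs reports two arguments: the combined string "-a -b" and the script's pathname. Other systems may split the line differently, which is why relying on more than one argument is fragile.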
Now, let’s look at a few of the languages out there.
现在,让我们来看看一些现有的语言。
Python is a scripting language with a strong following and an array of powerful features, such as text processing, database access, networking, and multithreading. It has a powerful interactive mode and a very organized object model.
Python是一种具有强大功能的脚本语言,拥有广泛的用户群体和一系列强大的功能,如文本处理、数据库访问、网络和多线程。
它具有强大的交互模式和非常有组织的对象模型。
Python’s executable is python, and it’s usually in /usr/bin. However, Python isn’t used just from the command line for scripts. One place you’ll find it is as a tool to build websites. Python Essential Reference, 4th edition, by David M. Beazley (Addison-Wesley, 2009) is a great reference with a small tutorial at the beginning to get you started.
Python的可执行文件是python,通常位于/usr/bin目录下。
然而,Python不仅仅用于命令行脚本。
你会发现它在构建网站的工具中也得到了应用。
《Python基础参考》(第4版),作者David M. Beazley(Addison-Wesley,2009)是一本很好的参考书,书的开头还有一个小教程,帮助你入门。
One of the older third-party Unix scripting languages is Perl. It’s the original “Swiss army chainsaw” of programming tools. Although Perl has lost a fair amount of ground to Python in recent years, it excels in particular at text processing, conversion, and file manipulation, and you may find many tools built with it. Learning Perl, 6th edition, by Randal L. Schwartz, brian d foy, and Tom Phoenix (O’Reilly, 2011) is a tutorial-style introduction; a larger reference is Modern Perl by Chromatic (Onyx Neon Press, 2014).
Perl是较早的第三方Unix脚本语言之一。它是编程工具中最早的“瑞士军刀”。
尽管Perl在最近几年中失去了一定的市场份额,但在文本处理、转换和文件操作方面表现出色,并且你可能会发现许多使用它构建的工具。
《学习Perl》(第6版),作者Randal L. Schwartz、brian d foy和Tom Phoenix(O'Reilly,2011)是一本以教程形式介绍的入门书籍;
更详细的参考书是Chromatic的《现代Perl》(Onyx Neon Press,2014)。
You might also encounter these scripting languages:
你还可能遇到以下脚本语言:
o PHP. This is a hypertext-processing language often found in dynamic web scripts. Some people use PHP for standalone scripts. The PHP website is at http://www.php.net/.
o Ruby. Object-oriented fanatics and many web developers enjoy programming in this language (http://www.ruby-lang.org/).
o JavaScript. This language is used inside web browsers primarily to manipulate dynamic content. Most experienced programmers shun it as a standalone scripting language due to its many flaws, but it’s nearly impossible to avoid when doing web programming. You might find an implementation called Node.js with an executable name of node on your system.
JavaScript.这种语言主要用于在Web浏览器内部操作动态内容。
大多数经验丰富的程序员都不喜欢它作为一个独立的脚本语言,因为它有很多缺点,但在Web编程中几乎无法避免。
你可能会在你的系统上找到一个叫做Node.js的实现,可执行文件名为node。
o Emacs Lisp. A variety of the Lisp programming language used by the Emacs text editor.
Emacs Lisp.这是Emacs文本编辑器使用的一种Lisp编程语言的变体。
o Matlab, Octave. Matlab is a commercial matrix and mathematical programming language and library. There is a very similar free software project called Octave.
Matlab, Octave. Matlab是一种商业矩阵和数学编程语言和库。
还有一个非常相似的免费软件项目叫做Octave。
o R. A popular free statistical analysis language. See http://www.r-project.org/ and The Art of R Programming by Norman Matloff (No Starch Press, 2011) for more information.
R.一种流行的免费统计分析语言。更多信息请参见http://www.r-project.org/和Norman Matloff的《R编程艺术》(No Starch Press,2011)。
o Mathematica. Another commercial mathematical programming language with libraries.
o m4. This is a macro-processing language, usually found only in the GNU autotools.
Mathematica.另一种带有库的商业数学编程语言。
m4.这是一种宏处理语言,通常只在GNU autotools中出现。
o Tcl. Tcl (tool command language) is a simple scripting language usually associated with the Tk graphical user interface toolkit and Expect, an automation utility. Although Tcl does not enjoy the widespread use that it once did, don’t discount its power. Many veteran developers prefer Tk, especially for its embedded capabilities. See http://www.tcl.tk/ for more on Tk.
Tcl. Tcl(工具命令语言)是一种简单的脚本语言,通常与Tk图形用户界面工具包和Expect自动化工具相关联。
尽管Tcl不再像过去那样广泛使用,但不要低估它的功能。许多资深开发人员喜欢Tk,特别是因为它的嵌入能力。
更多关于Tk的信息,请参见http://www.tcl.tk/。
Java is a compiled language like C, with a simpler syntax and powerful support for object-oriented programming. It has a few niches in Unix systems. For one, it’s often used as a web application environment, and it’s popular for specialized applications. For example, Android applications are usually written in Java. Even though it’s not often seen on a typical Linux desktop, you should know how Java works, at least for standalone applications.
Java是一种编译语言,类似于C语言,语法更简单,支持面向对象编程。
它在Unix系统中有一些特定的应用场景。首先,它经常被用作Web应用程序开发环境,并且在专门的应用程序中很受欢迎。
例如,Android应用通常是用Java编写的。
即使在典型的Linux桌面系统中不常见,你也应该了解Java的工作原理,至少对于独立应用程序来说。
There are two kinds of Java compilers: native compilers for producing machine code for your system (like a C compiler) and bytecode compilers for use by a bytecode interpreter (sometimes called a virtual machine, which is different from the virtual machine offered by a hypervisor, as described in Chapter 17). You’ll practically always encounter bytecode on Linux.
有两种类型的Java编译器:本地编译器用于生成适用于系统的机器代码(类似于C编译器),字节码编译器用于字节码解释器(有时称为虚拟机,与第17章中描述的虚拟机不同)。
在Linux上,你几乎总是会遇到字节码。
Java bytecode files end in .class. The Java runtime environment (JRE) contains all of the programs you need to run Java bytecode. To run the bytecode in file.class, give java the class name without the .class suffix:
Java 字节码文件的扩展名是 .class。
Java 运行时环境(JRE)包含了运行 Java 字节码所需的所有程序。要运行 file.class 中的字节码,请把不带 .class 后缀的类名传给 java:
$ java file
You might also encounter bytecode files that end in .jar, which are collections of archived .class files. To run a .jar file, use this syntax:
有时你可能会遇到以 .jar 结尾的字节码文件,这些文件是包含归档的 .class 文件的集合。
要运行一个 .jar 文件,可以使用以下语法:
$ java -jar file.jar
Sometimes you need to set the JAVA_HOME environment variable to your Java installation prefix. If you’re really unlucky, you might need to use CLASSPATH to include any directories containing classes that your program expects. This is a colon-delimited set of directories like the regular PATH variable for executables.
有时候你需要设置 JAVA_HOME 环境变量为你的 Java 安装前缀。
如果你很不幸,可能需要使用 CLASSPATH 来包含任何包含程序所需类的目录。
这是一个由冒号分隔的目录集合,类似于用于可执行文件的常规 PATH 变量。
If you need to compile a .java file into bytecode, you need the Java Development Kit (JDK). You can run the javac compiler from JDK to create some .class files:
如果你需要将一个 .java 文件编译成字节码,你需要使用 Java 开发工具包(JDK)。
你可以使用 JDK 中的 javac 编译器来创建一些 .class 文件:
$ javac file.java
JDK also comes with jar, a program that can create and pick apart .jar files. It works like tar.
JDK 还带有 jar,一个可以创建和解压 .jar 文件的程序。它的功能类似于 tar。
The world of compilers and scripting languages is vast and constantly expanding. As of this writing, new compiled languages such as Go (golang) and Swift are gaining popularity.
编译器和脚本语言的世界是广阔的,而且不断扩展。
截至本文撰写时,新的编译语言如Go(golang)和Swift正日渐流行。
The LLVM compiler infrastructure set (http://llvm.org/) has significantly eased compiler development. If you’re interested in how to design and implement a compiler, two good books are Compilers: Principles, Techniques, and Tools, 2nd edition, by Alfred V. Aho et al. (Addison-Wesley, 2006) and Modern Compiler Design, 2nd edition, by Dick Grune et al. (Springer, 2012). For scripting language development, it’s usually best to look for online resources, as the implementations vary widely.
LLVM编译器基础设施集(http://llvm.org/)大大简化了编译器开发。
如果你对如何设计和实现编译器感兴趣,可以参考Alfred V. Aho等人编写的《编译原理》第二版(Addison-Wesley, 2006)和Dick Grune等人编写的《现代编译器设计》第二版(Springer, 2012)这两本好书。
对于脚本语言开发,最好查找在线资源,因为实现方式各不相同。
Now that you know the basics of the programming tools on the system, you’re ready to see what they can do. The next chapter is all about how you can build packages on Linux from source code.
现在你已经了解了系统上的编程工具基础知识,接下来可以看看它们能做什么了。下一章将详细介绍如何在Linux上从源代码构建软件包。