1.2. Package Management: How to Do It?

Well, all that sounds great — easy install, upgrade, and deletion of packages; getting package information presented several different ways; making sure packages are installed correctly; and even tracking changes to config files. But how do you do it?

As mentioned above, the obvious answer is to let the computer do it. Many groups have tried to create package management software. There are two basic approaches:

  1. Some package management systems concentrate on the specific steps required to manipulate a package.

  2. Other package management systems take a different approach, keeping track of the files on the system and manipulating packages by concentrating on the files involved.

Each approach has its good and bad points. In the first method, it's easy to install new packages, somewhat difficult to remove old ones, and almost impossible to obtain any meaningful information about installed packages.

The second method makes it easy to obtain information about installed packages, and fairly easy to install and remove packages. The main problem with this method is that there may not be a well-defined way to execute any commands required during the installation or removal process.
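To make the second approach concrete, here is a minimal sketch of a file-tracking package database. It is a toy illustration only, not the design of RPM or any of its ancestors; the class name PackageDB, its methods, and the package and file names are all hypothetical. It shows why querying and removal come easily once every file's owner is recorded, and it also shows the weakness described above: nothing in it provides a place to run commands during installation or removal.

```python
"""Toy file-tracking package database (illustrative only, not RPM's design)."""


class PackageDB:
    def __init__(self):
        # Maps a package name to the set of file paths that package owns.
        self.files_by_package = {}

    def install(self, name, files):
        """Record a package and the files it owns."""
        if name in self.files_by_package:
            raise ValueError(f"{name} is already installed")
        self.files_by_package[name] = set(files)

    def remove(self, name):
        """Forget a package and return the files that are safe to delete."""
        owned = self.files_by_package.pop(name)
        # Keep any file that another installed package also claims.
        still_needed = set().union(*self.files_by_package.values())
        return sorted(owned - still_needed)

    def owner(self, path):
        """Return the names of the packages that own a given file."""
        return sorted(
            name for name, files in self.files_by_package.items() if path in files
        )


# Hypothetical packages that share one configuration file.
db = PackageDB()
db.install("editor", ["/usr/bin/edit", "/etc/edit.conf"])
db.install("viewer", ["/usr/bin/view", "/etc/edit.conf"])

print(db.owner("/etc/edit.conf"))  # both packages claim this file
print(db.remove("editor"))         # only files no other package still needs
```

Because the database knows which files belong to which package, removing "editor" leaves the shared configuration file alone while "viewer" is still installed.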

In practice, no package management system uses one approach exclusively; all are a mixture of the two. The exact mix and design goals will dictate how well a particular package management system meets the needs of the people using it. At the time Red Hat started work on their Linux distribution, there were a number of package management systems in use, each with a different approach to making package management easier.

1.2.1. Ancestors of RPM

Since this is a book on the Red Hat Package Manager, a good way to see what RPM is all about is to look at the package management software that preceded RPM.

1.2.1.1. RPP

RPP was used in the first Red Hat Linux distributions. Many of RPP's features would be recognizable to anyone who has worked with RPM. Some of these innovative features are:

While RPP possessed several of the features that were important enough to continue on as parts of RPM today, it had some weaknesses, too:

  • It didn't use "pristine sources". Every program is made up of programming language statements stored in files. This source code is later translated into a binary language that the computer can execute. In the case of RPP, its packages were based on source code that had been modified specifically for RPP, hence the sources weren't pristine. This is a bad idea for a number of fairly technical reasons. Not using pristine sources made it difficult for package developers to keep things straight, particularly if they were building hundreds of different packages.

  • It couldn't guarantee executables were built from packaged sources. The process of building a package for RPP was such that there was no way to ensure the executable programs were built from the source code contained in an RPP source package. Once again, this was a problem for package builders, especially those who had large numbers of packages to build.

  • It had no support for multiple architectures. As people started using RPP, it became obvious that package managers unable to simplify the process of building packages for more than one architecture, or type of computer, were going to be at a disadvantage. This was a problem, particularly for Red Hat, as they were starting to look at the possibility of creating Linux distributions for other architectures, such as the Digital Alpha.

Even with these problems, RPP was one of the things that made the first Red Hat Linux distributions unique. Its ability to simplify the process of installing software was a real boon to many of Red Hat's customers, particularly those with little experience in Linux.

1.2.1.4. RPM Version 1

With two major forays into package management behind them, Marc Ewing and Erik Troan went to work on a third attempt. This one would be called the Red Hat Package Manager, or RPM.

Although it built on the experiences of PM, PMS, and RPP, RPM was quite different under the hood. Written in the Perl programming language for fast development, RPM version 1 focused on addressing the flaws of its ancestors. In some cases, the flaws were eliminated, while in others, the problems remained.

Some of the successes of RPM version 1 were:

But RPM version 1 wasn't perfect. There were a number of flaws, some of them major:

  • It was slow. While the use of Perl made RPM's development proceed more quickly, it also meant that RPM wouldn't run as quickly as it would have, had it been written in C.

  • Its database design was fragile. Unfortunately, database problems were not unusual under RPM version 1. While the approach of dedicating a database to package management was a good idea, the implementation used in RPM version 1 left a lot to be desired.

  • It was big. This was another artifact of using Perl. Normally, RPM's size requirements were not an issue, except in one area. When performing an initial system install, RPM was run from a small, floppy-based system environment. The need to have Perl available meant space on the boot floppies was always a problem.

  • It didn't support multiple architectures (types of computers) well. The need to have a package manager support more than one type of computer hadn't been acknowledged before. With RPM version 1, an initial stab was taken at the problem, but the implementation was incomplete. Nonetheless, RPM had been ported to a number of other computer systems. It was becoming obvious that the issue of multi-architecture support was not going away and had to be addressed.

  • The package file format wasn't extensible. This made it very difficult to add functionality, since any change to the file format would cause older versions of RPM to break.
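The last point, extensibility, can be illustrated with a small sketch. What follows is not RPM's actual file format in any version; it is a generic tagged-record layout, one common way to make a format extensible, and the tag numbers, field names, and function names are all hypothetical. Each record carries a tag and a length, so a reader that does not recognize a tag can skip the record rather than fail.

```python
import struct

# Each record is a 2-byte tag and a 4-byte payload length (big-endian),
# followed by the payload itself. This layout is an assumption for the sketch.
PREFIX = ">HI"
PREFIX_SIZE = struct.calcsize(PREFIX)


def pack_entries(entries):
    """Serialize (tag, payload) pairs into one byte string."""
    out = bytearray()
    for tag, payload in entries:
        out += struct.pack(PREFIX, tag, len(payload))
        out += payload
    return bytes(out)


def read_known(blob, known_tags):
    """Return payloads for recognized tags, skipping unrecognized ones."""
    found = {}
    offset = 0
    while offset < len(blob):
        tag, length = struct.unpack_from(PREFIX, blob, offset)
        offset += PREFIX_SIZE
        if tag in known_tags:
            found[known_tags[tag]] = blob[offset:offset + length]
        # Advance past the payload whether or not the tag was recognized.
        offset += length
    return found


# A "newer" writer adds tag 99, which the "older" reader has never seen.
blob = pack_entries([(1, b"editor"), (99, b"future data"), (2, b"1.0")])
old_reader_tags = {1: "name", 2: "version"}
print(read_known(blob, old_reader_tags))
```

Here the older reader still recovers the name and version even though the data contains a record it knows nothing about. A fixed layout with no tags or lengths gives a reader no way to do that.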

Even though their Linux distribution was a success, and RPM was much of the reason for it, Marc and Erik knew that some changes were going to be necessary to carry RPM to the next level.