Building RPM packages with rpmbuild, Koji, and GitLab-CI

The RPM system lets users query, install, and update software packages. It also allows examining package interdependencies and verifying package file permissions. This blog post describes the process of building an RPM package using the rpmbuild utility and then explains how to schedule build tasks using Koji. Finally, it describes how to automate the build pipeline using continuous integration in GitLab.

1. RPM Package Manager

RPM Package Manager is an open-source package management system that was originally designed for Red Hat Linux and is now used by many Linux distributions. RPM packages generally come in two types:

Binary RPM: A binary RPM contains the compiled binaries of a complete application (or a library). It usually targets a particular architecture and may therefore not be installable on all platforms. For example, an RPM compiled for the x86 architecture will not be compatible with the ARM architecture, and vice versa. Some binary RPMs are architecture-independent, however, for example when the underlying application is written in a platform-independent language such as Python or Java; these are termed "noarch".
Source RPM: A source RPM (SRPM) contains the source code of an application, together with the spec file that holds the commands for creating its binary RPM on the target Linux environment. As the code in an SRPM is not compiled, the SRPM itself is platform-independent; the binary RPMs built from it target the architecture of the build machine (or are "noarch" if the package declares itself as such). SRPMs are generally more flexible, i.e. they allow modifying compile options, which can enable additional functionality.

Note: The Fedora project hosts extensive documentation on how to build RPMs.

1.1. RPM file components

There are four sections in an RPM file:

Lead: A fixed-size block at the start of the file that identifies it as an RPM and records the package name. It is largely kept for backward compatibility.
Signature: Used to verify the integrity and authenticity of the package. It holds digests (such as MD5 or SHA) and, optionally, a cryptographic signature (such as PGP/GPG), computed over the header and payload sections of the file.
Header: The copyright information, package version numbers, package description, etc.
Payload: The actual contents of the package, which are decompressed when the package is installed. The data in this section is compressed (traditionally with gzip; newer packages may use other compressors such as xz). Upon decompression, the data is in cpio format.
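
As a small illustration of the file layout, an RPM file can be recognised by the four-byte magic number (ed ab ee db) at its very start. The sketch below checks for it using only od and tr. The function name is_rpm_file and the stand-in sample file are assumptions made for this example and are not part of any RPM tooling.

```shell
#!/bin/sh
# is_rpm_file is a hypothetical helper: it succeeds if the given file
# starts with the RPM magic bytes ed ab ee db.
is_rpm_file() {
    magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
    [ "$magic" = "edabeedb" ]
}

# Build a stand-in file that only carries the magic number, so that the
# check can be tried without a real package at hand.
SAMPLE="$(mktemp)"
printf '\355\253\356\333' > "$SAMPLE"

if is_rpm_file "$SAMPLE"; then
    echo "looks like an RPM file"
else
    echo "not an RPM file"
fi
```

On a real package, the payload can be listed with rpm2cpio <package>.rpm | cpio -t, which converts the payload to a cpio stream and prints its table of contents.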

1.2. RPM build procedure

Building an RPM can be accomplished using the rpmbuild utility. In most cases, we only require the package source code and a spec file. The spec file is tailored for each package and contains the recipe for building the package.

rpmbuild works with the following directories, which live under its top-level build directory:

BUILD: The directory in which the software is unpacked and built.
RPMS: Binary RPM storage directory.
SOURCES: The package source code (e.g. as a tarball).
SPECS: The spec file(s) for one or more RPMs.
SRPMS: Source RPM storage directory.

Note: It is possible to keep the package source code outside of the SOURCES directory, for example by overriding the _sourcedir macro on the rpmbuild command line (as done in section 3.3).
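
The directory layout can be created by hand before the first build. The following is a minimal sketch; the TOPDIR location is an assumption chosen so that the example does not touch an existing build tree. The rpmdev-setuptree command from the rpmdevtools package creates an equivalent tree in the user's home directory.

```shell
#!/bin/sh
# TOPDIR is a hypothetical location for this sketch; a temporary
# directory is used unless the caller sets TOPDIR explicitly.
TOPDIR="${TOPDIR:-$(mktemp -d)/rpmbuild}"

# Create the five directories described above.
for dir in BUILD RPMS SOURCES SPECS SRPMS; do
    mkdir -p "$TOPDIR/$dir"
done

# Show the resulting layout.
ls "$TOPDIR"
```

rpmbuild can then be pointed at this tree with --define "_topdir <path>".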

The essence of the RPM build process lies in the spec file, which contains information about the package, such as its version (and build) number and its changelog. This information can later be queried using the rpm command. In addition to this metadata, the spec file contains the instructions to build, install, and clean up the package. The sections of the spec file are explained below.

1.2.1. Preamble section

This section contains the metadata of the package, which can later be queried using the rpm -qi <package-name> command. An example section is shown below:

%{!?dist: %define dist .el7.cern}

Name:           <package name>
Version:        <version number>
Release:        <release number>

Summary:        <package summary>
Group:          <package group to which it belongs>
License:        <package license policy>
URL:            <package homepage>
Vendor:         <package vendor>
Requires:       <package dependencies>
BuildRoot:      <build root directory>
BuildArch:      <build architecture or "noarch">

Source0:        <package source files>
Source1:        ...
...

Patch0:         <package patch files>
Patch1:         ...
...

%description
<package description>

Note: The RPM spec file provides numerous system and user-defined macros. A macro is referenced as %{<macro name>} or %<macro name>; the curly braces are optional but avoid ambiguity. Built-in directives, such as %define, are conventionally written without the braces. Section 1.3 provides a more detailed overview of the available macros.

1.2.2. Prep section

The prep (or prepare) section precedes the build section and contains the commands that get the sources ready for the build. If the source is provided as a tarball, the prep section is responsible for extracting it.

%prep

# This step extracts the tarball.
%setup -cq

The %setup macro is used for unpacking the original sources in preparation for the build and can take the following options:

-n <name>: Name of the software build directory.
-q: Suppress the displaying of files when unpacking sources.
-c: Create the top-level build directory before unpacking the sources.
-D: Do not delete the build directory prior to unpacking the sources.
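-T: Disable the default unpacking of the first source. It is typically used in combination with -a <n> (unpack source n after changing into the build directory) or -b <n> (unpack source n before changing into the build directory), for example: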

%setup -D -T -b 2
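
To make the effect of these options more concrete, the sketch below spells out roughly what %setup -c -q -n demo-1.0 does, written as plain shell commands. It is an approximation for illustration only; the package name demo-1.0, the directories, and the generated tarball are assumptions of this example.

```shell
#!/bin/sh
# Approximate shell equivalent of "%setup -c -q -n demo-1.0".
# All paths and names below are hypothetical stand-ins.
WORK="$(mktemp -d)"
SOURCEDIR="$WORK/SOURCES"
BUILDDIR="$WORK/BUILD"
mkdir -p "$SOURCEDIR" "$BUILDDIR" "$WORK/src"

# Create a small tarball to play the role of Source0.
echo "demo" > "$WORK/src/README.md"
tar -czf "$SOURCEDIR/demo-1.0.tar.gz" -C "$WORK/src" README.md

cd "$BUILDDIR" || exit 1
rm -rf demo-1.0      # this removal is what -D would skip
mkdir -p demo-1.0    # created up front because of -c
cd demo-1.0 || exit 1

# Unpack without listing the files, matching -q.
tar -xzf "$SOURCEDIR/demo-1.0.tar.gz"
ls
```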

1.2.3. Build section

This section contains the commands to build the application. Since the build instructions usually live in a separate file (such as a Makefile), this section is often short and simply invokes them; for packages that need no compilation it can be empty.

%build

# Commands to build the application.
...

1.2.4. Install section

The install section is responsible for installing the application into the build root. In principle, this section should first delete the build root, i.e. remove files left over from a previous build. The example below removes the build root and then copies the build files of the package to %{buildroot}.

%install

# Remove the build root.
rm -rf %{buildroot}

# Create the systemd unit directory inside the build root.
mkdir -p %{buildroot}/%{_unitdir}

# Move the <package>.service file.
mv application/%{name}.service %{buildroot}/%{_unitdir}

# Move the configuration files.
mkdir -p %{buildroot}/etc/<package>
mv conf/config.sample %{buildroot}/etc/<package>

# Copy the build files.
mkdir -p %{buildroot}/opt/<package>
cp -R * %{buildroot}/opt/<package>

1.2.5. Clean section

This section removes the files that the previous sections placed in the build root.

%clean

rm -rf %{buildroot}

1.2.6. Files section

The files section lists the files that belong to the package. It is also used for marking configuration and documentation files, and for setting file permissions and ownership.

%files

# Mark the file as documentation.
%doc README.md

# Set the attributes for the following directories and file.
%defattr(<file mode>, <user>, <group>, <dir mode>)
/opt/<package>
%{_unitdir}/%{name}.service
%config(noreplace) /etc/<package>

The following directives can be used in this section:

%doc: Flags the filename(s) that follow as part of the documentation.
%config: Flags the specified file as being a configuration file.
%attr: Sets the file permissions, its owner, and its group. It has the following syntax: %attr(<mode>, <user>, <group>) <filename>
%defattr: Sets the default attributes for files and directories (it has a similar syntax as %attr).
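%ghost: Marks file(s) that are owned by the package but not shipped in its payload (such as log files).
%verify: Controls which file attributes are checked when the installed package is verified with rpm -V. It can take up to nine attributes (owner, group, mode, md5, size, maj, min, symlink, mtime), for example: %verify(mode md5 size maj min symlink mtime) <filename>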

Note: There is another section %changelog which is not mentioned here. It is used for keeping track of the changes made to the package, similar to Git history.
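
Putting the sections together, the sketch below writes a complete, minimal spec file and builds it when rpmbuild is available. The package name hello-sketch, its contents, and the temporary build location are all assumptions made for this example; it has no sources, so the prep and build sections stay empty.

```shell
#!/bin/sh
# Hypothetical package name and build location, chosen for this sketch.
PKG="hello-sketch"
TOPDIR="$(mktemp -d)/rpmbuild"
for dir in BUILD RPMS SOURCES SPECS SRPMS; do
    mkdir -p "$TOPDIR/$dir"
done
SPEC="$TOPDIR/SPECS/$PKG.spec"

# The quoted 'EOF' keeps the shell from touching the spec macros.
cat > "$SPEC" <<'EOF'
Name:           hello-sketch
Version:        1.0
Release:        1%{?dist}
Summary:        Minimal example package
License:        MIT
BuildArch:      noarch

%description
A hypothetical package used only to illustrate the spec file layout.

%prep
# Nothing to unpack, as this sketch has no Source0.

%build
# Nothing to compile.

%install
rm -rf %{buildroot}
mkdir -p %{buildroot}/opt/hello-sketch
echo "hello" > %{buildroot}/opt/hello-sketch/hello.txt

%files
%defattr(-,root,root,-)
/opt/hello-sketch
EOF

if command -v rpmbuild >/dev/null 2>&1; then
    # Build the binary RPM, then list the files it contains.
    rpmbuild -bb --define "_topdir $TOPDIR" "$SPEC" \
        && rpm -qpl "$TOPDIR"/RPMS/noarch/*.rpm
else
    echo "rpmbuild not found; spec file written to $SPEC"
fi
```

Overriding _topdir keeps the build away from any existing build tree, which is convenient for experiments.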

1.3. Spec file macros

The RPM system provides numerous built-in macros for working with spec files. These macros make a spec file more general: instead of hard-coding directory paths, the author refers to them through macros.

In addition to this, there also exist macros for debugging spec files. Some of these are listed below:

%dump: Prints out the macro values.
%{echo:message}: Prints message to stderr.
%{error:message}: Prints message to stderr and returns BADSPEC.

Note: It is possible to reference tags as macros in various sections of the spec file. For example, the "Name" tag can be referenced as %{name}.

1.3.1. Defining new macros

To make the package management process easier, RPM allows creating custom macros using the following syntax:

%define macro_name value

It also allows expanding the result of shell commands using the %(<command>) syntax. So to create a macro which holds the list of files present in the current directory, one could do:

%define list_files %(ls)

Custom-defined macros can later be referenced using the curly braces syntax. For example, the list_files macro will be referenced as %{list_files}.
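
Macro definitions can be tried out without writing a spec file, since the rpm command accepts --define and --eval on the command line. The sketch below assumes that rpm is installed and falls back to a message otherwise; the macro name pkg_prefix and its value are hypothetical.

```shell
#!/bin/sh
# pkg_prefix is a hypothetical macro name used only for this sketch.
if command -v rpm >/dev/null 2>&1; then
    # --define creates the macro, --eval prints an expansion.
    EXPANDED="$(rpm --define 'pkg_prefix /opt/demo' --eval '%{pkg_prefix}/bin')"
    # %(...) runs a shell command and substitutes its output.
    FROM_SHELL="$(rpm --eval '%(echo from-shell)')"
else
    EXPANDED="(rpm not installed)"
    FROM_SHELL="(rpm not installed)"
fi

echo "pkg_prefix expansion: $EXPANDED"
echo "shell expansion:      $FROM_SHELL"
```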

1.3.2. Passing parameters to macros

The macros can take one or more parameters, with the following syntax:

%define macro_name(options) value

These parameters can be accessed using the following directives:

%0: The macro name.
%*: All parameters to the macro.
%#: The number of passed parameters.
%1, %2, [...]: The first, second, and remaining parameters.
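
A parameterized macro can be tried out from the command line in the same way. In the sketch below, the macro name greet and its body are hypothetical; the empty parentheses declare that the macro accepts positional parameters but no options. The example assumes that rpm is installed and prints a fallback message otherwise.

```shell
#!/bin/sh
# "greet" is a hypothetical macro. Its body echoes back the macro name,
# the number of parameters, and the full parameter list.
MACRO='greet() name=%0 count=%# all=%*'

if command -v rpm >/dev/null 2>&1; then
    RESULT="$(rpm --define "$MACRO" --eval '%greet one two')" \
        || RESULT="(rpm could not expand the macro)"
else
    RESULT="(rpm not installed)"
fi

echo "$RESULT"
```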

2. Managing RPM builds with Koji

Koji is an RPM build system that was originally developed by the Fedora project, which still uses it as its main build system. It allows scheduling build tasks, provides build reproducibility, and versions the data. Under the hood, Koji uses Yum and Mock to create the builds, and it provides a web interface for numerous tasks, such as viewing and cancelling builds.

Note: The web server can be configured to create a repository with each new tag, so whenever a build is completed and tagged, a new repository is created.

The command line tool koji allows initiating the package build. It has the following syntax:

$ koji build [options] target <srpm path or scm url>

The example below shows how to initiate a build with some additional options:

$ koji --config=<config file path> build --scratch --wait target rpmbuild/SRPMS/<package name>.src.rpm

The options passed to this command are described below:

--config: Koji config file path (a Mock config can be created with koji mock-config).
--scratch: Performs a scratch build, i.e. builds the package without including it in a release.
--wait: Waits for the build to complete.

This command will create a new Koji task for this build, which can be tracked using the web interface. Another important task to perform is to tag the build (which allows organizing and filtering packages). The command takes the tag followed by the build to be tagged; older Koji releases call it tag-pkg:

$ koji tag-build <tag> <package name>
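
The scratch-build invocation from this section can be wrapped in a small script, which is convenient when the same command is later called from a CI job. The sketch below is a dry run by default and only prints the command; the variable names, the target my-target, and the SRPM path are placeholders of this example rather than values from an actual Koji setup.

```shell
#!/bin/sh
# All values below are placeholders for this sketch; replace them with
# the target and SRPM of an actual Koji setup.
KOJI_TARGET="${KOJI_TARGET:-my-target}"
SRPM="${SRPM:-rpmbuild/SRPMS/example.src.rpm}"
DRY_RUN="${DRY_RUN:-1}"

# Assemble the build command with the options described above.
CMD="koji build --scratch --wait $KOJI_TARGET $SRPM"

if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $CMD"
else
    # Word splitting is intended here; the placeholders contain no spaces.
    $CMD
fi
```

Setting DRY_RUN=0 runs the command for real, which requires a configured koji client.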

Note: Koji is made up of several components and provides numerous other features which are explained in the documentation.

3. Automating the build pipeline with continuous integration and deployment

Continuous integration (CI) is the practice of automatically building and testing an application whenever its code changes; automating the deployment step as well is known as continuous deployment. In the case of GitLab, the pipeline is described in the .gitlab-ci.yml file, which follows the YAML format and resides in the repository's root directory. It is triggered each time an update is made to the code base.

Note: Runners are the agents (often isolated virtual machines or containers) that execute the jobs defined in .gitlab-ci.yml. The GitLab documentation explains how a runner can be configured.

3.1. Defining variables and the base image

Variables make the file more general, as a value is defined once and can be reused by the jobs later on. The image keyword defines the base image to be used for running the pipeline.

image: <base image url>

variables:
 NAME: 'value'
 ...

3.2. Defining stages

Before defining the jobs, all stages must be explicitly defined. Stages run in sequence, while the jobs within a stage can run in parallel. For building RPMs, the stages can be defined as below:

stages:
 - build
 - package
 - test
 - koji
 - deploy

3.3. Defining jobs

A job defines a series of actions that must be performed when it is invoked. It can take numerous keywords as options, such as the job's stage, which determines where the job runs in the pipeline.

rpm_build_binary: # Name of the job.
 <<: *rpmbuild_deps # Merges in the keys of the rpmbuild_deps anchor (see section 3.4).
 script: # The commands to execute.
  - rpmbuild -bb $SPEC_FILE --define "dist $DIST" --define "_topdir $(pwd)/rpmbuild" --define "_sourcedir $(pwd)"
  - rpm -qpl rpmbuild/RPMS/noarch/*
 except: # Do not execute the job for the following.
  - tags # This job will not be executed when a new tag is created.
 artifacts: # The files to keep once the job has finished.
  paths: # Paths of the files to keep.
   - rpmbuild/RPMS/noarch/*
  expire_in: 1 week # The time period after which the artifacts expire.

Note: By default, a job fetches the artifacts of all jobs from earlier stages; the dependencies keyword restricts this to the artifacts of the listed jobs.

3.4. Cross-referencing jobs

The YAML format allows assigning an anchor to each job which can later be referenced in successive jobs, thus avoiding code duplication.

.rpmbuild_deps: &rpmbuild_deps
 before_script:
 - yum install -y rpm-build rpmdevtools redhat-rpm-config

The rpmbuild_deps job can now be expanded later in the file using <<: *rpmbuild_deps as shown in the previous section.

3.5. Hidden jobs

To hide a job so that it's not processed by GitLab CI, the job's name must be preceded with a dot (.). Such a job is skipped, which makes hidden jobs useful as templates, as with .rpmbuild_deps in the previous section.

Note: The documentation on GitLab provides a good overview on the benefits and workflow of continuous integration and deployment.

Conclusion

This blog post explained the build process of an RPM package using the rpmbuild utility and provided a sample spec file which can be used for building the package. Koji was then introduced, which serves as the task scheduler for building RPMs and keeps a history of all the running and completed jobs. Finally, it was shown how to automate the entire process using GitLab continuous integration.

The next step, once the GitLab-CI pipeline is finished, can be to deploy the built RPM on a remote machine (or the cloud) using a configuration management tool, like Puppet.