Advanced Unix / Linux

Unix / Linux - Regular Expressions with SED

In this chapter, we will discuss regular expressions with SED in Unix in detail. A regular expression is a string that can be used to describe several sequences of characters. Regular expressions are used by several different Unix commands, including ed, sed, awk, grep, and to a more limited extent, vi.
Here SED stands for stream editor. This stream-oriented editor was created exclusively for executing scripts. Thus, all the input you feed into it passes through and goes to STDOUT and it does not change the input file.

Invoking sed

Before we start, let us ensure we have a local copy of /etc/passwd text file to work with sed.
As mentioned previously, sed can be invoked by sending data through a pipe to it as follows −

$ cat /etc/passwd | sed
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...

  -n, --quiet, --silent
                 suppress automatic printing of pattern space
  -e script, --expression=script
...............................

The cat command dumps the contents of /etc/passwd to sed through the pipe into sed's pattern space. The pattern space is the internal work buffer that sed uses for its operations.

The sed General Syntax

Following is the general syntax for sed −

/pattern/action

Here, pattern is a regular expression, and action is one of the commands given in the following table. If pattern is omitted, action is performed for every line.
The slash characters (/) that surround the pattern are required because they are used as delimiters.

Sr.No.	Action & Description
1	p Prints the line
2	d Deletes the line
3	s/pattern1/pattern2/ Substitutes the first occurrence of pattern1 with pattern2
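The three actions can be tried on a small sample. This is a minimal sketch: the three input lines are made up for illustration and only imitate the layout of /etc/passwd.

```shell
# Sample input (hypothetical): three passwd-style lines.
sample='root:x:0:0:root user:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync'

# p: print only the lines matching /daemon/ (-n suppresses the default output).
printed=$(printf '%s\n' "$sample" | sed -n '/daemon/p')

# d: delete the lines matching /daemon/ and print the rest.
deleted=$(printf '%s\n' "$sample" | sed '/daemon/d')

# s: on lines matching /^sync/, replace the first "bin" with "sbin".
substituted=$(printf '%s\n' "$sample" | sed '/^sync/s/bin/sbin/')

printf '%s\n' "$printed" '---' "$deleted" '---' "$substituted"
```

Each command reads the same input; only the action after the pattern changes.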

Deleting All Lines with sed

We will now understand how to delete all lines with sed. Invoke sed again, this time with the delete editing command, denoted by the single letter d −

$ cat /etc/passwd | sed 'd'
$

Instead of invoking sed by sending a file to it through a pipe, the sed can be instructed to read the data from a file, as in the following example.
The following command does exactly the same as in the previous example, without the cat command −

$ sed -e 'd' /etc/passwd
$

The sed Addresses

The sed also supports addresses. Addresses are either particular locations in a file or a range where a particular editing command should be applied. When the sed encounters no addresses, it performs its operations on every line in the file.
The following command adds a basic address to the sed command you've been using −

$ cat /etc/passwd | sed '1d' |more
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
$

Notice that the number 1 is added before the delete edit command. This instructs the sed to perform the editing command on the first line of the file. In this example, the sed will delete the first line of /etc/passwd and print the rest of the file.

The sed Address Ranges

We will now understand how to work with the sed address ranges. So what if you want to remove more than one line from a file? You can specify an address range with sed as follows −

$ cat /etc/passwd | sed '1, 5d' |more
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
$

The above command will be applied on all the lines starting from 1 through 5. This deletes the first five lines.
Try out the following address ranges −

Sr.No.	Range & Description
1	'4,10d' Lines starting from the 4th till the 10th are deleted
2	'10,4d' Only the 10th line is deleted, because sed does not work in the reverse direction
3	'4,+5d' This matches line 4 in the file, deletes that line, continues to delete the next five lines, and then ceases its deletion and prints the rest
4	'2,5!d' This deletes everything except the 2nd through the 5th line
5	'1~3d' This deletes the first line, steps over the next two lines, and then deletes the fourth line. Sed continues to apply this pattern until the end of the file.
6	'2~2d' This tells sed to delete the second line, step over the next line, delete the next line, and repeat until the end of the file is reached
7	'4,10p' Lines starting from the 4th till the 10th are printed
8	'4,d' This generates a syntax error
9	',10d' This also generates a syntax error
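The ranges in the table can be checked against ten numbered lines, where each line's text is its own line number. This is a sketch that assumes GNU sed, because the addr,+N and first~step forms are GNU extensions; the helper name keep is made up for this example.

```shell
# Ten numbered lines stand in for a file.
lines=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# keep (hypothetical helper): run one sed script and join the surviving lines.
keep() { printf '%s\n' "$lines" | sed "$1" | paste -s -d ' ' -; }

range=$(keep '4,10d')     # lines 4 through 10 are deleted
reverse=$(keep '10,4d')   # only line 10 is deleted
plus=$(keep '4,+5d')      # line 4 and the next five lines are deleted
negate=$(keep '2,5!d')    # everything except lines 2 through 5 is deleted
step=$(keep '1~3d')       # lines 1, 4, 7 and 10 are deleted

echo "4,10d  leaves: $range"
echo "10,4d  leaves: $reverse"
echo "4,+5d  leaves: $plus"
echo "2,5!d  leaves: $negate"
echo "1~3d   leaves: $step"
```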

Note − While using the p action, you should use the -n option to avoid repetition of line printing. Check the difference in between the following two commands −

$ cat /etc/passwd | sed -n '1,3p'
Check the above command without -n as follows −
$ cat /etc/passwd | sed '1,3p'
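Counting the output lines makes the difference visible. This small sketch uses five made-up input lines instead of /etc/passwd:

```shell
# With -n, only the three lines selected by p are printed.
with_n=$(printf '%s\n' a b c d e | sed -n '1,3p' | wc -l | tr -d ' ')

# Without -n, every line is printed once by default and lines 1-3 a second time.
without_n=$(printf '%s\n' a b c d e | sed '1,3p' | wc -l | tr -d ' ')

echo "with -n: $with_n lines, without -n: $without_n lines"
```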

The Substitution Command

The substitution command, denoted by s, will substitute any string that you specify with any other string that you specify.
To substitute one string with another, the sed needs to have the information on where the first string ends and the substitution string begins. For this, we proceed with bookending the two strings with the forward slash (/) character.
The following command substitutes the first occurrence on a line of the string root with the string amrood.

$ cat /etc/passwd | sed 's/root/amrood/'
amrood:x:0:0:root user:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
..........................

It is very important to note that sed substitutes only the first occurrence on a line. If the string root occurs more than once on a line only the first match will be replaced.
For the sed to perform a global substitution, add the letter g to the end of the command as follows −

$ cat /etc/passwd | sed 's/root/amrood/g'
amrood:x:0:0:amrood user:/amrood:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
...........................

Substitution Flags

There are a number of other useful flags that can be passed in addition to the g flag, and you can specify more than one at a time.

Sr.No.	Flag & Description
1	g Replaces all matches, not just the first match
2	NUMBER Replaces only the NUMBERth match
3	p If substitution was made, then prints the pattern space
4	w FILENAME If substitution was made, then writes result to FILENAME
5	I or i Matches in a case-insensitive manner
6	M or m In addition to the normal behavior of the special regular expression characters ^ and $, this flag causes ^ to match the empty string after a newline and $ to match the empty string before a newline
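A few of these flags are sketched below on a sample line. The I flag and the combination of NUMBER with g are assumed to be handled as in GNU sed; other implementations may behave differently.

```shell
line='root:x:0:0:root user:/root:/bin/sh'

# NUMBER: replace only the second match of "root".
second=$(printf '%s\n' "$line" | sed 's/root/amrood/2')

# I: match case-insensitively (GNU sed).
nocase=$(printf '%s\n' 'ROOT login' | sed 's/root/amrood/I')

# p with -n: print only the lines on which a substitution was made.
changed=$(printf '%s\n' "$line" 'daemon:x:1:1' | sed -n 's/root/amrood/p')

# NUMBER plus g: replace the second match and every match after it (GNU sed).
from_second=$(printf '%s\n' "$line" | sed 's/root/amrood/2g')

printf '%s\n' "$second" "$nocase" "$changed" "$from_second"
```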

Using an Alternative String Separator

Suppose you have to do a substitution on a string that includes the forward slash character. In this case, you can specify a different separator by providing the designated character after the s.

$ cat /etc/passwd | sed 's:/root:/amrood:g'
root:x:0:0:root user:/amrood:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh

In the above example, we have used : as the delimiter instead of slash / because we were trying to search /root instead of the simple root.
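To see what the alternative separator saves, compare the same substitution written both ways. This is a sketch on a sample line; the choice of | as the separator is arbitrary, and any character that does not occur in the pattern or the replacement would do.

```shell
line='root:x:0:0:root user:/root:/bin/sh'

# With / as the separator, every slash inside the strings must be escaped.
escaped=$(printf '%s\n' "$line" | sed 's/\/bin\/sh/\/bin\/bash/')

# With | as the separator, the same strings can be written as they are.
piped=$(printf '%s\n' "$line" | sed 's|/bin/sh|/bin/bash|')

printf '%s\n' "$escaped" "$piped"
```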

Replacing with Empty Space

Use an empty substitution string to delete the root string from the /etc/passwd file entirely −

$ cat /etc/passwd | sed 's/root//g'
:x:0:0: user:/:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh

Address Substitution

If you want to substitute the string sh with the string quiet only on line 10, you can specify it as follows −

$ cat /etc/passwd | sed '10s/sh/quiet/g'
root:x:0:0:root user:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/quiet

Similarly, to do an address range substitution, you could do something like the following −

$ cat /etc/passwd | sed '1,5s/sh/quiet/g'
root:x:0:0:root user:/root:/bin/quiet
daemon:x:1:1:daemon:/usr/sbin:/bin/quiet
bin:x:2:2:bin:/bin:/bin/quiet
sys:x:3:3:sys:/dev:/bin/quiet
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh

As you can see from the output, the first five lines had the string sh changed to quiet, but the rest of the lines were left untouched.

The Matching Command

You would use the p command along with the -n option to print all the matching lines as follows −

$ cat testing | sed -n '/root/p'
root:x:0:0:root user:/root:/bin/sh

The testing file used in this and the following examples contains these lines −

root:x:0:0:root user:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh

Using Regular Expression

While matching patterns, you can use the regular expression which provides more flexibility.
Check the following example which matches all the lines starting with daemon and then deletes them −

$ cat testing | sed '/^daemon/d'
root:x:0:0:root user:/root:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh

Following is the example which deletes all the lines ending with sh −

$ cat testing | sed '/sh$/d'
sync:x:4:65534:sync:/bin:/bin/sync

The following table lists five special characters that are very useful in regular expressions.

Sr.No.	Character & Description
1	^ Matches the beginning of lines
2	$ Matches the end of lines
3	. Matches any single character
4	* Matches zero or more occurrences of the previous character
5	[chars] Matches any one of the characters given in chars, where chars is a sequence of characters. You can use the - character to indicate a range of characters.
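The special characters can be tried on a handful of sample lines by counting how many lines each pattern selects. This is a sketch; the input words and the helper names sample and count are made up for illustration.

```shell
# Six sample lines, one of them empty.
sample() { printf '%s\n' cat cart ct '' coat 'the cat'; }

# count (hypothetical helper): number of lines selected by a sed script.
count() { sample | sed -n "$1" | wc -l | tr -d ' '; }

caret=$(count '/^c/p')      # lines beginning with c: cat, cart, ct, coat
dot=$(count '/c.t/p')       # c, any one character, t: cat, the cat
star=$(count '/ca*t/p')     # c, zero or more a, t: cat, ct, the cat
set_=$(count '/c[ao]/p')    # c followed by a or o: cat, cart, coat, the cat
blank=$(count '/^$/p')      # empty lines only

echo "^c: $caret  c.t: $dot  ca*t: $star  c[ao]: $set_  ^\$: $blank"
```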

Matching Characters

Look at a few more expressions that demonstrate the use of metacharacters −

Sr.No.	Expression & Description
1	/a.c/ Matches lines that contain strings such as a+c, a-c, abc, match, and a3c
2	/a*c/ Matches the same strings along with strings such as ace, yacc, and arctic
3	/[tT]he/ Matches the string The and the
4	/^$/ Matches blank lines
5	/^.*$/ Matches an entire line whatever it is
6	/  */ Matches one or more spaces (the pattern is two spaces followed by *)

Following table shows some frequently used sets of characters −

Sr.No.	Set & Description
1	[a-z] Matches a single lowercase letter
2	[A-Z] Matches a single uppercase letter
3	[a-zA-Z] Matches a single letter
4	[0-9] Matches a single digit
5	[a-zA-Z0-9] Matches a single letter or number

Character Class Keywords

Some special keywords are commonly available to regexps, especially GNU utilities that employ regexps. These are very useful for sed regular expressions as they simplify things and enhance readability.
For example, the characters a through z and the characters A through Z constitute one such class of characters that has the keyword [[:alpha:]].
Using the alphabet character class keyword, this command prints only those lines in the /etc/syslog.conf file that start with a letter of the alphabet −

$ cat /etc/syslog.conf | sed -n '/^[[:alpha:]]/p'
authpriv.*                         /var/log/secure
mail.*                             -/var/log/maillog
cron.*                             /var/log/cron
uucp,news.crit                     /var/log/spooler
local7.*                           /var/log/boot.log

The following table is a complete list of the available character class keywords in GNU sed.

Sr.No.	Character Class & Description
1	[[:alnum:]] Alphanumeric [a-z A-Z 0-9]
2	[[:alpha:]] Alphabetic [a-z A-Z]
3	[[:blank:]] Blank characters (spaces or tabs)
4	[[:cntrl:]] Control characters
5	[[:digit:]] Numbers [0-9]
6	[[:graph:]] Any visible characters (excludes whitespace)
7	[[:lower:]] Lowercase letters [a-z]
8	[[:print:]] Printable characters (non-control characters)
9	[[:punct:]] Punctuation characters
10	[[:space:]] Whitespace
11	[[:upper:]] Uppercase letters [A-Z]
12	[[:xdigit:]] Hex digits [0-9 a-f A-F]
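A few of the classes are sketched below. The sample lines and the helper name sample are invented for this example; the results assume the default C or an English UTF-8 locale.

```shell
sample() { printf '%s\n' 'abc123' '  indented' '#comment' '42' 'Mixed Case'; }

# Lines that start with a digit.
digits=$(sample | sed -n '/^[[:digit:]]/p')

# Lines that start with a punctuation character.
punct=$(sample | sed -n '/^[[:punct:]]/p')

# Strip leading spaces and tabs.
trimmed=$(printf '%s\n' '  indented' | sed 's/^[[:blank:]]*//')

# Replace every uppercase letter with X.
masked=$(printf '%s\n' 'Mixed Case' | sed 's/[[:upper:]]/X/g')

printf '%s\n' "$digits" "$punct" "$trimmed" "$masked"
```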

Ampersand Referencing

The sed metacharacter & represents the contents of the pattern that was matched. For instance, say you have a file called phone.txt full of phone numbers, such as the following −

5555551212
5555551213
5555551214
6665551215
6665551216
7775551217

You want to make the area code (the first three digits) surrounded by parentheses for easier reading. To do this, you can use the ampersand replacement character −

$ sed -e 's/^[[:digit:]][[:digit:]][[:digit:]]/(&)/g' phone.txt
(555)5551212
(555)5551213
(555)5551214
(666)5551215
(666)5551216
(777)5551217

Here, the pattern part matches the first three digits, and & in the replacement stands for those matched digits, so they are written back surrounded by parentheses.
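Two further properties of & are worth a quick sketch on made-up input: it can be used more than once in the replacement, and a literal ampersand has to be written as \&.

```shell
# & stands for the whole matched text, here the run of digits.
wrapped=$(printf '%s\n' 'error 404 found' | sed 's/[[:digit:]][[:digit:]]*/<&>/')

# & may appear several times in the replacement.
doubled=$(printf '%s\n' 'ab' | sed 's/ab/&&/')

# \& inserts a literal ampersand instead of the matched text.
literal=$(printf '%s\n' 'salt and pepper' | sed 's/and/\&/')

printf '%s\n' "$wrapped" "$doubled" "$literal"
```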

Using Multiple sed Commands

You can use multiple sed commands in a single sed command as follows −

$ sed -e 'command1' -e 'command2' ... -e 'commandN' files

Here command1 through commandN are sed commands of the type discussed previously. These commands are applied to each of the lines in the list of files given by files.
Using the same mechanism, we can write the above phone number example as follows −

$ sed -e 's/^[[:digit:]]\{3\}/(&)/g' \
   -e 's/)[[:digit:]]\{3\}/&-/g' phone.txt
(555)555-1212
(555)555-1213
(555)555-1214
(666)555-1215
(666)555-1216
(777)555-1217

Note − In the above example, instead of repeating the character class keyword [[:digit:]] three times, we followed it with \{3\}, which means the preceding regular expression is matched three times. The \ at the end of the first line continues the command on the next line; it must be the last character on that line, or the whole command can be typed on a single line without it.
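The commands given with -e run in the order written, and each one works on the result of the one before it. The sketch below applies the same two commands to a single sample number in both orders:

```shell
number='5555551212'

# First add the parentheses, then insert the dash after ")555".
ordered=$(printf '%s\n' "$number" |
  sed -e 's/^[[:digit:]]\{3\}/(&)/' -e 's/)[[:digit:]]\{3\}/&-/')

# Reversed: the dash command runs first and finds no ")" yet, so it changes nothing.
reversed=$(printf '%s\n' "$number" |
  sed -e 's/)[[:digit:]]\{3\}/&-/' -e 's/^[[:digit:]]\{3\}/(&)/')

printf '%s\n' "$ordered" "$reversed"
```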

Back References

The ampersand metacharacter is useful, but even more useful is the ability to define specific regions in regular expressions. These special regions can be used as reference in your replacement strings. By defining specific parts of a regular expression, you can then refer back to those parts with a special reference character.
To do back references, you have to first define a region and then refer back to that region. To define a region, you insert backslashed parentheses around each region of interest. The first region that you surround with backslashed parentheses is then referenced by \1, the second region by \2, and so on.
Assuming phone.txt has the following text −

(555)555-1212
(555)555-1213
(555)555-1214
(666)555-1215
(666)555-1216
(777)555-1217

Try the following command −

$ cat phone.txt | sed 's/\(.*)\)\(.*-\)\(.*$\)/Area code: \1 Second: \2 Third: \3/'
Area code: (555) Second: 555- Third: 1212
Area code: (555) Second: 555- Third: 1213
Area code: (555) Second: 555- Third: 1214
Area code: (666) Second: 555- Third: 1215
Area code: (666) Second: 555- Third: 1216
Area code: (777) Second: 555- Third: 1217

Note − In the above example, each regular expression inside the backslashed parentheses is back referenced by \1, \2 and so on.
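Back references are also a convenient way to reorder parts of a line. The following sketch uses made-up input and | as the separator so that the slashes in the replacement need no escaping:

```shell
# Three regions: year, month, day. The replacement writes them back in reverse order.
date_swapped=$(printf '%s\n' '2024-01-31' |
  sed 's|\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)|\3/\2/\1|')

# Two regions around the first colon, swapped in the replacement.
swapped=$(printf '%s\n' 'key:value' | sed 's/\([^:]*\):\(.*\)/\2:\1/')

printf '%s\n' "$date_swapped" "$swapped"
```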

Unix / Linux - File System Basics

A file system is a logical collection of files on a partition or disk. A partition is a container for information and can span an entire hard drive if desired. Your hard drive can have various partitions, each of which usually contains only one file system, such as one file system housing the / file system or another containing the /home file system.
One file system per partition allows for the logical maintenance and management of differing file systems.
Everything in Unix is considered to be a file, including physical devices such as DVD-ROMs, USB devices, and floppy drives.

Directory Structure

Unix uses a hierarchical file system structure, much like an upside-down tree, with root (/) at the base of the file system and all other directories spreading from there.
A Unix filesystem is a collection of files and directories that has the following properties −

It has a root directory (/) that contains other files and directories.
Each file or directory is uniquely identified by its name, the directory in which it resides, and a unique identifier, typically called an inode.
By convention, the root directory has an inode number of 2 and the lost+found directory has an inode number of 3. Inode numbers 0 and 1 are not used. File inode numbers can be seen by specifying the -i option to ls command.
It is self-contained. There are no dependencies between one filesystem and another.
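Inode numbers can be inspected with ls -i. The sketch below works in a temporary directory; it creates a second name (a hard link) for one file to show that both names share a single inode. The inode number of / is printed rather than assumed, because it depends on the type of file system in use.

```shell
workdir=$(mktemp -d)
touch "$workdir/original"
ln "$workdir/original" "$workdir/hardlink"   # a second name for the same file

# ls -i prints the inode number in front of each name.
inode_a=$(ls -i "$workdir/original" | awk '{ print $1 }')
inode_b=$(ls -i "$workdir/hardlink" | awk '{ print $1 }')

# -d lists the directory itself instead of its contents.
root_inode=$(ls -id / | awk '{ print $1 }')

echo "original: $inode_a  hardlink: $inode_b  root directory: $root_inode"
rm -r "$workdir"
```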

The directories have specific purposes and generally hold the same types of information for easily locating files. Following are the directories that exist on the major versions of Unix −

Sr.No.	Directory & Description
1	/ This is the root directory which should contain only the directories needed at the top level of the file structure
2	/bin This is where the executable files are located. These files are available to all users
3	/dev Contains device files, through which device drivers are accessed
4	/etc Supervisor directory commands, configuration files, disk configuration files, valid user lists, groups, ethernet, hosts, where to send critical messages
5	/lib Contains shared library files and sometimes other kernel-related files
6	/boot Contains files for booting the system
7	/home Contains the home directory for users and other accounts
8	/mnt Used to mount other temporary file systems, such as cdrom and floppy for the CD-ROM drive and floppy diskette drive, respectively
9	/proc Contains all processes marked as a file by process number or other information that is dynamic to the system
10	/tmp Holds temporary files used between system boots
11	/usr Used for miscellaneous purposes, and can be used by many users. Includes administrative commands, shared files, library files, and others
12	/var Typically contains variable-length files such as log and print files and any other type of file that may contain a variable amount of data
13	/sbin Contains binary (executable) files, usually for system administration. For example, the fdisk and ifconfig utilities
14	/kernel Contains kernel files

Navigating the File System

Now that you understand the basics of the file system, you can begin navigating to the files you need. The following commands are used to navigate the system −

Sr.No.	Command & Description
1	cat filename Displays the contents of a file
2	cd dirname Moves you to the identified directory
3	cp file1 file2 Copies one file/directory to the specified location
4	file filename Identifies the file type (binary, text, etc)
5	find dir -name filename Finds a file/directory
6	head filename Shows the beginning of a file
7	less filename Browses through a file from the end or the beginning
8	ls dirname Shows the contents of the directory specified
9	mkdir dirname Creates the specified directory
10	more filename Browses through a file from the beginning to the end
11	mv file1 file2 Moves the location of, or renames a file/directory
12	pwd Shows the current directory the user is in
13	rm filename Removes a file
14	rmdir dirname Removes a directory
15	tail filename Shows the end of a file
16	touch filename Creates a blank file or modifies an existing file or its attributes
17	whereis filename Shows the location of a file
18	which filename Shows the location of a file if it is in your PATH

You can use Manpage Help to check complete syntax for each command mentioned here.
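A short session that strings several of these commands together might look as follows. It is a sketch that runs in a throwaway directory created with mktemp; the file and directory names are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir project                 # create a directory
cd project || exit 1          # move into it
printf '%s\n' one two three > notes.txt
cp notes.txt copy.txt         # copy the file
mv copy.txt renamed.txt       # rename the copy

first=$(head -n 1 notes.txt)  # beginning of the file
last=$(tail -n 1 renamed.txt) # end of the file
listing=$(ls)                 # contents of the current directory
here=$(basename "$(pwd)")     # last component of the current directory

echo "in $here: first=$first last=$last"
echo "$listing"

cd / && rm -r "$workdir"      # leave the directory before removing it
```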

The df Command

The first way to manage your partition space is with the df (disk free) command. The command df -k displays the disk space usage in kilobytes, as shown below −

$ df -k
Filesystem      1K-blocks      Used   Available Use% Mounted on
/dev/vzfs        10485760   7836644     2649116  75% /
/devices                0         0           0   0% /devices
$

Some of the file systems, such as /devices, show 0 in the kbytes, used, and avail columns as well as 0% for capacity. These are special (or virtual) file systems, and although they reside on the disk under /, by themselves they do not consume disk space.
The df -k output is generally the same on all Unix systems. Here's what it usually includes −

Sr.No.	Column & Description
1	Filesystem The physical file system name
2	kbytes Total kilobytes of space available on the storage medium
3	used Total kilobytes of space used (by files)
4	avail Total kilobytes available for use
5	capacity Percentage of total space used by files
6	Mounted on What the file system is mounted on

You can use the -h (human readable) option to display the output in a format that shows the size in easier-to-understand notation.
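When the figures are needed in a script, single columns can be picked out of the df output. The sketch below assumes the -P option, which is not used elsewhere in this chapter; it requests the POSIX output format with one line per file system, which makes the columns predictable.

```shell
# Second line of the output is the entry for /; column 5 is the percentage used.
usage=$(df -Pk / | awk 'NR == 2 { print $5 }')

# Column 4 is the space still available, in kilobytes.
avail_kb=$(df -Pk / | awk 'NR == 2 { print $4 }')

echo "/ is $usage full, $avail_kb KB available"
```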

The du Command

The du (disk usage) command shows the disk space used by the directories you specify.
This command is helpful if you want to determine how much space a particular directory is taking. The following command displays the number of blocks consumed by each directory. A single block may be either 512 bytes or 1 kilobyte, depending on your system.

$ du /etc
10     /etc/cron.d
126    /etc/default
6      /etc/dfs
...
$

The -h option makes the output easier to comprehend −

$ du -h /etc
5k    /etc/cron.d
63k   /etc/default
3k    /etc/dfs
...
$
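To get one total for a directory instead of a line per subdirectory, du can be asked for a summary. The sketch below assumes the -s (summarize) and -k (kilobyte units) options and measures a temporary directory holding a file of about 100 KB; the reported figure depends on the file system, so it is printed rather than predicted.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/logs"

# Write a file of roughly 100 KB so that there is something to measure.
dd if=/dev/zero of="$workdir/logs/big.log" bs=1024 count=100 2>/dev/null

# -s prints a single total for the directory, -k reports it in kilobytes.
total_kb=$(du -sk "$workdir" | awk '{ print $1 }')

echo "$workdir uses $total_kb KB"
rm -r "$workdir"
```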

Mounting the File System

A file system must be mounted in order to be usable by the system. To see what is currently mounted (available for use) on your system, use the following command −

$ mount
/dev/vzfs on / type reiserfs (rw,usrquota,grpquota)
proc on /proc type proc (rw,nodiratime)
devpts on /dev/pts type devpts (rw)
$

The /mnt directory, by the Unix convention, is where temporary mounts (such as CDROM drives, remote network drives, and floppy drives) are located. If you need to mount a file system, you can use the mount command with the following syntax −

mount -t file_system_type device_to_mount directory_to_mount_to

For example, if you want to mount a CD-ROM to the directory /mnt/cdrom, you can type −

$ mount -t iso9660 /dev/cdrom /mnt/cdrom

This assumes that your CD-ROM device is called /dev/cdrom and that you want to mount it to /mnt/cdrom. Refer to the mount man page for more specific information or type mount -h at the command line for help information.
After mounting, you can use the cd command to navigate the newly available file system through the mount point you just made.

Unmounting the File System

To unmount (remove) the file system from your system, use the umount command by identifying the mount point or device.
For example, to unmount cdrom, use the following command −

$ umount /dev/cdrom

The mount command enables you to access your file systems, but on most modern Unix systems, the automount function makes this process invisible to the user and requires no intervention.

User and Group Quotas

The user and group quotas provide the mechanisms by which the amount of space used by a single user or all users within a specific group can be limited to a value defined by the administrator.
Quotas operate around two limits that allow the user to take some action if the amount of space or number of disk blocks starts to exceed the administrator-defined limits −

Soft Limit − If the user exceeds the limit defined, there is a grace period that allows the user to free up some space.
Hard Limit − When the hard limit is reached, regardless of the grace period, no further files or blocks can be allocated.

There are a number of commands to administer quotas −

Sr.No.	Command & Description
1	quota Displays disk usage and limits for a user or group
2	edquota This is a quota editor. User or group quotas can be edited using this command
3	quotacheck Scans a filesystem for disk usage, and creates, checks, and repairs quota files
4	setquota This is a command line quota editor
5	quotaon This announces to the system that disk quotas should be enabled on one or more filesystems
6	quotaoff This announces to the system that disk quotas should be disabled for one or more filesystems
7	repquota This prints a summary of the disk usage and quotas for the specified file systems

Unix / Linux - User Administration

In this chapter, we will discuss in detail about user administration in Unix.
There are three types of accounts on a Unix system −

Root account

This is also called superuser and would have complete and unfettered control of the system. A superuser can run any commands without any restriction. This user should be assumed as a system administrator.

System accounts

System accounts are those needed for the operation of system-specific components, for example the mail and sshd accounts. These accounts are usually needed for some specific function on your system, and any modifications to them could adversely affect the system.

User accounts

User accounts provide interactive access to the system for users and groups of users. General users are typically assigned to these accounts and usually have limited access to critical system files and directories.
Unix supports the concept of a group account, which logically groups a number of accounts. Every account belongs to at least one group. Unix groups play an important role in handling file permissions and process management.

Managing Users and Groups

There are four main user administration files −

/etc/passwd − Keeps the user account and password information. This file holds the majority of information about accounts on the Unix system.
/etc/shadow − Holds the encrypted password of the corresponding account. Not all the systems support this file.
/etc/group − This file contains the group information for each account.
/etc/gshadow − This file contains secure group account information.

Check all the above files using the cat command.
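Each line of /etc/passwd holds seven colon-separated fields: login name, password placeholder, user ID, group ID, comment, home directory, and login shell. The sketch below splits one such line with cut; the entry itself is hypothetical and does not come from a real system.

```shell
# Hypothetical /etc/passwd entry, used only for illustration.
entry='mcmohd:x:1001:1001:Sample User:/home/mcmohd:/bin/ksh'

login=$(printf '%s\n' "$entry" | cut -d: -f1)   # field 1: login name
uid=$(printf '%s\n' "$entry" | cut -d: -f3)     # field 3: numeric user ID
home=$(printf '%s\n' "$entry" | cut -d: -f6)    # field 6: home directory
shell=$(printf '%s\n' "$entry" | cut -d: -f7)   # field 7: login shell

echo "$login (uid $uid) lives in $home and logs in with $shell"
```

The same cut commands can be pointed at the real file, for example cut -d: -f1 /etc/passwd to list all login names.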
The following table lists out commands that are available on majority of Unix systems to create and manage accounts and groups −

Sr.No.	Command & Description
1	useradd Adds accounts to the system
2	usermod Modifies account attributes
3	userdel Deletes accounts from the system
4	groupadd Adds groups to the system
5	groupmod Modifies group attributes
6	groupdel Removes groups from the system

You can use Manpage Help to check complete syntax for each command mentioned here.

Create a Group

We will now understand how to create a group. Create the groups you need before creating any accounts; otherwise, you can make use of the groups that already exist on your system. All the groups are listed in the /etc/group file.
All the default groups are system account specific groups and it is not recommended to use them for ordinary accounts. So, following is the syntax to create a new group account −

 groupadd [-g gid [-o]] [-r] [-f] groupname

The following table lists out the parameters −

Sr.No.	Option & Description
1	-g GID The numerical value of the group's ID
2	-o This option permits adding a group with a non-unique GID
3	-r This flag instructs groupadd to add a system group
4	-f This option causes groupadd to exit with success status if the specified group already exists. With -g, if the specified GID already exists, another (unique) GID is chosen
5	groupname Actual group name to be created

If you do not specify any parameter, then the system makes use of the default values.
Following example creates a developers group with default values, which is very much acceptable for most of the administrators.

$ groupadd developers

Modify a Group

To modify a group, use the groupmod syntax −

$ groupmod -n new_modified_group_name old_group_name

To change the developer_2 group name to developer, type −

$ groupmod -n developer developer_2

Here is how you will change the GID of the developer group to 545 −

$ groupmod -g 545 developer

Delete a Group

We will now understand how to delete a group. To delete an existing group, all you need is the groupdel command and the group name. To delete the developer group, the command is −

$ groupdel developer

This removes only the group, not the files associated with that group. The files are still accessible by their owners.

Create an Account

Let us see how to create a new account on your Unix system. Following is the syntax to create a user's account −

useradd -d homedir -g groupname -m -s shell -u userid accountname

The following table lists out the parameters −

Sr.No.	Option & Description
1	-d homedir Specifies home directory for the account
2	-g groupname Specifies a group account for this account
3	-m Creates the home directory if it doesn't exist
4	-s shell Specifies the default shell for this account
5	-u userid You can specify a user id for this account
6	accountname Actual account name to be created

If you do not specify any parameter, then the system makes use of the default values. The useradd command modifies the /etc/passwd, /etc/shadow, and /etc/group files and creates a home directory.
Following is the example that creates an account mcmohd, setting its home directory to /home/mcmohd and the group as developers. This user would have Korn Shell assigned to it.

$ useradd -d /home/mcmohd -g developers -s /bin/ksh mcmohd

Before issuing the above command, make sure you already have the developers group created using the groupadd command.
Once an account is created you can set its password using the passwd command as follows −

$ passwd mcmohd
Changing password for user mcmohd.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.

When you type passwd accountname, it gives you an option to change the password, provided you are a superuser. Otherwise, you can change just your password using the same command but without specifying your account name.

Modify an Account

The usermod command enables you to make changes to an existing account from the command line. It uses the same arguments as the useradd command, plus the -l argument, which allows you to change the account name.
For example, to change the account name mcmohd to mcmohd20 and to change home directory accordingly, you will need to issue the following command −

$ usermod -d /home/mcmohd20 -m -l mcmohd20 mcmohd

Delete an Account

The userdel command can be used to delete an existing user. This is a very dangerous command if not used with caution.
The most commonly used option for this command is -r, which removes the account's home directory and mail file.
For example, to remove account mcmohd20, issue the following command −

$ userdel -r mcmohd20

If you want to keep the home directory for backup purposes, omit the -r option. You can remove the home directory as needed at a later time.

Unix / Linux - System Performance

In this chapter, we will discuss in detail about the system performance in Unix.
We will introduce you to a few free tools that are available to monitor and manage performance on Unix systems. These tools also provide guidelines on how to diagnose and fix performance problems in the Unix environment.
Unix has following major resource types that need to be monitored and tuned −

CPU
Memory
Disk space
Communications lines
I/O Time
Network Time
Application programs

Performance Components

The following table lists the five major components that take up system time −

Sr.No.	Component & Description
1	User State CPU The actual amount of time the CPU spends running the user's program in the user state. It includes the time spent executing library calls, but does not include the time spent in the kernel on its behalf
2	System State CPU This is the amount of time the CPU spends in the system state on behalf of this program. All I/O routines require kernel services. The programmer can affect this value by blocking I/O transfers
3	I/O Time and Network Time This is the amount of time spent moving data and servicing I/O requests
4	Virtual Memory Performance This includes context switching and swapping
5	Application Program Time spent running other programs, that is, when the system is not servicing this application because another application currently has the CPU

Performance Tools

Unix provides the following important tools to measure and fine-tune system performance −

Sr.No.	Command & Description
1	nice/renice Runs a program with modified scheduling priority
2	netstat Prints network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
3	time Times a simple command and reports its resource usage
4	uptime Reports how long the system has been running and the system load averages
5	ps Reports a snapshot of the current processes
6	vmstat Reports virtual memory statistics
7	gprof Displays call graph profile data
8	prof Facilitates Process Profiling
9	top Displays system tasks
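
As a small illustration of the first tool in the table, the following sketch runs a command at a lower scheduling priority. It assumes the GNU coreutils behaviour in which nice with no arguments prints the current niceness; the increment of 5 is an arbitrary value chosen for the example.

```shell
# Print the niceness of the current shell (GNU nice with no arguments).
base=$(nice)

# Run a command with the niceness raised by 5, which lowers its priority.
# The command here is nice itself, so it reports the value it runs with.
lowered=$(nice -n 5 nice)

echo "niceness of this shell: $base"
echo "niceness under nice -n 5: $lowered"

# The shell builtin times prints the user and system CPU time consumed
# so far by the shell and by its child processes.
times
```

The two figures printed by times correspond to the User State CPU and System State CPU components described earlier.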

Unix / Linux - System Logging

In this chapter, we will discuss system logging in Unix in detail. Unix systems have a very flexible and powerful logging system, which enables you to record almost anything you can imagine and then manipulate the logs to retrieve the information you require.
Many versions of Unix provide a general-purpose logging facility called syslog. Individual programs that need to have information logged send that information to syslog.
Unix syslog is a host-configurable, uniform system logging facility. The system uses a centralized system logging process that runs the program /etc/syslogd or /etc/syslog.
The operation of the system logger is quite straightforward. Programs send their log entries to syslogd, which consults the configuration file /etc/syslog.conf and, when a match is found, writes the log message to the desired log file.
There are four basic syslog terms that you should understand −

Sr.No.	Term & Description
1	Facility The identifier used to describe the application or process that submitted the log message. For example, mail, kernel, and ftp.
2	Priority An indicator of the importance of the message. Levels are defined within syslog as guidelines, from debugging information to critical events.
3	Selector A combination of one or more facilities and levels. When an incoming event matches a selector, an action is performed.
4	Action What happens to an incoming message that matches a selector — Actions can write the message to a log file, echo the message to a console or other device, write the message to a logged in user, or send the message along to another syslog server.

Syslog Facilities

Here are the available facilities for the selector. Not all facilities are present on all versions of Unix.

Sr.No.	Facility & Description
1	auth Activity related to requesting name and password (getty, su, login)
2	authpriv Same as auth but logged to a file that can only be read by selected users
3	console Used to capture messages that are generally directed to the system console
4	cron Messages from the cron system scheduler
5	daemon System daemon catch-all
6	ftp Messages relating to the ftp daemon
7	kern Kernel messages
8	local0-local7 Local facilities defined per site
9	lpr Messages from the line printing system
10	mail Messages relating to the mail system
11	mark Pseudo-event used to generate timestamps in log files
12	news Messages relating to network news protocol (nntp)
13	ntp Messages relating to network time protocol
14	user Regular user processes
15	uucp UUCP subsystem

Syslog Priorities

The syslog priorities are summarized in the following table −

Sr.No.	Priority & Description
1	emerg Emergency condition, such as an imminent system crash, usually broadcast to all users
2	alert Condition that should be corrected immediately, such as a corrupted system database
3	crit Critical condition, such as a hardware error
4	err Ordinary error
5	warning Warning condition
6	notice Condition that is not an error, but possibly should be handled in a special way
7	info Informational message
8	debug Messages that are used when debugging programs
9	none Pseudo level used to specify not to log messages

The combination of facilities and levels enables you to be discerning about what is logged and where that information goes.
As each program sends its messages dutifully to the system logger, the logger makes decisions on what to keep track of and what to discard based on the levels defined in the selector.
When you specify a level, the system will keep track of everything at that level and higher, that is, at that level and at every more severe level.
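
The "that level and higher" rule can be sketched in a few lines of shell. The numeric ranks below follow the conventional syslog ordering, with 0 as the most severe; the function names severity and selector_matches are made up for this illustration and are not part of syslog itself.

```shell
# Map a priority name to its numeric rank (0 = most severe, 7 = least).
severity() {
    case $1 in
        emerg)   echo 0 ;;
        alert)   echo 1 ;;
        crit)    echo 2 ;;
        err)     echo 3 ;;
        warning) echo 4 ;;
        notice)  echo 5 ;;
        info)    echo 6 ;;
        debug)   echo 7 ;;
        *)       return 1 ;;
    esac
}

# selector_matches SELECTOR_PRIORITY MESSAGE_PRIORITY
# Succeeds when the message is at least as severe as the selector's level.
selector_matches() {
    sel=$(severity "$1") || return 2
    msg=$(severity "$2") || return 2
    [ "$msg" -le "$sel" ]
}

# With a selector level of err, crit and err are kept, notice is not.
for p in crit err notice; do
    if selector_matches err "$p"; then
        echo "$p: logged"
    else
        echo "$p: discarded"
    fi
done
```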

The /etc/syslog.conf file

The /etc/syslog.conf file controls where messages are logged. A typical syslog.conf file might look like this −

*.err;kern.debug;auth.notice /dev/console
daemon,auth.notice           /var/log/messages
lpr.info                     /var/log/lpr.log
mail.*                       /var/log/mail.log
ftp.*                        /var/log/ftp.log
auth.*                       @prep.ai.mit.edu
auth.*                       root,amrood
netinfo.err                  /var/log/netinfo.log
install.*                    /var/log/install.log
*.emerg                      *
*.alert                      |program_name
mark.*                       /dev/console

Each line of the file contains two parts −

A message selector that specifies which kind of messages to log. For example, all error messages or all debugging messages from the kernel.
An action field that says what should be done with the message. For example, put it in a file or send the message to a user's terminal.

Following are the notable points for the above configuration −

Message selectors have two parts: a facility and a priority. For example, kern.debug selects messages generated by the kernel (the facility) at the debug priority.
Because a priority matches that level and everything more severe, and debug is the lowest level, kern.debug selects kernel messages of every priority.
An asterisk in place of either the facility or the priority indicates "all". For example, *.debug means messages from all facilities at the debug priority or higher, while kern.* means all messages generated by the kernel.
You can also use commas to specify multiple facilities. Two or more selectors can be grouped together by using a semicolon.
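
To make the comma and semicolon rules concrete, here is a minimal sketch that expands a selector field into individual facility.priority pairs. The function name expand_selector is hypothetical, and the sketch handles only the syntax described above, not any extensions a particular syslog implementation may add.

```shell
# expand_selector SELECTOR
# Prints one facility.priority pair per line.
# The body runs in a subshell so the IFS and set -f changes stay local.
expand_selector() (
    set -f                          # keep "*" literal, no filename expansion
    IFS=';'
    for part in $1; do              # semicolons separate whole selectors
        facilities=${part%.*}       # text before the last dot
        priority=${part##*.}        # text after the last dot
        IFS=','
        for facility in $facilities; do   # commas separate facilities
            printf '%s.%s\n' "$facility" "$priority"
        done
    done
)

expand_selector 'daemon,auth.notice'
expand_selector '*.err;kern.debug;auth.notice'
```

The first call splits the comma-separated facilities that share one priority; the second splits three selectors joined by semicolons.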

Logging Actions

The action field specifies one of five actions −

Log message to a file or a device. For example, /var/log/lpr.log or /dev/console.
Send a message to a user. You can specify multiple usernames by separating them with commas; for example, root,amrood.
Send a message to all users. In this case, the action field consists of an asterisk; for example, *.
Pipe the message to a program. In this case, the program is specified after the Unix pipe symbol (|).
Send the message to the syslog on another host. In this case, the action field consists of a hostname, preceded by an at sign; for example, @tutorialspoint.com.
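
The five kinds of action can be told apart by the first character of the action field. The following sketch shows that idea; classify_action is a hypothetical helper written for this example, not a syslog tool.

```shell
# classify_action ACTION_FIELD
# Reports which of the five action types the field describes.
classify_action() {
    case $1 in
        /*)  echo "file or device" ;;
        @*)  echo "remote host" ;;
        \|*) echo "pipe to program" ;;
        \*)  echo "all users" ;;
        *)   echo "user list" ;;
    esac
}

classify_action '/var/log/lpr.log'
classify_action '@prep.ai.mit.edu'
classify_action '|program_name'
classify_action '*'
classify_action 'root,amrood'
```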

The logger Command

Unix provides the logger command, which is extremely useful for working with system logging. The logger command sends messages to the syslogd daemon, which then logs them according to its configuration.
This means you can test the syslogd daemon and its configuration from the command line at any time. The logger command also provides a way to add one-line entries to the system log from the command line or from a script.
The format of the command is −

logger [-i] [-f file] [-p priority] [-t tag] [message]...

Here is the detail of the parameters −

Sr.No.	Option & Description
1	-f filename Uses the contents of file filename as the message to log.
2	-i Logs the process ID of the logger process with each line.
3	-p priority Enters the message with the specified priority (specified selector entry); the message priority can be specified numerically, or as a facility.priority pair. The default priority is user.notice.
4	-t tag Marks each line added to the log with the specified tag.
5	message The string arguments whose contents are concatenated together in the specified order, separated by a space.

You can check the man page for logger to see the complete syntax for this command.
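
The following sketch wraps logger in a small shell function using only the options listed above. The function name log_event, the tag backup-demo and the choice of the local0 facility are assumptions made for illustration; where the message ends up depends on how your syslog.conf routes local0.

```shell
# log_event LEVEL MESSAGE...
# Sends one line to syslog under the local0 facility.
log_event() {
    level=$1
    shift
    # -i adds the process ID, -t sets the tag, -p sets facility.priority.
    # If logger is missing or fails, report the message on stderr instead.
    logger -i -t backup-demo -p "local0.$level" "$*" 2>/dev/null ||
        echo "could not reach syslog; message was: $*" >&2
}

log_event notice "nightly backup finished"
```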

Log Rotation

Log files tend to grow very fast and consume large amounts of disk space. To enable log rotation, most distributions use tools such as newsyslog or logrotate.
These tools should be run at regular intervals using the cron daemon. Check the man pages for newsyslog or logrotate for more details.
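
To show what rotation amounts to, here is a minimal hand-written sketch in shell. It is not how newsyslog or logrotate are implemented; the function name rotate_log and the default of three kept copies are assumptions for this example. Real tools also compress old files and tell the logging daemon to reopen its log.

```shell
# rotate_log FILE [KEEP]
# Shifts FILE.1 to FILE.2 and so on, moves FILE to FILE.1,
# and starts a new empty FILE. At most KEEP old copies are kept.
rotate_log() {
    log=$1
    keep=${2:-3}
    [ -f "$log" ] || return 0
    rm -f "$log.$keep"                 # drop the oldest copy
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$log.$prev" ]; then
            mv "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    mv "$log" "$log.1"
    : > "$log"                         # create the new, empty log file
}

# Try it out in a scratch directory.
demo=$(mktemp -d)
echo "first" > "$demo/app.log"
rotate_log "$demo/app.log"
echo "second" > "$demo/app.log"
rotate_log "$demo/app.log"
ls "$demo"
```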

Important Log Locations

Most system applications create their log files in /var/log and its sub-directories. Here are a few important applications and their corresponding log directories −

Application	Directory
httpd	/var/log/httpd
samba	/var/log/samba
cron	/var/log/
mail	/var/log/
mysql	/var/log/

Unix / Linux - Signals and Traps

In this chapter, we will discuss in detail about Signals and Traps in Unix. Signals are software interrupts sent to a program to indicate that an important event has occurred. The events can vary from user requests to illegal memory access errors. Some signals, such as the interrupt signal, indicate that a user has asked the program to do something that is not in the usual flow of control.
The following table lists common signals you might encounter and want to use in your programs −

Signal Name	Signal Number	Description
SIGHUP	1	Hang up detected on controlling terminal or death of controlling process
SIGINT	2	Issued if the user sends an interrupt signal (Ctrl + C)
SIGQUIT	3	Issued if the user sends a quit signal (Ctrl + \)
SIGFPE	8	Issued if an illegal mathematical operation is attempted
SIGKILL	9	If a process gets this signal it must quit immediately and will not perform any clean-up operations
SIGALRM	14	Alarm clock signal (used for timers)
SIGTERM	15	Software termination signal (sent by kill by default)

List of Signals

There is an easy way to list all the signals supported by your system. Just issue the kill -l command and it will display all the supported signals −

$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

The actual list of signals varies between Solaris, HP-UX, and Linux.
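
The kill -l command can also translate a single number into a signal name. The sketch below relies on the kill builtin found in common shells such as bash and dash; other shells may format the output differently.

```shell
# Print the name of signal number 15.
kill -l 15

# When a process is killed by a signal, the shell reports an exit status
# of 128 plus the signal number; kill -l accepts such a status as well.
kill -l 143
```

In these shells both commands print TERM, the name of signal 15 without the SIG prefix.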

Default Actions

Every signal has a default action associated with it. The default action for a signal is the action that a script or program performs when it receives a signal.
Some of the possible default actions are −

Terminate the process.
Ignore the signal.
Dump core. This creates a file called core containing the memory image of the process when it received the signal.
Stop the process.
Continue a stopped process.

Sending Signals

There are several methods of delivering signals to a program or script. One of the most common is for a user to press Ctrl+C (the interrupt key) while a script is executing.
When you press Ctrl+C, a SIGINT is sent to the script and, as per the default action, the script terminates.
The other common method for delivering signals is to use the kill command, the syntax of which is as follows −

$ kill -signal pid

Here signal is either the number or name of the signal to deliver and pid is the process ID that the signal should be sent to. For example −

$ kill -1 1001

The above command sends the HUP or hang-up signal to the program that is running with process ID 1001. To send a kill signal to the same process, use the following command −

$ kill -9 1001

This kills the process running with process ID 1001.
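
The following sketch puts these pieces together without needing a real process ID such as 1001. It starts a background sleep as a stand-in target, sends it SIGTERM by name, and inspects the exit status; the 30-second duration is arbitrary.

```shell
# Start a long-running command in the background and remember its PID.
sleep 30 &
pid=$!

# Deliver SIGTERM by name; "kill -15" would do the same by number.
kill -TERM "$pid"

# wait returns the exit status of the background job.
status=0
wait "$pid" || status=$?
echo "sleep exited with status $status"
```

The status reported is 143, that is, 128 plus the signal number 15, which shows the process was terminated by SIGTERM rather than finishing normally.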

Trapping Signals

When you press the Ctrl+C or Break key at your terminal during execution of a shell program, normally that program is immediately terminated, and your command prompt returns. This may not always be desirable. For instance, you may end up leaving a bunch of temporary files that won't get cleaned up.
Trapping these signals is quite easy, and the trap command has the following syntax −

$ trap commands signals

Here commands can be any valid Unix command or list of commands, or even a user-defined function, and signals can be a list of any number of signals you want to trap.
There are two common uses for trap in shell scripts −

Clean up temporary files
Ignore signals

Cleaning Up Temporary Files

As an example of the trap command, the following shows how you can remove some files and then exit if someone tries to abort the program from the terminal −

$ trap "rm -f $WORKDIR/work1$$ $WORKDIR/dataout$$; exit" 2

From the point in the shell program that this trap is executed, the two files work1$$ and dataout$$ will be automatically removed if signal number 2 is received by the program.
Hence, if the user interrupts the execution of the program after this trap is executed, you can be assured that these two files will be cleaned up. The exit command that follows the rm is necessary because without it, the execution would continue in the program at the point that it left off when the signal was received.
Signal number 1 is generated for hangup. Either someone intentionally hangs up the line or the line gets accidentally disconnected.
You can modify the preceding trap to also remove the two specified files in this case by adding signal number 1 to the list of signals −

$ trap "rm -f $WORKDIR/work1$$ $WORKDIR/dataout$$; exit" 1 2

Now these files will be removed if the line gets hung up or if the Ctrl+C key gets pressed.
If the commands given to trap contain more than one command, they must be enclosed in quotes. Also note that the shell scans the command line at the time that the trap command gets executed and again when one of the listed signals is received.
Thus, in the preceding example, the values of WORKDIR and $$ are substituted at the time that the trap command is executed. If you want this substitution to occur at the time that either signal 1 or 2 is received, put the commands inside single quotes −

$ trap 'rm -f $WORKDIR/work1$$ $WORKDIR/dataout$$; exit' 1 2
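
The following self-contained sketch exercises such a cleanup trap. It runs the trap in a child shell so that $$ is the child's own process ID, then simulates a hang-up by having the child signal itself. The scratch directory from mktemp, the cleanup function name and the exit code 1 are choices made for this example only.

```shell
demo_dir=$(mktemp -d)

status=0
WORKDIR=$demo_dir sh -c '
    # Remove the two temporary files belonging to this process.
    cleanup() {
        rm -f "$WORKDIR/work1$$" "$WORKDIR/dataout$$"
    }
    # On hang-up, interrupt or terminate: clean up, then exit.
    trap "cleanup; exit 1" 1 2 15

    touch "$WORKDIR/work1$$" "$WORKDIR/dataout$$"
    kill -HUP $$            # simulate the line being hung up
    echo "not reached"
' || status=$?

# Count what is left in the scratch directory, then remove it.
remaining=$(ls "$demo_dir" | wc -l)
rmdir "$demo_dir"
echo "child exit status: $status, files left behind: $remaining"
```

Because the trap ends with exit, the child stops right after cleaning up instead of continuing with the next command.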

Ignoring Signals

If the command listed for trap is null, the specified signal will be ignored when received. For example, the following command specifies that the interrupt signal is to be ignored −

$ trap '' 2

You might want to ignore certain signals when performing an operation that you don't want to be interrupted. You can specify multiple signals to be ignored as follows −

$ trap '' 1 2 3 15

Note that the empty first argument must be given for a signal to be ignored. Omitting it is not equivalent; the following has a separate meaning of its own, described in the next section −

$ trap 2

If you ignore a signal, all subshells also ignore that signal. However, if you specify an action to be taken on the receipt of a signal, all subshells will still take the default action on receipt of that signal.
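
A quick way to see an ignored signal in action is to let a child shell send the signal to itself. This sketch uses signal 15 (SIGTERM) rather than the interrupt signal simply because it is easy to send from a script.

```shell
output=$(sh -c '
    trap "" 15            # ignore SIGTERM from here on
    kill -TERM $$         # would normally terminate this shell
    echo "still running"
')
echo "$output"
```

The child survives the signal and prints still running.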

Resetting Traps

After you've changed the default action to be taken on receipt of a signal, you can change it back again with trap if you simply omit the first argument; so −

$ trap 1 2

This resets the action to be taken on the receipt of signals 1 or 2 back to the default.
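
The effect of a reset can be checked with a small experiment in a child shell, sketched below. It first ignores signal 15, then resets it, then sends the signal to itself.

```shell
status=0
sh -c '
    trap "" 15        # ignore SIGTERM
    trap 15           # no command given: back to the default action
    kill -TERM $$     # the default action terminates the shell
    echo "not reached"
' || status=$?
echo "child exit status: $status"
```

The exit status of 143 (128 plus 15) shows that the default action was restored and the child was terminated by the signal.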

Friday, November 2, 2018
