Theodoros Emmanouilidis

Notes & Thoughts
Browsing Tip Of The Day

awk Script To Merge Columns From Different Files

February 7

This is a very helpful awk script to merge columns from different files into one single file.

Suppose we have two files with two columns each and the same number of lines.
Our goal is to merge column two from the first file with column one from the second.

First we merge the two files line by line, which gives four columns per line, and then we use awk to select the desired columns (the second and the third) and print them to a new file.

pr -m -t -s\  file1 file2 | awk '{print $2,$3}' >out_file.txt
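As a small worked example, the sketch below does the same merge on made-up sample data. It uses paste instead of pr for the line-by-line join, which is a swapped-in alternative and not the command from this post; the helper name merge_columns and the sample values are assumptions for illustration only.

```shell
# merge_columns FILE1 FILE2: print column 2 of FILE1 next to column 1 of FILE2.
# Hypothetical helper name, used only for this illustration.
merge_columns() {
    # paste joins corresponding lines; -d ' ' puts one space between them,
    # so fields 1-2 come from FILE1 and fields 3-4 come from FILE2.
    paste -d ' ' "$1" "$2" | awk '{ print $2, $3 }'
}

# Made-up sample data in a scratch directory.
demo_dir=$(mktemp -d)
printf 'a 1\nb 2\nc 3\n' > "$demo_dir/file1"
printf 'x 10\ny 20\nz 30\n' > "$demo_dir/file2"

merge_columns "$demo_dir/file1" "$demo_dir/file2" > "$demo_dir/out_file.txt"
cat "$demo_dir/out_file.txt"
```

With this sample data, out_file.txt should pair each second-column value of file1 with the first-column value of file2 on the same line.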

Give Read – Write Permissions To A Specific Group Of Users

January 30

Suppose we need to give read-write permissions to a specific group of users. For example, we need a shared folder to be readable and writable by all students in a specific class. The following guide addresses this case.

First we have to create the group.

sudo groupadd class01

Now we add some users and make them members of the class01 group,

sudo adduser student01
sudo usermod -g class01 student01
sudo adduser student02
sudo usermod -g class01 student02
sudo adduser student03
sudo usermod -g class01 student03

we create the directory that will be shared with the group members,

sudo mkdir -p /media/shared/class01

and assign it to the class01 group with the appropriate permissions.

sudo chgrp -R class01 /media/shared/class01
sudo chmod -R g+rwxs /media/shared/class01

The chmod command gives the group full read-write access to the directory “class01” and also sets the set-group-ID flag, so that files and directories created inside it inherit the directory's group.
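To see the effect of the mode change without touching the real share, the following sketch applies it to a scratch directory we own, so no sudo is needed. The variable name share_dir and the scratch location are assumptions for illustration; the inheritance of the flag by subdirectories is the behaviour on Linux.

```shell
# Hypothetical scratch directory standing in for /media/shared/class01.
share_dir=$(mktemp -d)

# Same mode change as in the post, applied to a directory we own.
chmod g+rwxs "$share_dir"

# test -g succeeds when the set-group-ID bit is set.
if [ -g "$share_dir" ]; then
    echo "set-group-ID is set on $share_dir"
fi

# On Linux, a subdirectory created inside inherits the set-group-ID bit.
mkdir "$share_dir/sub"

# The group permission field shows "s" where the flag is set.
ls -ld "$share_dir" "$share_dir/sub"
```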

That’s all!

Substitute CSV Column Values Plus One With awk

September 15

Suppose you have a three-column CSV data file in which the values of one column, the third for example, must be replaced with the current value plus one. This is very easy to do using awk.

Just type:

awk -F, '{$3++; print $1","$2","$3}' input-file.csv

By default awk prints the result to the console. To save the result, redirect it to a file:

awk -F, '{$3++; print $1","$2","$3}' input-file.csv > output-file.csv

The third column of the output file contains the input file values plus one.
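A minimal sketch of the same idea on made-up data follows. It sets the output field separator (OFS) as well as the input one, so awk rebuilds the line with commas and a plain print is enough; this is a variation on the command above, not a replacement for it. The helper name increment_third and the sample values are assumptions for illustration.

```shell
# increment_third FILE: print FILE with 1 added to the third CSV column.
# Hypothetical helper name, used only for this illustration.
increment_third() {
    # With FS and OFS both set to ",", modifying $3 makes awk rebuild
    # the whole record using commas between the fields.
    awk 'BEGIN { FS = OFS = "," } { $3++; print }' "$1"
}

# Made-up sample data with three comma-separated columns.
csv_file=$(mktemp)
printf 'a,b,1\nc,d,2\ne,f,41\n' > "$csv_file"

increment_third "$csv_file"
```

One thing to keep in mind: a non-numeric third field, such as a header row, is treated as 0 by awk and would come out as 1.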

Convert PDF Document To Text From Command Line

September 15

pdftotext is a nifty command line utility that can be used to convert a PDF document to text. Most Linux distributions include pdftotext as part of the poppler-utils package. Installation in Ubuntu is very easy using apt. Just type

sudo apt-get install poppler-utils

and pdftotext is installed.

Usage: pdftotext [options] <PDF-file> [<text-file>]
  -f <int>          : first page to convert
  -l <int>          : last page to convert
  -r <fp>           : resolution, in DPI (default is 72)
  -x <int>          : x-coordinate of the crop area top left corner
  -y <int>          : y-coordinate of the crop area top left corner
  -W <int>          : width of crop area in pixels (default is 0)
  -H <int>          : height of crop area in pixels (default is 0)
  -layout           : maintain original physical layout
  -raw              : keep strings in content stream order
  -htmlmeta         : generate a simple HTML file, including the meta information
  -enc <string>     : output text encoding name
  -listenc          : list available encodings
  -eol <string>     : output end-of-line convention (unix, dos, or mac)
  -nopgbrk          : don't insert page breaks between pages
  -opw <string>     : owner password (for encrypted files)
  -upw <string>     : user password (for encrypted files)
  -q                : don't print any messages or errors
  -v                : print copyright and version info

The simplest way to use it is by typing

pdftotext file-to-convert.pdf

and the utility will create a text file with the same base name (file-to-convert.txt) in the directory where file-to-convert.pdf resides.
Wildcards (*), for example:

pdftotext *.pdf

to convert multiple files, cannot be used because pdftotext accepts only one input file name (a second argument is treated as the output text file). Instead, a loop can be used for batch conversions:

for f in *.pdf
do
pdftotext "$f"
done
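
A slightly more defensive variant of the loop is sketched below. It skips the case where no PDF files match and names the output file explicitly, following the usage line shown earlier. The helper name convert_all_pdfs is an assumption for illustration, and the sketch assumes pdftotext is installed whenever there is at least one PDF to convert.

```shell
# convert_all_pdfs DIR: run pdftotext on every .pdf file in DIR.
# Hypothetical helper name, used only for this illustration.
convert_all_pdfs() {
    for f in "$1"/*.pdf
    do
        # When nothing matches, the shell leaves the pattern as literal text;
        # skip it instead of passing a non-existent file to pdftotext.
        [ -e "$f" ] || continue
        # Name the output explicitly: same path with .pdf replaced by .txt.
        pdftotext "$f" "${f%.pdf}.txt"
    done
}

# Convert the PDF files in the current directory.
convert_all_pdfs .
```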