Data Pipelines (Basic)
December 10th, 2006
Pipelines send the standard output of one program to the standard input of another. This lets you link commands without having to write the output of one command to a file and then pass that file to another command for further processing. Here is an example:
[root@www ~]# find . -type f | wc -l
1708
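The above counts the number of files in the current directory and its subdirectories. You can use more than one pipe. The next example displays the five largest files or directories in the current directory, with sizes in kilobytes (hidden entries are skipped, because * does not match names that begin with a dot):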
[root@www ~]# du -ks * | sort -rn | head -n 5
17207 08-28-2006.rar
14868 gb2.sql
6738 jpgraph-1.20.5
4387 jpgraph-1.20.5.tar.gz
602 temp.sql
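The -n flag on sort matters in that pipeline. Here is a small sketch of why, reusing the sizes from the du output above as canned input, so it runs anywhere (the variable names are only for illustration):

```shell
# Sizes (in KB) and names copied from the du output above, in scrambled order.
sizes='602 temp.sql
17207 08-28-2006.rar
4387 jpgraph-1.20.5.tar.gz
14868 gb2.sql
6738 jpgraph-1.20.5'

# -n compares the leading numbers as numbers; -r puts the largest first.
numeric=$(printf '%s\n' "$sizes" | sort -rn | head -n 2)

# Without -n, lines are compared as text, so "6738" sorts ahead of "17207".
textual=$(printf '%s\n' "$sizes" | sort -r | head -n 2)

printf 'numeric:\n%s\n\ntextual:\n%s\n' "$numeric" "$textual"
```

With the numeric sort the two largest entries come out on top. With the plain text sort, 6738 and 602 win because they start with the character 6.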
Here is another example, which uses four pipes. It displays the number of processes each user is currently running, in descending order:
[root@www ~]# ps -ef --no-headers | awk '{print $1}' | sort | uniq -c | sort -rn
42 root
9 apache
4 brock
1 xfs
1 smmsp
1 mysql
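The first sort in that pipeline is there because uniq -c only merges duplicate lines that are next to each other. A rough sketch with made-up user names standing in for the awk output (the list and the variable names are assumptions, not real ps data):

```shell
# Hypothetical process owners, one per line, standing in for the awk output.
owners='root
apache
root
brock
apache
root'

# uniq -c only merges adjacent duplicates, so unsorted input keeps all 6 lines.
unsorted=$(printf '%s\n' "$owners" | uniq -c | wc -l)

# Sorting first groups identical names together; the final sort -rn ranks by count.
# The trailing awk only strips the padding that uniq -c puts in front of each count.
ranked=$(printf '%s\n' "$owners" | sort | uniq -c | sort -rn | awk '{print $1, $2}')

printf '%s\n' "$ranked"
# prints:
# 3 root
# 2 apache
# 1 brock
```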
If you run a database server, you will probably want to back it up from time to time. This dumps my database (named bash), compresses it, and saves it to a file:
[root@www ~]# mysqldump -p bash | gzip -c > bash.sql.gz
Enter password:
[root@www ~]# ls -lh bash.sql.gz
-rw-r--r-- 1 root root 31K Dec 11 00:21 bash.sql.gz
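A backup is only useful if it can be read back, and that is one more job for a pipe. The sketch below uses a hypothetical sample_dump function in place of mysqldump so it runs without a database. The function, the temporary file, and the variable names are all assumptions for illustration:

```shell
# sample_dump is a hypothetical stand-in for mysqldump output.
sample_dump() {
    printf '%s\n' 'CREATE TABLE t (id INT);' 'INSERT INTO t VALUES (1);'
}

backup=$(mktemp)

# Same shape as the backup command: producer | gzip -c > file.
sample_dump | gzip -c > "$backup"

# gzip -t checks the compressed data for integrity without writing any output.
if gzip -t < "$backup"; then verified=yes; else verified=no; fi

# Reading it back is another pipeline: decompress to stdout and count the lines.
restored_lines=$(gzip -dc < "$backup" | wc -l)

echo "verified: $verified, lines: $restored_lines"
rm -f "$backup"
```

For a real restore, the decompressed stream would typically be piped into the mysql client instead of wc.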