Unlock the Power of AWK for Effective Text Processing

Last Blog Review

In the last blog, we saw how to use "grep" effectively. It is a powerful utility for searching text within files: it scans each line of input and prints the lines that match a specified pattern, which makes it essential for data extraction and analysis.

What is AWK →

AWK is a tool for pattern scanning and text processing. It scans a file line by line, splits each line into fields, performs actions on the lines that match, and can produce formatted output. In short, awk searches the input for lines that match a user-specified pattern, and when a line matches, awk performs the user-defined action on it.
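
As a minimal sketch of that pattern-and-action structure, an awk program is written as `pattern { action }`, with the fields of each line available as $1, $2, and so on, and the field count in NF. The sample input and the pattern `error` below are made up for illustration and are not part of the log used later in this post.

```shell
# Hypothetical sample input, one record per line (not from the post's log).
sample='alice 30 admin
bob 25 error
carol 41 user'

# pattern { action }: for every line matching /error/, print the first
# field and NF, the number of fields awk split that line into.
matched=$(printf '%s\n' "$sample" | awk '/error/ { print $1, NF }')
echo "$matched"
```

Only the second line matches the pattern, so the action runs once.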

Let’s understand it practically →

  • Let's create a log file and use awk to search it as required.
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ ls
ubuntu@ip------------:~$ vim log.txt
ubuntu@ip------------:~$ cat log.txt 
03/22 08:51:06 TRACE  :...read_physical_netif: Home list entries returned = 7
03/22 08:51:06 INFO   :...read_physical_netif: index #0, interface VLINK1 has address, ifidx 0
03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD) failed - EDC8116I Address not available.
03/22 08:52:50 EVENT  :.....api_reader: api request SESSION
  • Now let's find the line in the log file that reports a warning. We can do this with grep, for example "grep -i warning log.txt" (-i makes the match case-insensitive).
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ grep -i warning log.txt 
03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD) failed - EDC8116I Address not available.
  • But say we want to display only the date and time of the warning entry. In that case we use 'awk'.
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ awk '/WARNING/ {print $1,$2}' log.txt 
03/22 08:51:06
  • Let's say we want to print the line number on which the warning occurred. awk keeps the current line number in the built-in variable NR.
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ awk '/WARNING/ {print NR}' log.txt
3
ubuntu@ip------------:~$ awk '/WARNING/ {print NR, $1, $2, $3, $4}' log.txt
3 03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD)
  • Let's say we want to print the first 3 lines of the log file.
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ awk 'NR==1,NR==3 {print}' log.txt    
03/22 08:51:06 TRACE  :...read_physical_netif: Home list entries returned = 7
03/22 08:51:06 INFO   :...read_physical_netif: index #0, interface VLINK1 has address, ifidx 0
03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD) failed - EDC8116I Address not available.
  • Let's say we want to print the warning entries found between lines 2 and 4 of the log file.
i-0c8dd7a6179312bf1 (EC2)
ubuntu@ip------------:~$ awk 'NR>=2 && NR<=4 && /WARNING/ {print}' log.txt
03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD) failed - EDC8116I Address not available.
  • Say we want to print the date. In Linux we use the "date" command, but if we want only the day of the week, the month, and the day of the month, we can pipe its output to "awk".

      i-002978a646c30dd8b (EC2)
      ubuntu@ip------------:~/scripts/new_folder$ date
      Tue Sep 19 11:22:10 UTC 2023
      ubuntu@ip------------:~/scripts/new_folder$ date | awk '{print $1, $2, $3}' 
      Tue Sep 19
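
To try the log commands above without editing a file by hand, the sample log can be recreated in a script. This is only a sketch: the temporary file from mktemp and the variable names are arbitrary choices, and the last command uses awk's built-in split() function, which the walkthrough does not cover, to separate the level name from the "WARNING:....." text it is attached to.

```shell
# Recreate the sample log in a temporary file (the path is arbitrary).
log=$(mktemp)
cat > "$log" <<'EOF'
03/22 08:51:06 TRACE  :...read_physical_netif: Home list entries returned = 7
03/22 08:51:06 INFO   :...read_physical_netif: index #0, interface VLINK1 has address, ifidx 0
03/22 08:51:06 WARNING:.....mailslot_create: setsockopt(MCAST_ADD) failed - EDC8116I Address not available.
03/22 08:52:50 EVENT  :.....api_reader: api request SESSION
EOF

# Line number, date, and time of every WARNING entry.
warn=$(awk '/WARNING/ { print NR, $1, $2 }' "$log")
echo "$warn"

# Split the third field on ":" so that only the level name is printed
# next to each line number.
levels=$(awk '{ split($3, parts, ":"); print NR, parts[1] }' "$log")
echo "$levels"

rm -f "$log"
```

The first command prints one line for the warning entry, and the second prints one line per log entry.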

Conclusion →

So, here you saw how awk is a powerful text-processing language for extracting and manipulating data. Beyond what we covered, it can perform arithmetic such as sums, averages, multiplication, and division; it supports arrays; it matches patterns with regular expressions; and it supports data filtering and transformation.
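
As a rough illustration of the arithmetic and array support mentioned above, the sketch below sums and averages a numeric column and then totals it per name with an associative array. The input data and the variable names are invented for this example.

```shell
# Hypothetical input: one "name value" pair per line.
data='disk 40
cpu 75
disk 60
cpu 15'

# Arithmetic: sum the second field, then print the sum and the average
# in the END block, which runs after the last line has been read.
stats=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum, sum / NR }')
echo "$stats"

# Arrays: accumulate the second field per name in an associative array.
# "for (k in t)" visits keys in no guaranteed order, so the output is sorted.
totals=$(printf '%s\n' "$data" | awk '{ t[$1] += $2 } END { for (k in t) print k, t[k] }' | sort)
echo "$totals"
```

The END block is a natural place for summaries because it runs only once, after all input has been processed.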
