this post was submitted on 28 Mar 2025
272 points (99.6% liked)

Hello!

I am pleased to announce a new version of my CLI text processing with GNU awk ebook. This book will dive deep into field processing, show examples for filtering features, multiple file processing, how to construct solutions that depend on multiple records, how to compare records and fields between two or more files, how to identify duplicates while maintaining input order and so on. Regular expressions will also be discussed in detail.
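As a rough illustration of the kind of one-liner described above (a minimal sketch with made-up input, not an excerpt from the book), removing duplicate lines while keeping the input order might look like this:

```shell
# Hypothetical sample input; any line-oriented text works the same way.
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
# is true only for first occurrences, and awk prints those lines.
unique=$(printf '%s\n' apple banana apple cherry banana | awk '!seen[$0]++')
printf '%s\n' "$unique"
```

The array name `seen` is arbitrary. Unlike `sort -u`, this keeps lines in the order they first appeared.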

Book links

To celebrate the new release, you can download the PDF/EPUB versions for free till 06-April-2025.

Or, you can read it online at https://learnbyexample.github.io/learn_gnuawk/

Interactive TUI apps

Feedback

I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on.

Happy learning :)

top 19 comments
[–] [email protected] 16 points 2 days ago (1 children)

i'm in awe every time people do this.

i learned how to do this before code sharing sites like github existed, and it forced me to turn everything i'd learned into muscle memory. i think this next generation of greybeards is going to be so much better than my generation's greybeards because of it.

[–] [email protected] 7 points 2 days ago

I thought your comment was going in a totally different direction. It's nice to hear appreciation of improved teaching methods instead of the old "well, I figured it out myself, so everyone else should too."

[–] [email protected] 13 points 2 days ago

Finally, time to learn how to use awk! Sed, you're next.

[–] [email protected] 2 points 1 day ago* (last edited 1 day ago)

Btw, there's asciidoctor-epub3.

[–] [email protected] 8 points 2 days ago

Awesome! Thanks so much for doing this and sharing!

[–] [email protected] 5 points 2 days ago

Looking forward to reading it! awk has been a huge blind spot for me for a long time now.

[–] [email protected] 5 points 2 days ago* (last edited 2 days ago) (4 children)

Could someone perhaps explain the major use cases or give a real life example of a time you've needed to use awk? I've been using Linux casually for quite a long time now, and although I learned the basics of the tool, I can't recall having ever felt I had a need for it. If I want to glue a bunch of cli stuff together and need to do some text processing, it generally seems like it'd be easier to just use a simple python script.

Is it more for situations that need to be compatible with most *nix systems and you might not necessarily have access to a higher level scripting language?

[–] [email protected] 8 points 2 days ago

Is it more for situations that need to be compatible with most *nix systems and you might not necessarily have access to a higher level scripting language?

Yes, and also because integrating Python one-liners into shell pipelines is awkward in general. I'm more likely to write my entire script in Python than to use it just for text processing, and a lot of the time that's just a pain. Python isn't really designed for one-liners or for use as a shell. You can twist it into working in those use cases, but then I'd ask the reverse question: why would you do that when you could "just" use awk?

On macOS, Python is not installed by default. So if you are writing scripts that you want to be portable across platforms, or for general Mac administration, using Python is a burden.

This is also true when working with some embedded devices. IIRC I can ssh into my router and use awk (thanks to it being included in Busybox), but I'm definitely not going to install an entire Python environment there. I'm not sure there'd even be enough storage space for that.
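To make the one-liner comparison concrete, here is a minimal sketch that sums the second column of some made-up input. The data is hypothetical, and the Python line is only shown as a comment for comparison:

```shell
# awk splits each line into fields automatically; the END block runs after
# the last input line has been read.
sum=$(printf '%s\n' 'a 1' 'b 2' 'c 3' | awk '{s += $2} END {print s}')
printf '%s\n' "$sum"

# A rough Python equivalent needs an explicit import, loop, and split:
#   python3 -c 'import sys; print(sum(int(l.split()[1]) for l in sys.stdin))'
```

Both work, but the awk version only uses what POSIX awk provides, so it should also run under the Busybox awk mentioned above.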

[–] [email protected] 6 points 2 days ago (1 children)

Well, if you are comfortable with Python scripts, there's not much reason to switch to awk. Unless, perhaps, you are comparing awk to Python as a scripting language rather than as a CLI tool (like grep, sed, cut, etc.), which is what my ebook focuses on. For example, if you have space separated columns of data, awk '{print $2}' will give you just the second column (no need to write a script when a simple one-liner will do). This of course also allows you to integrate with shell features (like globs).

As a practical example, I use awk to filter and process particular entries from financial data (which is in csv format). It's just a case of easily arriving at a solution in a single line of code (which I then save for future use).
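Both ideas can be sketched with made-up data. The column layouts and the "groceries" category below are assumptions for illustration, not the actual financial data:

```shell
# Hypothetical space-separated data: name, amount, category.
data='rent 1200 housing
coffee 4 food
laptop 950 electronics'

# Print just the second column (fields are split on runs of whitespace).
amounts=$(printf '%s\n' "$data" | awk '{print $2}')
printf '%s\n' "$amounts"

# Hypothetical CSV records: date,description,amount.
csv='2025-01-03,groceries,54.20
2025-01-09,rent,1200.00
2025-02-01,groceries,61.75'

# Use a comma as the field separator, keep only "groceries" rows,
# and total the third field.
total=$(printf '%s\n' "$csv" |
    awk -F, '$2 == "groceries" {sum += $3} END {printf "%.2f\n", sum}')
printf '%s\n' "$total"
```

Note that `-F,` is a plain split on commas. CSV with quoted fields that contain commas needs more care than this sketch provides.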

[–] [email protected] 4 points 2 days ago* (last edited 2 days ago)

Also AWK is made to be fast, right? I suppose doing something in CPython in an inefficient way might not be noticeable with a bit of text, but would show up with a large enough data stream.

[–] [email protected] 5 points 2 days ago* (last edited 1 day ago)

I manage some servers and awk can be useful to filter data. If you use commands like grep and chain them with the pipe operator (the "|" character), awk can be very handy.

Sure, a Python script can do that as well, but doing a one-liner in Bash is waaay faster to program.
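A minimal sketch of that kind of pipeline, assuming a made-up log format (timestamp, level, service, response time in ms):

```shell
# Hypothetical log lines; real server logs will have a different layout.
log='10:00:01 ERROR api 120
10:00:02 INFO api 35
10:00:03 ERROR db 480
10:00:04 ERROR api 200'

# grep narrows the stream to ERROR lines; awk then counts them per service
# (field 3). sort makes the output order deterministic, because the order
# of "for (s in count)" is unspecified in awk.
errors=$(printf '%s\n' "$log" |
    grep 'ERROR' |
    awk '{count[$3]++} END {for (s in count) print s, count[s]}' |
    sort)
printf '%s\n' "$errors"
```

awk could also do the filtering itself with a `/ERROR/` pattern in front of the action, but piping from grep is a common habit and works just as well.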

[–] [email protected] 5 points 2 days ago (1 children)

I'm not an expert in sed or awk. I always have to Google. For me though, it's generally that you can do a great deal in just one line of awk or sed. They're standard on any Linux distribution I've ever used. When building out pipelines, or scripts that you want to run from an installer you built (post-install and on removal), you can use sed and awk rather than needing Python.

This is all really nice when you have strict configuration management and versioning. Say something is deployed without the Python packages that would make the job easy in Python, and you can't just pip install them on hundreds of computers without going through an approval process and building a new tagged release: sed, awk, etc. can still do the job. If it's hard enough, use Python and whatever packages you can install. If it's simple enough to do in a small bash script, skip Python and use just what's standard in your Linux distro.

[–] [email protected] 1 points 2 days ago

I'm not an expert in sed or awk. I always have to Google. For me though, it's generally that you can do a great deal in just one line of awk or sed.

Same here! I recently used an awk one-liner piped into sed, then piped into another command, to find duplicated lines and merge both files.

Writing a Python script would have taken an unknown amount of time!
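The actual pipeline isn't shown, so the following is only a guess at the general shape, using two hypothetical files. It finds the lines the files share, then merges them without repeats and passes the result through sed:

```shell
# Hypothetical input files created in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/a.txt"
printf '%s\n' beta delta alpha > "$tmpdir/b.txt"

# Lines present in both files: while reading the first file (NR == FNR),
# remember each line; while reading the second, print remembered lines.
common=$(awk 'NR == FNR {seen[$0]; next} $0 in seen' "$tmpdir/a.txt" "$tmpdir/b.txt")

# Merge both files, dropping repeats but keeping first-seen order,
# then let sed prefix each surviving line.
merged=$(awk '!seen[$0]++' "$tmpdir/a.txt" "$tmpdir/b.txt" | sed 's/^/- /')

printf '%s\n' "$common" "$merged"
rm -r "$tmpdir"
```

The `NR == FNR` test is only true while awk reads the first file, which is the usual idiom for two-file comparisons. It assumes the first file is not empty.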

[–] [email protected] 5 points 2 days ago

I like awk. I started using it in the 90s at university. I'm not a pro, but it is so powerful!

[–] [email protected] 3 points 2 days ago (1 children)

Perfect. I'm feeling comfortable enough with bash that next on my list is AWK. Gonna download this when I get home!

[–] [email protected] 2 points 1 day ago (1 children)
[–] [email protected] 1 points 1 day ago

Heh. I haven't started exploring other shells yet.

[–] [email protected] 2 points 2 days ago

Beautiful, thank you so much!

[–] [email protected] -4 points 2 days ago

Thanks but omg no. I'm happy to use awk like cut.
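Even for cut-style use, awk behaves a little differently. A minimal sketch with made-up input, where the columns are separated by more than one space:

```shell
# cut treats every single space as a delimiter, so with three spaces
# between the columns, field 2 is an empty string.
cut_out=$(printf 'a   b\n' | cut -d' ' -f2)

# awk splits on runs of whitespace by default, so field 2 is "b".
awk_out=$(printf 'a   b\n' | awk '{print $2}')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_out" "$awk_out"
```

This is why awk is often the easier choice for column extraction from aligned or irregularly spaced output.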