Intro To 'tr' Command In Linux
2023-08-23 - By Robert Elder
I use the 'tr' command to replace or delete single-byte characters in files or streams:
echo "Arch Linus." | tr 's' 'x'
Arch Linux.
Replacing Multiple Characters Using 'tr' (Character Sets)
I can replace all occurrences of the lowercase letter 'a' with the letter 'b' like this (the match is case-sensitive, so the uppercase 'A' is left alone):
echo "Apple_Banana" | tr 'a' 'b'
Apple_Bbnbnb
I can perform multiple character replacements at once by listing the search characters in the first set and their replacements, in the same order, in the second set:
echo "Apple_Banana" | tr 'a_' 'b='
Apple=Bbnbnb
echo "Apple_Banana" | tr 'a_n' 'b=q'
Apple=Bbqbqb
echo "Apple_Banana" | tr 'a_np' 'b=qg'
Aggle=Bbqbqb
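As a minimal sketch of how the two sets line up: each character in the first set maps to the character at the same position in the second set, and all mappings are applied in one pass over the input. The input string "abba" and the variable names are made up for illustration.

```shell
# 'a' -> 'b' and 'b' -> 'a' are applied in a single pass, so the letters
# swap places instead of collapsing into one letter.
swapped=$(echo "abba" | tr 'ab' 'ba')
echo "$swapped"

# Chaining two separate tr calls behaves differently: the second call also
# rewrites the characters that the first call produced.
chained=$(echo "abba" | tr 'a' 'b' | tr 'b' 'a')
echo "$chained"
```

The first command prints "baab" and the second prints "aaaa", which is why a single call with two sets is usually the safer choice when the sets overlap.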
Replacing The Complement Of A Character Set
The '-c' flag causes the replacement to act on the complement of the character set. In this command, I'm replacing any non-printable characters in the bash executable binary with a period character:
tr -c '[:print:]' '.' < /bin/bash | less
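The same idea is easier to inspect on a small input. This is only a sketch: the sample bytes are made up, and setting LC_ALL=C is my assumption so that '[:print:]' means printable ASCII regardless of the local environment.

```shell
# Input: 'a', a tab, 'b', the control byte 0x01, 'c', and a newline.
# Every byte that is NOT printable is replaced with a period.
masked=$(printf 'a\tb\001c\n' | LC_ALL=C tr -c '[:print:]' '.')
echo "$masked"

# The newline was replaced as well, because it is not a printable character.
# Adding '\n' to the set keeps line breaks intact.
kept=$(printf 'a\tb\001c\nd\n' | LC_ALL=C tr -c '[:print:]\n' '.')
echo "$kept"
```

The first command prints "a.b.c." on one line, while the second keeps "a.b.c" and "d" on separate lines.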
Deleting Characters
If we start with the file 'nani.txt':
cat nani.txt
Omae
wa
mou
shindeiru
xxd nani.txt
00000000: 4f6d 6165 200a 7761 200a 6d6f 7520 0a73  Omae .wa .mou .s
00000010: 6869 6e64 6569 7275 200a                 hindeiru .
The '-d' flag lets me delete unwanted characters like newlines:
cat nani.txt | tr -d '\n'
Omae wa mou shindeiru
cat nani.txt | tr -d '\n' | xxd
00000000: 4f6d 6165 2077 6120 6d6f 7520 7368 696e  Omae wa mou shin
00000010: 6465 6972 7520                           deiru
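The words above stay separated only because every line of 'nani.txt' ends with a space (the 20 byte before each 0a in the hex dump). As a sketch with hypothetical input that has no trailing spaces, deleting newlines glues the words together, while translating each newline to a space keeps them apart:

```shell
# Hypothetical input without trailing spaces, unlike nani.txt.
joined=$(printf 'Omae\nwa\nmou\nshindeiru\n' | tr -d '\n')
echo "$joined"

# Translating each newline to a space keeps the words separated
# (and leaves one trailing space where the final newline was).
spaced=$(printf 'Omae\nwa\nmou\nshindeiru\n' | tr '\n' ' ')
echo "$spaced"
```

The first command prints "Omaewamoushindeiru" and the second prints "Omae wa mou shindeiru" followed by a trailing space.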
Squeezing Repeated Characters Together
The '-s' flag squeezes each run of a repeated character from the set down to a single occurrence. I can correct the repeated quote characters in this JSON document named 'blocks.json':
{
"""inventory""":[
{
"""type""":"""coal""",
"""count""": 43
},
{
"""type""":"""diamond""",
"""count""": 64
},
{
"""type""":"""lapis_lazuli""",
"""count""": 9999999
}
]
}
using this command:
cat blocks.json | tr -s '"'
{
"inventory":[
{
"type":"coal",
"count": 43
},
{
"type":"diamond",
"count": 64
},
{
"type":"lapis_lazuli",
"count": 9999999
}
]
}
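A short sketch of squeezing on made-up input, including one case where it does more than intended: an empty JSON string is also a run of two quote characters, so '-s' would collapse it as well. The sample strings and variable names are hypothetical.

```shell
# Collapse each run of spaces into a single space.
squeezed=$(echo "coal     diamond   lapis_lazuli" | tr -s ' ')
echo "$squeezed"

# Caveat: an intentionally empty string ("") is a run of two quotes,
# so squeezing leaves a single quote and the JSON is no longer valid.
broken=$(printf '{"name":""}\n' | tr -s '"')
echo "$broken"
```

The first command prints "coal diamond lapis_lazuli". The second prints {"name":"}, so this trick is only safe when the document contains no empty strings.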
Caveats Of The 'tr' Command
The 'tr' command has several caveats, such as its use of the less familiar POSIX bracket notation to describe character classes:
man tr
...
[:alnum:]
all letters and digits
[:alpha:]
all letters
[:blank:]
all horizontal whitespace
[:cntrl:]
all control characters
...
and non-portability of character ranges:
info tr
...
Many historically common and even accepted uses of ranges are not
portable. For example, on EBCDIC hosts using the ‘A-Z’ range will
not do what most would expect because ‘A’ through ‘Z’ are not
contiguous as they are in ASCII. If you can rely on a POSIX
compliant version of ‘tr’, then the best way to work around this is
to use character classes (see below). Otherwise, it is most
portable (and most ugly) to enumerate the members of the ranges.
...
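Following the advice in that excerpt, here is a small sketch that uses character classes instead of ranges like 'a-z'. The inputs are reused from earlier examples and the variable names are only illustrative.

```shell
# Upper-case a string with classes rather than the 'a-z' / 'A-Z' ranges.
upper=$(echo "Apple_Banana" | tr '[:lower:]' '[:upper:]')
echo "$upper"

# Classes also work with -d, for example to strip every digit.
no_digits=$(echo "count: 9999999" | tr -d '[:digit:]')
echo "$no_digits"
```

The first command prints "APPLE_BANANA" and the second prints "count: " with the digits removed.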
It also currently (as of 2023) lacks support for multi-byte character replacements:
info tr
...
Currently ‘tr’ fully supports only single-byte characters.
Eventually it will support multibyte characters; when it does, the ‘-C’
option will cause it to complement the set of characters, whereas ‘-c’
will cause it to complement the set of values. This distinction will
matter only when some values are not characters, and this is possible
only in locales using multibyte encodings when the input contains
encoding errors.
...
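To make the single-byte limitation concrete, this sketch counts the bytes in a string containing 'é', which is two bytes (0xC3 0xA9) in UTF-8, and then does the replacement with 'sed' instead. Using 'sed' here is my substitution, not something the 'tr' documentation recommends, and how a given 'tr' build handles the same input may vary by implementation.

```shell
# Build 'é' from its two UTF-8 bytes so this script itself stays plain ASCII.
e_acute=$(printf '\303\251')

# "café" is four characters but five bytes, so a byte-oriented tool
# sees 'é' as two separate values.
byte_count=$(printf 'caf%s' "$e_acute" | wc -c | tr -d ' ')
echo "$byte_count"

# sed substitutes the whole byte sequence as one unit.
replaced=$(printf 'caf%s\n' "$e_acute" | sed "s/$e_acute/e/g")
echo "$replaced"
```

The byte count printed is 5, and the 'sed' command prints "cafe".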
And that's why the 'tr' command is my favourite Linux command.