
grep command

Started by August 27, 2008 04:39 AM
5 comments, last by Bregma 16 years, 2 months ago
Why doesn't # grep "[a-z]\{2\}" text give me all the instances of a double lowercase letter in text? What's the correct way? Thanks.
"We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the internet, we know this is not true." -- Professor Robert Silensky
That regex will match any occurrence of two consecutive lower case letters.

I think what you want is this BRE.
  grep '\([a-z]\)\1' text

That uses a backreference (\1) to match a second occurrence of whatever the parenthesized subexpression matched.
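A minimal sketch of the difference between the two patterns. The three sample words are made up for illustration, and LC_ALL=C is set only so that [a-z] means plain ASCII lowercase:

```shell
# Hypothetical sample input: one word per line.
sample='book
cat
aa'

# Any two consecutive lowercase letters: every line qualifies.
any_two=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '[a-z]\{2\}')

# The same lowercase letter twice in a row (backreference).
doubled=$(printf '%s\n' "$sample" | LC_ALL=C grep '\([a-z]\)\1')

echo "lines with two consecutive letters: $any_two"
echo "lines with a doubled letter:"
echo "$doubled"
```

Only "book" and "aa" contain a doubled letter, while all three lines contain two consecutive lowercase letters.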

Stephen M. Webb
Professional Free Software Developer

Don't you want to use egrep or grep -E? By default, my grep doesn't understand regular expressions.
Widelands - laid back, free software strategy
Quote: Original post by Prefect
Don't you want to use egrep or grep -E? By default, my grep doesn't understand regular expressions.


Wow, that's weird. grep takes its name from the ed command g/re/p: globally search for a regular expression and print the matching lines. A grep that does not understand regular expressions would be most unusual.

Perhaps you are just confused by the plethora of regular expression dialects. By default, grep uses the basic regular expression dialect, or BRE. With the '-E' flag (or with egrep), the dialect is the extended regular expression, or ERE. Both of these dialects are defined in IEEE Std 1003.1-2001. I am unaware of an implementation of grep that does not conform to this POSIX standard.
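As a rough illustration of how the two dialects differ, the sketch below runs the same pattern with and without the -E option (which selects EREs). The two-line input is invented for the example:

```shell
# Hypothetical input: one line of letters, one line with a literal plus sign.
text='aa
a+'

# BRE (default): an unescaped + is an ordinary character,
# so only the line containing the literal "a+" matches.
bre=$(printf '%s\n' "$text" | grep 'a+')

# ERE (-E): + means "one or more of the preceding item",
# so both lines match.
ere=$(printf '%s\n' "$text" | grep -E 'a+')

echo "BRE matched:"; echo "$bre"
echo "ERE matched:"; echo "$ere"
```

A pattern that seems to be "not understood" by plain grep is often just an ERE operator being read as a literal character in BRE mode.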

Some implementations of grep also support (via the -P switch) Perl-compatible regular expressions, a dialect similar to the ECMAScript one defined in ECMA-262. This dialect is roughly an extension of EREs with all kinds of bonus extensions like lookbehind operators and shortcut character classes that duplicate the existing character classes.
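A small sketch of a lookbehind, assuming a grep built with Perl-regex support (GNU grep usually is; many others are not). Both -P and -o are extensions rather than POSIX options, so the example probes for support first, and the "price: 42" input is invented:

```shell
# Probe whether this grep accepts -P at all (an assumption, not a given).
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  # (?<=price: ) is a lookbehind; \d is a shortcut for a digit.
  # -o prints only the matched part of the line.
  lookbehind=$(printf 'price: 42\n' | grep -oP '(?<=price: )\d+')
else
  lookbehind='unsupported'
fi

echo "$lookbehind"
```

Where -P is available this prints only the digits that follow "price: "; elsewhere it reports that the option is unsupported.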

Where did you get a version of grep that does not understand regular expressions?

Stephen M. Webb
Professional Free Software Developer

Ah, that makes sense, thanks for the explanation. I just remembered that grep didn't understand some regular expression that I wanted to use and egrep fixed it.

Is there a way to match "foo OR bar" with basic regular expressions? For some reason, every variant I tried failed on me. Of course, it might have been something stupid like incorrect escaping.
Widelands - laid back, free software strategy
As in the following?
$ echo 'foobarapplefoobar' | grep 'foo\|bar'
foobarapplefoobar
Quote: Original post by Prefect
Is there a way to match "foo OR bar" with basic regular expressions?

The BRE grammar as defined in the POSIX standard does not support the alternation operator. There is no way to match "foo OR bar" using standard basic regular expressions.

GNU grep supports BRE alternation as an extension. Relying on that behaviour is not portable (which may not matter to you). GNU grep provides a host of extensions that are non-portable. Consult your local man pages.
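For a portable "foo OR bar", two options that stay within POSIX grep are sketched below: switch to an ERE with -E, or pass several patterns with repeated -e options (a list of patterns rather than a single BRE with alternation). The three-line input is invented for the example:

```shell
# Hypothetical input: one word per line.
input='foo
bar
baz'

# ERE alternation with the vertical bar.
with_ere=$(printf '%s\n' "$input" | grep -E 'foo|bar')

# Two separate BRE patterns; a line is selected if either one matches.
with_multi_e=$(printf '%s\n' "$input" | grep -e 'foo' -e 'bar')

echo "$with_ere"
echo "$with_multi_e"
```

Both commands select the "foo" and "bar" lines and skip "baz".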

The regular expression library in C++ TR1 and the upcoming C++09 standard does not support alternation in basic mode. It uses the vertical bar to indicate alternation in every other mode except grep mode, in which newline is the alternation operator.

Stephen M. Webb
Professional Free Software Developer

This topic is closed to new replies.
