Regular Expressions


alternation using pipe character
Backreferences to find the same text again
search and replace for chrome bookmarks clean up
class names in java and c#
lookahead and lookbehind
Negation and character classes

links:
regex cheatsheet     (local .pdf)
regular-expressions.info tutorials


Regular-Expressions.info -  great detailed tutorial   
use \Q ... \E to treat everything between them, including regex metacharacters, as literal text - a quantifier such as + after the \E repeats only the last character
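
Python's re module does not support \Q ... \E, so the sketch below swaps in re.escape() as the nearest equivalent. It only illustrates the quantifier note above; the sample strings are made up.

```python
import re

# Python's re module has no \Q...\E, so re.escape() is used here as a stand-in:
# it backslash-escapes the metacharacters in the literal.
literal = "a.b"                      # hypothetical literal containing a metacharacter
pattern = re.escape(literal) + "+"   # the + binds only to the final "b"

match_repeated_last = re.fullmatch(pattern, "a.bbb")   # only "b" repeated: matches
match_repeated_all = re.fullmatch(pattern, "a.ba.b")   # whole literal repeated: no match
match_dot_literal = re.fullmatch(pattern, "axb")       # "." is literal now: no match

print(bool(match_repeated_last), bool(match_repeated_all), bool(match_dot_literal))
```

To repeat the whole literal, it would have to be wrapped in a group before the quantifier is added.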

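Visual Studio regex table - use {} rather than Editpad's () to tag for replacement with \1 \2 etc (this applies to Visual Studio 2010 and earlier; later versions use .NET regex syntax, with () to tag and $1 $2 in the replacement)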

Microsoft reference for regex in .NET

short though incomplete summary

 

\b word boundary
\d digit
\w word character [A-Za-z0-9_] and potentially more depending on regex flavor
\s whitespace character [ \t\r\n\f] and potentially more depending on regex flavor - note that \f is a form feed
\D not a digit = [^\d]
\W not a word character
\S not a whitespace character
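
a quick sketch of the shorthand classes above, using Python's re module; the sample text is made up and other flavors may treat Unicode characters differently:

```python
import re

sample = "id_42:\tok\n"   # made-up sample text

digits = re.findall(r"\d", sample)        # each digit character
words = re.findall(r"\w+", sample)        # runs of word characters (letters, digits, _)
spaces = re.findall(r"\s", sample)        # each whitespace character
non_digits_removed = re.sub(r"\D", "", sample)   # strip everything that is not a digit

# \b only matches between a word character and a non-word character
whole_words = re.findall(r"\bok\b", "ok okay look ok")

print(digits, words, spaces, non_digits_removed, whole_words)
```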


? optional (zero or one of the preceding item)

the quantifiers ? (optional), + (one or more), * (zero or more) and {n,m} (repetition using curly braces) are greedy; to make one lazy, append a question mark: ??, +?, *?, {n,m}?
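
the difference between greedy and lazy matching is easiest to see side by side; a minimal sketch in Python, with made-up sample strings:

```python
import re

html = "<b>one</b> and <b>two</b>"   # made-up sample

greedy = re.findall(r"<b>.*</b>", html)    # .* takes as much as possible
lazy = re.findall(r"<b>.*?</b>", html)     # .*? stops at the first </b>

# The same applies to curly-brace repetition
greedy_braces = re.match(r"\d{2,4}", "123456").group()
lazy_braces = re.match(r"\d{2,4}?", "123456").group()

print(greedy, lazy, greedy_braces, lazy_braces)
```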


Backreferences to find the same text again

regular-expressions.info page
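
a small sketch of a backreference finding the same text again, here a doubled word; the sample sentence and the pattern are illustrative only, not taken from the linked page:

```python
import re

text = "this is is a test of the the pattern"   # made-up sample

# (\w+) captures a word; \1 must then match exactly the same text again
doubled = re.findall(r"\b(\w+)\s+\1\b", text)

# The backreference can also be used in the replacement string
fixed = re.sub(r"\b(\w+)\s+\1\b", r"\1", text)

print(doubled)
print(fixed)
```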


class names in java and c#

this stackexchange post claims c# uses the same syntax as java and points to this post about regular expressions searching for java class names


 

lookahead and lookbehind

detailed info

 

                   ahead          behind
positive           (?=regex)      (?<=regex)
negative           (?!regex)      (?<!regex)
positive capture   (?=(regex))    (?<=(regex))
negative capture   (?!(regex))    (?<!(regex))

negative lookahead example:

error(?!( - process|:.{0,1}\r\n|:\x0c))

matches error when it is not immediately followed by any of these:
 - process
:[at most one character, e.g. an optional space]\r\n
:[form feed]
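
the error(?!...) pattern above can be checked against a few lines; a sketch in Python, where the sample log lines are invented for illustration:

```python
import re

# "error" not followed by " - process", by ":" plus at most one
# character and CRLF, or by ":" plus a form feed (\x0c)
pattern = re.compile(r"error(?!( - process|:.{0,1}\r\n|:\x0c))")

# Made-up sample lines mapped to whether the pattern should match
samples = {
    "error - process stopped": False,
    "error: \r\n": False,
    "error:\r\n": False,
    "error:\x0c": False,
    "error: disk full\r\n": True,
    "error in module": True,
}

results = {line: bool(pattern.search(line)) for line in samples}
print(results == samples)
```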

 

e.g. to find newlines that don't have a preceding carriage return use negative lookbehind  (?<!\r)\n
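
a sketch of that lookbehind in Python, locating bare newlines and normalising them to CRLF; the sample text is made up:

```python
import re

mixed = "one\r\ntwo\nthree\r\nfour\n"   # made-up text with mixed line endings

# Positions of newlines that have no carriage return in front of them
bare_lf_positions = [m.start() for m in re.finditer(r"(?<!\r)\n", mixed)]

# Replace only the bare newlines, leaving existing \r\n pairs alone
normalised = re.sub(r"(?<!\r)\n", "\r\n", mixed)

print(bare_lf_positions)
print(repr(normalised))
```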


Negation and Character Classes

regular-expressions.info page

[] for a character class
[^] for a negated character class

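^(.*?\,){4}    matches the first four comma-terminated fields from the start of a .csv line; the capture group ends up holding the 4th field with its comma (the \ before the comma is not needed)

(([^,\r\n]+),){3}  matches three consecutive non-empty, comma-terminated fields anywhere on a line; a comma in first position cannot start the match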
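
a sketch of how the two csv patterns above behave in Python; the sample lines are made up, and note that a repeated group keeps only its last iteration:

```python
import re

line = "a,b,c,d,e"   # made-up csv line

# Consumes the first four comma-terminated fields;
# group 1 is the last iteration, i.e. the 4th field plus its comma
first_four = re.match(r"^(.*?,){4}", line)

# Needs three consecutive non-empty fields, each followed by a comma
three_fields = re.search(r"(([^,\r\n]+),){3}", line)

# Empty fields break the run, so this made-up line does not match
with_empty_fields = re.search(r"(([^,\r\n]+),){3}", ",x,,y,z")

print(first_four.group(0), first_four.group(1))
print(three_fields.group(0), three_fields.group(2))
print(with_empty_fields)
```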

 


alternation

detailed info

e.g. ^(?!.*(var|if).*\r\n) matches at the start of lines that don't have var or if anywhere in them (as substrings, so notify counts as having if)
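
a sketch of that pattern used as a line filter in Python; the source text is made up and uses \r\n line endings because the lookahead expects them:

```python
import re

pattern = re.compile(r"^(?!.*(var|if).*\r\n)")

# Made-up source text with CRLF line endings
source = "var x = 1;\r\nreturn x;\r\nif (x) {\r\nnotify();\r\n}\r\n"

# keepends=True keeps the \r\n that the lookahead needs to see
lines = source.splitlines(keepends=True)
kept = [line.rstrip("\r\n") for line in lines if pattern.match(line)]

print(kept)
```

notify(); is dropped because if matches inside it; \b(var|if)\b in place of (var|if) would restrict the test to whole words.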

\b(cat|dog)\b  finds [word boundary] [then either cat or dog] [then word boundary]
\bcat|dog\b  finds either [word boundary then cat] or [dog then word boundary], because alternation has the lowest precedence
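
the effect of the grouping shows up on words that merely start with cat or end with dog; a sketch in Python with a made-up sample:

```python
import re

text = "cat catalog bulldog dog"   # made-up sample

# Grouped: both word boundaries apply to either alternative
grouped = re.findall(r"\b(cat|dog)\b", text)

# Ungrouped: \b binds to cat on the left and to dog on the right only
ungrouped = re.findall(r"\bcat|dog\b", text)

print(grouped)
print(ungrouped)
```

the ungrouped version also picks up cat from catalog and dog from bulldog.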


search and replace to clean up chrome bookmarks exports

see this page

 




last updated:    Thu 2023-03-09 5:43 AM