5.3 Using Standard Regular Expressions
Options and Operations
You can check the "Use regular expression" option in Find/Replace or Batch Replace to enable regular expressions.
When this option is checked, the options "Match whole word" and "Use special characters" are hidden, but the "Match case" option can still be used.
The "Match whole word" option is hidden because regular expression syntax offers finer-grained alternatives. You can add \b (word boundary) at both sides of an expression for the same result, so the regular expression \bword\b behaves like word with "Match whole word" on. You can also add \b to only one end of an expression, so \bword matches both word and the beginning of words. The related capital form \B means a non-word boundary, so the regular expression word\B matches inside words, but does not match word on its own.
The "Use special characters" option is hidden because it is a feature specific to STGuru. Everything it offers is covered by regular expression syntax.
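The boundary behavior described above can be tried outside STGuru. The following is a minimal sketch using Python's re module as a stand-in engine (an assumption; STGuru's own engine may differ in minor details). The sample text is made up for illustration.

```python
import re

# Made-up sample text: "word" at 0, "words" at 5, "sword" at 11.
text = "word words sword"

# \bword\b: "word" only as a whole word.
whole = [m.start() for m in re.finditer(r"\bword\b", text)]

# \bword: "word" at the start of a word (found in "word" and "words").
prefix = [m.start() for m in re.finditer(r"\bword", text)]

# word\B: "word" followed by another word character (found in "words" only).
inner = [m.start() for m in re.finditer(r"word\B", text)]

print(whole, prefix, inner)  # [0] [0, 5] [5]
```

Note that "sword" is never matched by \bword, because there is no word boundary between "s" and "w".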
Regular Expression Basics
Regular expressions are a specialized but powerful technique. If you have not used them before, expect to spend half a day to two full days learning the basics. The subject takes a book to cover fully, so this page does not give detailed instructions. A list of selected online regular expression tutorials appears later on this page for interested readers.
Ex 1 This matches every line that contains a label in a format like "[StudentA02]". Specifically, the label is enclosed in "[" and "]". The part inside the brackets starts with the capitalized word Student, followed by one capital letter and a two-digit number (01-99), such as:
学生 [StudentA02] 上午上数学课。
学生 [StudentC93] 早上打扫卫生。
If we assume that rare exceptions such as [StudentN00] do not occur, we can simply search with the following expression:
^.*?\[Student[A-Z][0-9][0-9]\].*?\n
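As a quick check of this expression, here is a minimal sketch in Python (an assumed stand-in for STGuru's engine, with re.MULTILINE mirroring the fixed Multiline setting). The two Chinese lines are the samples above; the two English lines are made-up non-matching cases.

```python
import re

text = (
    "学生 [StudentA02] 上午上数学课。\n"
    "This line has no label.\n"              # made-up: no label at all
    "学生 [StudentC93] 早上打扫卫生。\n"
    "Lowercase [Studenta02] is rejected.\n"  # made-up: letter is not capital
)

# Each match is a whole line, including its trailing newline.
pattern = re.compile(r"^.*?\[Student[A-Z][0-9][0-9]\].*?\n", re.MULTILINE)
matched_lines = pattern.findall(text)

for line in matched_lines:
    print(line, end="")
```

Only the two labelled sample lines are returned. The match is case-sensitive here, which corresponds to having "Match case" checked.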
Ex 2 a This matches all lines NOT containing "student":
(?!.*student)^.*$
The syntax involved is "(?!exp)", a negative lookahead that matches a position not followed by exp. This usage is described in The 30 Minute Regex Tutorial listed below. It can be seen as a simplified version of:
b This matches all lines containing "teacher", but NOT containing "student":
(?!.*student)^.*?teacher.*?$
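The two expressions in Ex 2 can be compared side by side. This is a minimal sketch in Python, used here as an assumed stand-in for STGuru's engine; the sample lines are invented for illustration.

```python
import re

# Invented sample text, four lines, no trailing newline.
text = (
    "the teacher spoke\n"
    "the student listened\n"
    "teacher and student talked\n"
    "nobody else came"
)

# a) Lines NOT containing "student".
no_student = re.compile(r"(?!.*student)^.*$", re.MULTILINE)

# b) Lines containing "teacher" but NOT "student".
teacher_only = re.compile(r"(?!.*student)^.*?teacher.*?$", re.MULTILINE)

lines_a = no_student.findall(text)
lines_b = teacher_only.findall(text)

print(lines_a)  # ['the teacher spoke', 'nobody else came']
print(lines_b)  # ['the teacher spoke']
```

The lookahead is checked at the start of each line and scans only to the end of that line, because "." does not cross a line break when Singleline mode is off.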
Ex 3 a This matches the line “Start Line” and all lines before it:
\A(.*?\n){2,}Start Line\n
b This matches the line "End Line" and all lines after it:
^End Line(.*?\n){2,}.*?\Z
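The sketch below runs both Ex 3 expressions in Python, again as an assumed stand-in for STGuru's engine. One known difference: in Python, \Z matches only at the very end of the text, while some engines also let it match just before a final line break. The sample text is invented.

```python
import re

# Invented sample text with two lines before "Start Line"
# and two lines after "End Line".
text = (
    "intro 1\n"
    "intro 2\n"
    "Start Line\n"
    "body\n"
    "End Line\n"
    "footer 1\n"
    "footer 2\n"
)

# a) "Start Line" and everything before it.
head = re.search(r"\A(.*?\n){2,}Start Line\n", text, re.MULTILINE)

# b) "End Line" and everything after it.
tail = re.search(r"^End Line(.*?\n){2,}.*?\Z", text, re.MULTILINE)

head_text = head.group(0)
tail_text = tail.group(0)

print(repr(head_text))  # 'intro 1\nintro 2\nStart Line\n'
print(repr(tail_text))  # 'End Line\nfooter 1\nfooter 2\n'
```

As written, expression a needs at least two lines before "Start Line" because of the {2,} quantifier; with fewer lines it finds nothing.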
Ex 4 a This matches a line containing a string that starts with the word “and”, ends with the word “whose” and has exactly 2 words in between (a word is a string made up of alphanumeric characters):
^.*?\band \w+ \w+ whose\b.*?\n
b This matches a line containing a string that starts with the word “and”, ends with the word “whose” and has 0 to 5 words in between (a word is a string made up of alphanumeric characters). With a word count of 0, the line simply contains “and whose”:
^.*?\band (\w+ ){0,5}whose\b.*?\n
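To see how the word-count quantifier behaves, here is a minimal Python sketch (an assumed stand-in for STGuru's engine) that runs both Ex 4 expressions over a few invented lines.

```python
import re

two_words = re.compile(r"^.*?\band \w+ \w+ whose\b.*?\n", re.MULTILINE)
up_to_five = re.compile(r"^.*?\band (\w+ ){0,5}whose\b.*?\n", re.MULTILINE)

# Invented sample lines.
text = (
    "a man and his dog whose name is Rex\n"    # 2 words in between
    "a man and whose dog\n"                    # 0 words in between
    "a band one two whose\n"                   # "and" is inside "band"
    "and one two three four five six whose\n"  # 6 words in between
)

# finditer + group(0) returns whole lines even when the pattern has a group.
hits_two = [m.group(0) for m in two_words.finditer(text)]
hits_five = [m.group(0) for m in up_to_five.finditer(text)]

print(hits_two)   # first line only
print(hits_five)  # first and second lines
```

The third line is rejected by \b, which requires "and" to start at a word boundary, and the fourth is rejected because six words exceed the {0,5} limit.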
Recommended Online Regular Expression Tutorials
Do not be misled by the words "30 minute" in the following titles. You usually need half a day to two days to gain a rough understanding of regular expressions. Fully mastering them takes much longer, but may not really be necessary. You can instead focus on a few of the most useful features and use them to speed up your work; getting started this way need not take long.
The 30 Minute Regex Tutorial (English) An English tutorial. 30 minutes is obviously NOT enough. You need half a day or even two days to grasp the basics.
Original URL: http://www.codeproject.com/KB/dotnet/regextutorial.aspx
Introduction to Regular Expressions (English) A tutorial by Microsoft.
Original URL: http://msdn.microsoft.com/en-us/library/28hw3sce
The most important part (regular expression syntax): http://msdn.microsoft.com/en-us/library/ae5bf541.aspx
The 30 Minute Regex Tutorial (Chinese) A Chinese version clearly based on the English version above.
Original URL: http://deerchao.net/tutorials/regex/regex.htm
Settings of the Regular Expression Engine Used by STGuru
Match Case mode: You can specify this option in the dialog box.
Multiline mode: Fixed as True.
Singleline mode: Fixed as False.
Regular expression engines differ. They agree on the standard, major features but can differ slightly in minor details. The engine used by STGuru may also differ in a few minor details from those used in the tutorials, so test your expressions yourself to find the differences.
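The effect of these settings can be illustrated in Python. The mapping used here is an assumption, not a statement about STGuru's engine: Multiline is taken to correspond to re.MULTILINE, Singleline to re.DOTALL (left off), and an unchecked "Match case" to re.IGNORECASE.

```python
import re

text = "first line\nsecond line"

# Multiline = True: ^ and $ match at the start and end of every line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
line_ends = re.findall(r"\w+$", text, re.MULTILINE)

# Singleline = False: "." does not match a line break,
# so ".*" stops at the end of the first line.
first_only = re.search(r".*", text, re.MULTILINE).group(0)

# "Match case" unchecked: letter case is ignored.
ignore_case = re.findall(r"SECOND", text, re.MULTILINE | re.IGNORECASE)

print(line_starts)  # ['first', 'second']
print(line_ends)    # ['line', 'line']
print(first_only)   # first line
print(ignore_case)  # ['second']
```

These fixed settings are why the examples on this page use ^ and $ per line and write \n explicitly whenever a match has to cross a line break.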
When you select the Enable Replace check box at the bottom left of the Find/Replace dialog box, the professional-level editing function "Batch Replace" is enabled.
Click the "Batch Replace" button to open the "Batch Replace" dialog box:
Pic UG-5-2 The main Batch Replace dialog box
With one click, you can run a series of replace operations for an unlimited number of find/replace pairs in a predefined order. You can set four independent options for each pair: Apply, Match Whole Word, Match Case, and Use Special Characters. You can save each batch replace configuration to a batch file for long-term use.
This is not only a great tool for text editing, but also a great help for code conversion between Simplified Chinese and Traditional Chinese.
Three page-cleaning macros are included in the installation package. They normalize and reorganize punctuation marks, blank spaces, and paragraph-level page layout, and can also serve as samples to edit when creating your own batch replace macros.
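The idea of an ordered list of find/replace pairs with per-pair options can be sketched as follows. This is a hypothetical illustration in Python, not STGuru's implementation or file format: the names ReplacePair and batch_replace are invented, and the "Use Special Characters" option is left out because its behavior is not described here.

```python
import re
from dataclasses import dataclass


@dataclass
class ReplacePair:
    """One find/replace pair with its own options (hypothetical model)."""
    find: str
    replace: str
    apply: bool = True
    match_whole_word: bool = False
    match_case: bool = True


def batch_replace(text: str, pairs: list[ReplacePair]) -> str:
    """Apply each enabled pair in list order; later pairs see earlier results."""
    for pair in pairs:
        if not pair.apply:
            continue  # pair is switched off
        pattern = re.escape(pair.find)  # treat the find text literally
        if pair.match_whole_word:
            pattern = rf"\b{pattern}\b"
        flags = 0 if pair.match_case else re.IGNORECASE
        # A function replacement keeps the replace text literal.
        text = re.sub(pattern, lambda m, r=pair.replace: r, text, flags=flags)
    return text


pairs = [
    ReplacePair("colour", "color", match_case=False),
    ReplacePair("cat", "dog", match_whole_word=True),
    ReplacePair("dog", "wolf", apply=False),  # disabled, so it is skipped
]
result = batch_replace("Colour the cat, not the catalog.", pairs)
print(result)  # color the dog, not the catalog.
```

Because the pairs run in order, enabling the third pair would turn the "dog" produced by the second pair into "wolf", which is why the predefined order matters.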