core.texttools.regext #
escape_regex_chars
Escapes special regex metacharacters in a string to make it safe for use in regex patterns.
import incubaid.herolib.core.texttools.regext
escaped := regext.escape_regex_chars('file.txt')
// Result: "file\.txt"
// Use in regex patterns:
safe_search := regext.escape_regex_chars('[test]')
// Result: "\[test\]"
Special characters escaped: . ^ $ * + ? { } [ ] \ | ( )
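The escaping rule above can be sketched in plain V without any regex library. This is a hypothetical re-implementation for illustration only (the name escape_sketch is an assumption, not part of herolib); it simply prefixes each character from the list above with a backslash and assumes ASCII input.

```v
// Hypothetical sketch of the escaping rule; not the herolib source.
fn escape_sketch(s string) string {
	// The metacharacters listed in this document, as a raw string.
	special := r'.^$*+?{}[]\|()'
	mut out := []string{}
	for ch in s {
		c := ch.ascii_str()
		if special.contains(c) {
			// Prefix every metacharacter with a single backslash.
			out << '\\'
		}
		out << c
	}
	return out.join('')
}

fn main() {
	println(escape_sketch('file.txt')) // file\.txt
	println(escape_sketch('[test]')) // \[test\]
}
```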
wildcard_to_regex
Converts simple wildcard patterns to regex patterns for flexible file matching.
Conversion rules:
- * becomes .* (matches any sequence of characters)
- ? becomes . (matches any single character)
- Special regex characters are escaped (. + ( ) [ ] { } ^ $ \ |)
- Patterns without wildcards return the literal pattern (no implicit .* wrapping)
Note: This function only converts wildcards to regex. It does NOT add implicit ^ and $ anchors or .* wrappers. The caller is responsible for determining how the resulting pattern should be matched (e.g., substring vs exact match). When used with the Matcher, patterns without wildcards are treated as exact matches.
import incubaid.herolib.core.texttools.regext
// Match files ending with .txt
pattern1 := regext.wildcard_to_regex('*.txt')
// Result: ".*\.txt"
// Match anything starting with test
pattern2 := regext.wildcard_to_regex('test*')
// Result: "test.*"
// Literal pattern (no wildcards) - returns as-is with escaped special chars
pattern3 := regext.wildcard_to_regex('config')
// Result: "config"
// Complex pattern with special chars
pattern4 := regext.wildcard_to_regex('src/*.v')
// Result: "src/.*\.v"
// Multiple wildcards
pattern5 := regext.wildcard_to_regex('*test*file*')
// Result: ".*test.*file.*"
// For substring matching, use explicit wildcards:
pattern6 := regext.wildcard_to_regex('*config*')
// Result: ".*config.*"
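For readers who want to see the conversion rules spelled out, here is a minimal sketch in plain V. It is an illustrative re-implementation under the rules listed above, not the herolib source; the name wildcard_sketch is hypothetical and ASCII input is assumed.

```v
// Hypothetical sketch of the wildcard conversion rules; not the herolib source.
fn wildcard_sketch(pattern string) string {
	// Characters that are escaped, per the conversion rules.
	special := r'.+()[]{}^$\|'
	mut out := []string{}
	for ch in pattern {
		c := ch.ascii_str()
		if c == '*' {
			out << '.*' // any sequence of characters
		} else if c == '?' {
			out << '.' // any single character
		} else if special.contains(c) {
			out << '\\' // escape the metacharacter
			out << c
		} else {
			out << c
		}
	}
	return out.join('')
}

fn main() {
	println(wildcard_sketch('*.txt')) // .*\.txt
	println(wildcard_sketch('src/*.v')) // src/.*\.v
	println(wildcard_sketch('config')) // config
}
```

Like the documented function, the sketch adds no anchors, so whether the result is used for a substring or an exact match is left to the caller.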
Regex Group Finders
find_sid
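Extracts sid values from a given text. A sid is identified by the pattern sid:XXXXX, where the value consists of 3 to 5 alphanumeric characters (see fn find_sid). In the example, sid:rrrrrr (six characters), sid:s d, and sid:s_d are therefore not returned.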
import incubaid.herolib.core.texttools.regext
// V uses single quotes for multi-line strings; backticks denote rune literals
text := '
!!action.something sid:aa733
sid:aa733
...sid:aa733 ss
...sid:rrrrrr ss
sid:997
sid:s d
sid:s_d
'
r := regext.find_sid(text)
// Result: ['aa733', 'aa733', 'aa733', '997']
find_simple_vars
Extracts simple variable names enclosed in curly braces, e.g., {var_name}, from a given text. Variable names can contain lowercase letters (a-z), digits, and underscores.
import incubaid.herolib.core.texttools.regext
// V uses single quotes for multi-line strings; backticks denote rune literals
text := '
!!action.something {sid}
sid:aa733
{a}
...sid:rrrrrr ss {a_sdsdsdsd_e__f_g}
sid:997
sid:s d
sid:s_d
'
r := regext.find_simple_vars(text)
// Result: ['sid', 'a', 'a_sdsdsdsd_e__f_g']
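The matching rule for {NAME} can be illustrated with a small hand-written scanner. This is only a sketch under the character set stated in the API notes (a-z, 0-9, underscore); the function names are hypothetical, and details of the real function such as how it treats nested braces are not covered by this document.

```v
// Hypothetical helper: the characters allowed inside {NAME}.
fn is_name_char(c u8) bool {
	return (c >= `a` && c <= `z`) || (c >= `0` && c <= `9`) || c == `_`
}

// Hypothetical sketch: collect every {NAME} whose NAME is non-empty and
// made only of allowed characters. Not the herolib source.
fn simple_vars_sketch(txt string) []string {
	mut found := []string{}
	mut i := 0
	for i < txt.len {
		if txt[i] == `{` {
			mut j := i + 1
			for j < txt.len && is_name_char(txt[j]) {
				j++
			}
			// Accept only a non-empty name that is closed by '}'.
			if j > i + 1 && j < txt.len && txt[j] == `}` {
				found << txt[i + 1..j]
				i = j + 1
				continue
			}
		}
		i++
	}
	return found
}

fn main() {
	text := '!!action.something {sid}\n{a}\n...sid:rrrrrr ss {a_sdsdsdsd_e__f_g}'
	println(simple_vars_sketch(text)) // ['sid', 'a', 'a_sdsdsdsd_e__f_g']
}
```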
regex replacer
Tool to flexibly replace elements in file(s) or text.
import incubaid.herolib.core.texttools.regext
text := '
this is test_1 SomeTest
this is Test 1 SomeTest
need to replace TF to ThreeFold
need to replace ThreeFold0 to ThreeFold
need to replace ThreeFold1 to ThreeFold
'
text_out := '
this is TTT SomeTest
this is TTT SomeTest
need to replace ThreeFold to ThreeFold
need to replace ThreeFold to ThreeFold
need to replace ThreeFold to ThreeFold
'
mut ri := regext.regex_instructions_new()
ri.add(['TF:ThreeFold0:ThreeFold1:ThreeFold']) or { panic(err) }
ri.add_item('test_1', 'TTT') or { panic(err) }
ri.add_item('^Stest 1', 'TTT') or { panic(err) } // ^S prefix: case-insensitive search
text_out2 := ri.replace(text: text, dedent: true) or { panic(err) }
// pub struct ReplaceDirArgs {
// pub mut:
// 	path       string
// 	extensions []string
// 	dryrun     bool
// }
// If dryrun is true, nothing is replaced; the changes are only shown.
ri.replace_in_dir(path: '/tmp/mypath', extensions: ['md'])!
fn escape_regex_chars #
fn escape_regex_chars(s string) string
escape_regex_chars escapes special regex metacharacters in a string. This makes a literal string safe to use in regex patterns. Examples: "file.txt" -> "file\.txt", "a[123]" -> "a\[123\]"
fn find_sid #
fn find_sid(txt string) []string
Finds parts of the text in the form sid:abc up to sid:abcde, that is, values of 3 to 5 characters from a...z and 0...9. Returns a list of the found elements. To normalize them afterwards, e.g. to lowercase, use words = words.map(it.to_lower()).
fn find_simple_vars #
fn find_simple_vars(txt string) []string
Finds parts of the text in the form {NAME}, where NAME consists of lowercase letters (a-z), digits (0-9), and underscores (_). Returns a list of the NAMEs found.
fn new #
fn new(args_ MatcherArgs) !Matcher
Create a new matcher from arguments
Parameters:
- regex: Include if matches regex pattern (e.g., r'.*\.v$')
- regex_ignore: Exclude if matches regex pattern
- filter: Include if matches wildcard pattern (e.g., r'*.txt', r'test*', r'config')
- filter_ignore: Exclude if matches wildcard pattern
Logic:
- If both regex and filter patterns are provided, BOTH must match (AND logic)
- If only regex patterns are provided, any regex pattern can match (OR logic)
- If only filter patterns are provided, any filter pattern can match (OR logic)
- Exclude patterns take precedence over include patterns
Examples:
m := regext.new(regex: [r'.*\.v$'])!
m := regext.new(filter: ['*.txt'], filter_ignore: ['*.bak'])!
m := regext.new(regex: [r'.*test.*'], regex_ignore: [r'.*_test\.v$'])!
fn regex_instructions_new #
fn regex_instructions_new() ReplaceInstructions
fn regex_rewrite #
fn regex_rewrite(r string) !string
Rewrites a filter string to a regex. Each character is matched in both lower case and upper case. Only ASCII is considered. The characters '_', '-', and ' ' are replaced by a pattern that matches one or more spaces. The returned result is a regex string.
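As a rough illustration of the rewrite described above, the sketch below turns each ASCII letter into a two-character class and each of '_', '-', ' ' into a "one or more spaces" pattern. The exact regex text that regex_rewrite emits is not shown in this document, so the output format here ([tT] classes and ' +') is an assumption, as is the name rewrite_sketch.

```v
// Hypothetical sketch of a case-insensitive rewrite; the output format is
// an assumption and may differ from what regex_rewrite really returns.
fn rewrite_sketch(filter string) string {
	mut out := []string{}
	for ch in filter.to_lower() {
		c := ch.ascii_str()
		if ch >= `a` && ch <= `z` {
			// Match the letter in lower and upper case.
			bracket := '[' + c + c.to_upper() + ']'
			out << bracket
		} else if '_- '.contains(c) {
			// Separator characters match one or more spaces.
			out << ' +'
		} else {
			out << c
		}
	}
	return out.join('')
}

fn main() {
	println(rewrite_sketch('test 1')) // [tT][eE][sS][tT] +1
}
```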
struct Matcher #
struct Matcher {
mut:
regex_include []regex.RE
filter_include []regex.RE
regex_exclude []regex.RE
}
Matcher matches strings against include/exclude regex patterns
fn (Matcher) match #
fn (m Matcher) match(text string) bool
match checks if a string matches the include patterns and not the exclude patterns
Logic:
- If both regex and filter patterns exist, string must match BOTH (AND logic)
- If only regex patterns exist, string must match at least one (OR logic)
- If only filter patterns exist, string must match at least one (OR logic)
- Then check if string matches any exclude pattern; if yes, return false
- Otherwise return true
Examples:
m := regext.new(regex: [r'.*\.v$'])!
result := m.match('file.v') // true
result := m.match('file.txt') // false
m2 := regext.new(filter: ['*.txt'], filter_ignore: ['*.bak'])!
result := m2.match('readme.txt') // true
result := m2.match('backup.bak') // false
m3 := regext.new(filter: ['src*'], regex: [r'.*\.v$'])!
result := m3.match('src/main.v') // true (matches both)
result := m3.match('src/config.txt') // false (doesn't match regex)
result := m3.match('main.v') // false (doesn't match filter)
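The include/exclude decision described in the logic list can be modelled without any regex at all. In the sketch below, booleans stand in for "did any pattern of this kind match?". The struct and function names are hypothetical, and treating a matcher with no include patterns as "include everything" is an assumption, since this document does not state that case.

```v
// Hypothetical model of one match decision; not the herolib source.
struct Hits {
	has_regex   bool // at least one regex include pattern is configured
	regex_hit   bool // at least one regex include pattern matched
	has_filter  bool // at least one filter include pattern is configured
	filter_hit  bool // at least one filter include pattern matched
	exclude_hit bool // at least one exclude pattern matched
}

fn decide(h Hits) bool {
	// Assumption: with no include patterns at all, everything is included.
	mut included := true
	if h.has_regex && h.has_filter {
		included = h.regex_hit && h.filter_hit // AND logic
	} else if h.has_regex {
		included = h.regex_hit // OR over the regex patterns
	} else if h.has_filter {
		included = h.filter_hit // OR over the filter patterns
	}
	// Exclude patterns take precedence over include patterns.
	return included && !h.exclude_hit
}

fn main() {
	// 'src/main.v' with filter 'src*' and a regex for .v files: both hit
	println(decide(Hits{ has_regex: true, regex_hit: true, has_filter: true, filter_hit: true })) // true
	// 'src/config.txt': the filter hits, the regex does not
	println(decide(Hits{ has_regex: true, has_filter: true, filter_hit: true })) // false
	// 'backup.bak' with filter '*.txt' and an exclude for '*.bak'
	println(decide(Hits{ has_filter: true, exclude_hit: true })) // false
}
```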
struct MatcherArgs #
struct MatcherArgs {
pub mut:
// Include if matches any regex pattern
regex []string
// Exclude if matches any regex pattern
regex_ignore []string
// Include if matches any wildcard pattern (* = any sequence)
filter []string
// Exclude if matches any wildcard pattern
filter_ignore []string
}
Arguments for creating a matcher
struct ReplaceArgs #
struct ReplaceArgs {
pub mut:
text string
dedent bool
}
struct ReplaceDirArgs #
struct ReplaceDirArgs {
pub mut:
path string
extensions []string
dryrun bool
}
struct ReplaceInstruction #
struct ReplaceInstruction {
pub:
regex_str string
find_str string
replace_with string
pub mut:
regex regex.RE
}
struct ReplaceInstructions #
struct ReplaceInstructions {
pub mut:
instructions []ReplaceInstruction
}
fn (ReplaceInstructions) add_item #
fn (mut self ReplaceInstructions) add_item(regex_find_str string, replace_with string) !
For the regex string syntax, see https://github.com/vlang/v/blob/master/vlib/regex/README.md. find_str is a normal (text) search. replace_with is the string that replaces the match.
fn (ReplaceInstructions) add #
fn (mut ri ReplaceInstructions) add(replacelist []string) !
Each element of the list can contain several search statements. A search statement can take 3 forms:
- regex: starts with ^R, see https://github.com/vlang/v/blob/master/vlib/regex/README.md
- case-insensitive string find: starts with ^S (converted to a regex internally)
- plain string: a literal, case-sensitive find
Input examples:
- ["^Rregex:replacewith",...]
- ["^Rregex:^Rregex2:replacewith"]
- ["findstr:findstr:replacewith"]
- ["findstr:^Rregex2:replacewith"]
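To make the instruction format concrete, the sketch below splits one list element on ':' and classifies each search statement by its prefix, taking the last segment as the replacement. It is a simplified illustration with hypothetical names, not the parser herolib uses; in particular it assumes that neither the search statements nor the replacement contain a ':' themselves.

```v
// Hypothetical types and parser for illustration; not the herolib source.
enum FindKind {
	literal
	case_insensitive
	regex
}

struct FindSketch {
	kind FindKind
	text string
}

// Split "find1:find2:replacewith" into its search statements and the
// replacement string (the last segment).
fn parse_sketch(item string) ([]FindSketch, string) {
	parts := item.split(':')
	replace_with := parts.last()
	mut finds := []FindSketch{}
	for p in parts[..parts.len - 1] {
		if p.starts_with('^R') {
			finds << FindSketch{
				kind: .regex
				text: p[2..]
			}
		} else if p.starts_with('^S') {
			finds << FindSketch{
				kind: .case_insensitive
				text: p[2..]
			}
		} else {
			finds << FindSketch{
				kind: .literal
				text: p
			}
		}
	}
	return finds, replace_with
}

fn main() {
	finds, replace_with := parse_sketch('TF:^Sthreefold0:ThreeFold1:ThreeFold')
	for f in finds {
		println('${f.kind}: ${f.text}')
	}
	println('replace with: ${replace_with}')
}
```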
fn (ReplaceInstructions) add_from_text #
fn (mut ri ReplaceInstructions) add_from_text(txt string) !
Takes a text input (e.g. the contents of a file) in which each line uses one of the following forms:
- regex: starts with ^R, see https://github.com/vlang/v/blob/master/vlib/regex/README.md
- case-insensitive string find: starts with ^S (converted to a regex internally)
- plain string: a literal, case-sensitive find
Example input:
'''
^Rregex:replacewith
^Rregex:^Rregex2:replacewith
^Sfindstr:replacewith
findstr:findstr:replacewith
findstr:^Rregex2:replacewith
^Sfindstr:^Sfindstr2::^Rregex2:replacewith
'''
fn (ReplaceInstructions) replace #
fn (mut self ReplaceInstructions) replace(args ReplaceArgs) !string
This is the function that performs the replacement: it takes text as input and returns the replaced result. Matching is done line by line. The dedent function is applied to the text (see the dedent field of ReplaceArgs).
fn (ReplaceInstructions) replace_in_dir #
fn (mut self ReplaceInstructions) replace_in_dir(args ReplaceDirArgs) !int
If dryrun is true, nothing is replaced; the changes are only shown.