core.texttools.regext #

escape_regex_chars

Escapes special regex metacharacters in a string to make it safe for use in regex patterns.

import incubaid.herolib.core.texttools.regext

escaped := regext.escape_regex_chars('file.txt')
// Result: "file\.txt"

// Use in regex patterns:
safe_search := regext.escape_regex_chars('[test]')
// Result: "\[test\]"

Special characters escaped: . ^ $ * + ? { } [ ] \ | ( )
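
As an illustration only, the escaping can be sketched in plain V as a single pass that prefixes each of the characters listed above with a backslash. `escape_sketch` is a hypothetical stand-in written for this example; the module's actual implementation may differ.

```v
import strings

// escape_sketch is a hypothetical stand-in for regext.escape_regex_chars;
// the real implementation may differ.
fn escape_sketch(s string) string {
	// Raw string, so `$` and `\` are taken literally.
	special := r'.^$*+?{}[]\|()'
	mut sb := strings.new_builder(s.len * 2)
	for c in s {
		if special.contains(c.ascii_str()) {
			sb.write_u8(`\\`) // prefix the metacharacter with a backslash
		}
		sb.write_u8(c)
	}
	return sb.str()
}

println(escape_sketch('file.txt')) // file\.txt
println(escape_sketch('[test]')) // \[test\]
```

Strings without metacharacters pass through unchanged, so the sketch is safe to apply to any literal before embedding it in a larger pattern.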

wildcard_to_regex

Converts simple wildcard patterns to regex patterns for flexible file matching.

Conversion rules:

  • * becomes .* (matches any sequence of characters)
  • ? becomes . (matches any single character)
  • Special regex characters are escaped (. + ( ) [ ] { } ^ $ \ |)
  • Patterns without wildcards return the literal pattern (no implicit .* wrapping)

Note: This function only converts wildcards to regex. It does NOT add implicit ^ and $ anchors or .* wrappers. The caller is responsible for determining how the resulting pattern should be matched (e.g., substring vs exact match). When used with the Matcher, patterns without wildcards are treated as exact matches.

import incubaid.herolib.core.texttools.regext

// Match files ending with .txt
pattern1 := regext.wildcard_to_regex('*.txt')
// Result: ".*\.txt"

// Match anything starting with test
pattern2 := regext.wildcard_to_regex('test*')
// Result: "test.*"

// Literal pattern (no wildcards) - returns as-is with escaped special chars
pattern3 := regext.wildcard_to_regex('config')
// Result: "config"

// Complex pattern with special chars
pattern4 := regext.wildcard_to_regex('src/*.v')
// Result: "src/.*\.v"

// Multiple wildcards
pattern5 := regext.wildcard_to_regex('*test*file*')
// Result: ".*test.*file.*"

// For substring matching, use explicit wildcards:
pattern6 := regext.wildcard_to_regex('*config*')
// Result: ".*config.*"
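
The conversion rules above can be sketched in plain V as a character-by-character translation. `wildcard_sketch` is a hypothetical name used only for this illustration; it follows the documented rules but is not the module's code.

```v
import strings

// wildcard_sketch is a hypothetical re-statement of the documented rules;
// it is not the implementation of regext.wildcard_to_regex.
fn wildcard_sketch(pattern string) string {
	// Characters the rules say are escaped (raw string: no interpolation).
	escaped := r'.+()[]{}^$\|'
	mut sb := strings.new_builder(pattern.len * 2)
	for c in pattern {
		match c {
			`*` {
				sb.write_string('.*') // any sequence of characters
			}
			`?` {
				sb.write_u8(`.`) // any single character
			}
			else {
				if escaped.contains(c.ascii_str()) {
					sb.write_u8(`\\`)
				}
				sb.write_u8(c)
			}
		}
	}
	return sb.str()
}

println(wildcard_sketch('*.txt')) // .*\.txt
println(wildcard_sketch('src/*.v')) // src/.*\.v
println(wildcard_sketch('config')) // config
```

As in the documented behaviour, the sketch adds no ^ or $ anchors; anchoring is left to the caller.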

Regex Group Finders

find_sid

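Extracts sid values from a given text. A sid is identified by the pattern sid:XXXXX, where the value consists of 3 to 5 lowercase alphanumeric characters (sid:abc up to sid:abcde). Longer or malformed values such as sid:rrrrrr or sid:s_d are skipped, and repeated matches are kept, as the result below shows.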

import incubaid.herolib.core.texttools.regext

text := '
!!action.something sid:aa733

sid:aa733

...sid:aa733 ss

...sid:rrrrrr ss
sid:997

   sid:s d
sid:s_d
'

r := regext.find_sid(text)
// Result: ['aa733', 'aa733', 'aa733', '997']
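
Because the returned list keeps repeated matches, a caller that wants each sid once has to deduplicate it. The following is a minimal sketch, assuming the result list shown above; `unique_sketch` is a hypothetical helper and not part of regext.

```v
// unique_sketch is a hypothetical helper (not part of regext) that keeps
// the first occurrence of each item and preserves the original order.
fn unique_sketch(items []string) []string {
	mut seen := map[string]bool{}
	mut out := []string{}
	for item in items {
		if item !in seen {
			seen[item] = true
			out << item
		}
	}
	return out
}

// The list below is the find_sid result from the example above.
found := ['aa733', 'aa733', 'aa733', '997']
println(unique_sketch(found)) // ['aa733', '997']
```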

find_simple_vars

Extracts simple variable names enclosed in curly braces, e.g., {var_name}, from a given text. Variable names can contain lowercase letters, digits, and underscores.

import incubaid.herolib.core.texttools.regext

text := '
!!action.something {sid}

sid:aa733

{a}

...sid:rrrrrr ss {a_sdsdsdsd_e__f_g}
sid:997

   sid:s d
sid:s_d
'

r := regext.find_simple_vars(text)
// Result: ['sid', 'a', 'a_sdsdsdsd_e__f_g']

regex replacer

Tool to flexibly replace elements in file(s) or text.

import incubaid.herolib.core.texttools.regext
text := '

this is test_1 SomeTest
this is Test 1 SomeTest

need to replace TF to ThreeFold
need to replace ThreeFold0 to ThreeFold
need to replace ThreeFold1 to ThreeFold

'

text_out := '

this is TTT SomeTest
this is TTT SomeTest

need to replace ThreeFold to ThreeFold
need to replace ThreeFold to ThreeFold
need to replace ThreeFold to ThreeFold

'

mut ri := regext.regex_instructions_new()
ri.add(['TF:ThreeFold0:ThreeFold1:ThreeFold']) or { panic(err) }
ri.add_item('test_1', 'TTT') or { panic(err) }
ri.add_item('^Stest 1', 'TTT') or { panic(err) } // the ^S prefix makes this a case-insensitive search

mut text_out2 := ri.replace(text: text, dedent: true) or { panic(err) }

//pub struct ReplaceDirArgs {
//pub mut:
// path       string
// extensions []string
// dryrun     bool
//}
// If dryrun is true, nothing is replaced; the changes are only shown.
ri.replace_in_dir(path: '/tmp/mypath', extensions: ['md'])!

fn escape_regex_chars #

fn escape_regex_chars(s string) string

escape_regex_chars escapes special regex metacharacters in a string. This makes a literal string safe to use in regex patterns. Examples: "file.txt" -> "file\.txt", "a[123]" -> "a\[123\]"

fn find_sid #

fn find_sid(txt string) []string

Finds the parts of the text of the form sid:abc up to sid:abcde (the value can contain a...z and 0...9). Returns the list of found elements. To make them all lowercase, for example, apply words = words.map(it.to_lower()) to the result afterwards.

fn find_simple_vars #

fn find_simple_vars(txt string) []string

Finds the parts of the text of the form {NAME}. NAME may contain lowercase letters (a-z), digits (0-9), and underscores (_). Returns the list of the NAMEs found.
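
The NAME rule can be restated as a small validity check. This is only a sketch of the documented character set; `is_simple_name_sketch` is a hypothetical helper, and the module itself finds names with a regex rather than with this function.

```v
// is_simple_name_sketch is a hypothetical helper that mirrors the documented
// NAME rule: lowercase a-z, digits 0-9 and underscore, at least one character.
fn is_simple_name_sketch(name string) bool {
	if name.len == 0 {
		return false
	}
	for c in name {
		is_lower := c >= `a` && c <= `z`
		is_digit := c >= `0` && c <= `9`
		if !(is_lower || is_digit || c == `_`) {
			return false
		}
	}
	return true
}

println(is_simple_name_sketch('a_sdsdsdsd_e__f_g')) // true
println(is_simple_name_sketch('Sid')) // false (uppercase letter)
```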

fn new #

fn new(args_ MatcherArgs) !Matcher

Create a new matcher from arguments

Parameters:

  • regex: Include if matches regex pattern (e.g., r'.*\.v$')
  • regex_ignore: Exclude if matches regex pattern
  • filter: Include if matches wildcard pattern (e.g., '*.txt', 'test*', 'config')
  • filter_ignore: Exclude if matches wildcard pattern

Logic:

  • If both regex and filter patterns are provided, BOTH must match (AND logic)
  • If only regex patterns are provided, any regex pattern can match (OR logic)
  • If only filter patterns are provided, any filter pattern can match (OR logic)
  • Exclude patterns take precedence over include patterns

Examples:

m := regext.new(regex: [r'.*\.v$'])!
m := regext.new(filter: ['*.txt'], filter_ignore: ['*.bak'])!
m := regext.new(regex: [r'.*test.*'], regex_ignore: [r'.*_test\.v$'])!

fn regex_instructions_new #

fn regex_instructions_new() ReplaceInstructions

fn regex_rewrite #

fn regex_rewrite(r string) !string

Rewrites a filter string to a regex. Each character is matched in both lower case and upper case. Only ASCII is considered. Any of the characters '_- ' is replaced so that it matches one or more spaces. The returned result is a regex string.

struct Matcher #

struct Matcher {
mut:
	regex_include  []regex.RE
	filter_include []regex.RE
	regex_exclude  []regex.RE
}

Matcher matches strings against include/exclude regex patterns

fn (Matcher) match #

fn (m Matcher) match(text string) bool

match checks if a string matches the include patterns and not the exclude patterns

Logic:

  • If both regex and filter patterns exist, string must match BOTH (AND logic)
  • If only regex patterns exist, string must match at least one (OR logic)
  • If only filter patterns exist, string must match at least one (OR logic)
  • Then check if string matches any exclude pattern; if yes, return false
  • Otherwise return true

Examples:

m := regext.new(regex: [r'.*\.v$'])!
println(m.match('file.v')) // true
println(m.match('file.txt')) // false

m2 := regext.new(filter: ['*.txt'], filter_ignore: ['*.bak'])!
println(m2.match('readme.txt')) // true
println(m2.match('backup.bak')) // false

m3 := regext.new(filter: ['src*'], regex: [r'.*\.v$'])!
println(m3.match('src/main.v')) // true (matches both)
println(m3.match('src/config.txt')) // false (doesn't match regex)
println(m3.match('main.v')) // false (doesn't match filter)
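
The include/exclude decision described above can be written down as a small boolean function. The sketch below is an illustration only: `decide_sketch` and its parameters are hypothetical, the pattern matching itself is reduced to precomputed booleans, and treating "no include patterns at all" as "include everything" is an assumption that the documentation does not state.

```v
// decide_sketch is a hypothetical restatement of the documented logic.
// *_hit means "at least one pattern of that kind matched the string".
fn decide_sketch(has_regex bool, regex_hit bool, has_filter bool, filter_hit bool, excluded bool) bool {
	// Assumption: with no include patterns, everything is included.
	mut included := true
	if has_regex && has_filter {
		included = regex_hit && filter_hit // AND logic
	} else if has_regex {
		included = regex_hit // OR over the regex patterns
	} else if has_filter {
		included = filter_hit // OR over the filter patterns
	}
	// Exclude patterns take precedence over include patterns.
	return included && !excluded
}

// Mirrors the documented example with both a filter and a regex, no excludes.
println(decide_sketch(true, true, true, true, false)) // true: both match
println(decide_sketch(true, false, true, true, false)) // false: regex misses
println(decide_sketch(true, true, true, false, false)) // false: filter misses
```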

struct MatcherArgs #

@[params]
struct MatcherArgs {
pub mut:
	// Include if matches any regex pattern
	regex []string
	// Exclude if matches any regex pattern
	regex_ignore []string
	// Include if matches any wildcard pattern (* = any sequence)
	filter []string
	// Exclude if matches any wildcard pattern
	filter_ignore []string
}

Arguments for creating a matcher

struct ReplaceArgs #

@[params]
struct ReplaceArgs {
pub mut:
	text   string
	dedent bool
}

struct ReplaceDirArgs #

@[params]
struct ReplaceDirArgs {
pub mut:
	path       string
	extensions []string
	dryrun     bool
}

struct ReplaceInstruction #

struct ReplaceInstruction {
pub:
	regex_str    string
	find_str     string
	replace_with string
pub mut:
	regex regex.RE
}

struct ReplaceInstructions #

struct ReplaceInstructions {
pub mut:
	instructions []ReplaceInstruction
}

fn (ReplaceInstructions) add_item #

fn (mut self ReplaceInstructions) add_item(regex_find_str string, replace_with string) !

For regex strings, see https://github.com/vlang/v/blob/master/vlib/regex/README.md. find_str is a normal (text) search. replace_with is the string that the match is replaced with.

fn (ReplaceInstructions) add #

fn (mut ri ReplaceInstructions) add(replacelist []string) !

Each element of the list can contain several search statements. A search statement can take one of three forms:

  • a regex, starting with ^R (see https://github.com/vlang/v/blob/master/vlib/regex/README.md)
  • a case-insensitive string find, starting with ^S (converted to a regex internally)
  • a plain string, which is a literal, case-sensitive find

Accepted inputs:

  • ["^Rregex:replacewith",...]
  • ["^Rregex:^Rregex2:replacewith"]
  • ["findstr:findstr:replacewith"]
  • ["findstr:^Rregex2:replacewith"]
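
One way to picture the colon-separated format is to split an element on ':' and treat the last part as the replacement and every earlier part as a search statement. The sketch below does exactly that under stated assumptions: `FindKind`, `FindSketch`, `ParsedSketch` and `parse_sketch` are hypothetical names, empty parts are skipped, and a regex that itself contains ':' is not handled. How the module really parses its input may differ.

```v
enum FindKind {
	regex       // statement started with ^R
	insensitive // statement started with ^S
	literal     // plain, case-sensitive string
}

struct FindSketch {
	kind FindKind
	term string
}

struct ParsedSketch {
	finds        []FindSketch
	replace_with string
}

// parse_sketch is a hypothetical parser for one 'find:find:replacewith' element.
fn parse_sketch(item string) ParsedSketch {
	parts := item.split(':')
	if parts.len < 2 {
		return ParsedSketch{} // nothing to replace with
	}
	mut finds := []FindSketch{}
	for p in parts[..parts.len - 1] {
		if p.len == 0 {
			continue // assumption: empty statements are ignored
		}
		if p.starts_with('^R') {
			finds << FindSketch{ kind: .regex, term: p[2..] }
		} else if p.starts_with('^S') {
			finds << FindSketch{ kind: .insensitive, term: p[2..] }
		} else {
			finds << FindSketch{ kind: .literal, term: p }
		}
	}
	return ParsedSketch{
		finds:        finds
		replace_with: parts.last()
	}
}

parsed := parse_sketch('TF:ThreeFold0:ThreeFold1:ThreeFold')
println(parsed.finds.len) // 3
println(parsed.replace_with) // ThreeFold
```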

fn (ReplaceInstructions) add_from_text #

fn (mut ri ReplaceInstructions) add_from_text(txt string) !

A text input in which each line holds search statements of the following forms:

  • a regex, starting with ^R (see https://github.com/vlang/v/blob/master/vlib/regex/README.md)
  • a case-insensitive string find, starting with ^S (converted to a regex internally)
  • a plain string, which is a literal, case-sensitive find

Example input:

^Rregex:replacewith
^Rregex:^Rregex2:replacewith
^Sfindstr:replacewith
findstr:findstr:replacewith
findstr:^Rregex2:replacewith
^Sfindstr:^Sfindstr2::^Rregex2:replacewith

fn (ReplaceInstructions) replace #

fn (mut self ReplaceInstructions) replace(args ReplaceArgs) !string

This is the function that takes text as input and returns the replaced result. Matching is done line by line. The dedent function is applied to the text (see dedent in ReplaceArgs).
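
To illustrate what line-by-line processing means, here is a minimal sketch for a single literal, case-sensitive instruction. `replace_lines_sketch` is a hypothetical helper; it ignores regex and ^S instructions as well as dedenting, all of which the real replace function handles.

```v
// replace_lines_sketch is a hypothetical, literal-only illustration of
// processing a text one line at a time.
fn replace_lines_sketch(text string, find string, replace_with string) string {
	mut out := []string{}
	for line in text.split_into_lines() {
		out << line.replace(find, replace_with)
	}
	return out.join('\n')
}

println(replace_lines_sketch('this is test_1 SomeTest', 'test_1', 'TTT')) // this is TTT SomeTest
```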

fn (ReplaceInstructions) replace_in_dir #

fn (mut self ReplaceInstructions) replace_in_dir(args ReplaceDirArgs) !int

If dryrun is true, nothing is replaced; the changes are only shown.