pub struct Config { /* private fields */ }
The configuration used for building a bounded backtracker.
A bounded backtracker configuration is a simple data object that is
typically used with Builder::configure.
Implementations
impl Config
pub fn prefilter(self, pre: Option<Prefilter>) -> Config
Set a prefilter to be used whenever a start state is entered.
A Prefilter in this context is meant to accelerate searches by
looking for literal prefixes that every match for the corresponding
pattern (or patterns) must start with. Once a prefilter produces a
candidate match, the underlying search routine continues from that
position and tries to confirm the match.
Be warned that setting a prefilter does not guarantee that the search will be faster. While it’s usually a good bet, if the prefilter produces a lot of false positive candidates (i.e., positions matched by the prefilter but not by the regex), then the overall result can be slower than if you had just executed the regex engine without any prefilters.
By default no prefilter is set.
Example
use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, Match, MatchKind,
};

let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "bar"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
assert_eq!(
    Some(Match::must(0, 5..11)),
    re.try_find(&mut cache, input)?,
);
Be warned though that an incorrect prefilter can lead to incorrect results!
use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, MatchKind,
};

// The prefilter looks for "car", but the pattern can only start with
// "foo" or "bar", so the prefilter does not agree with the regex.
let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "car"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
// No match reported even though there clearly is one!
assert_eq!(None, re.try_find(&mut cache, input)?);
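The candidate-then-confirm behavior described above can be illustrated with a small standalone model. This is only a sketch using the standard library, not how regex_automata is implemented: `prefilter_find`, `confirm`, and `find_with_prefilter` are hypothetical helpers, the confirmation step hard-codes the pattern `(foo|bar)[a-z]+`, and the haystack is assumed to be ASCII.

```rust
/// Toy stand-in for a prefilter (hypothetical helper): returns the start
/// of the leftmost occurrence of any needle at or after byte offset `at`.
fn prefilter_find(haystack: &str, needles: &[&str], at: usize) -> Option<usize> {
    needles
        .iter()
        .filter_map(|n| haystack[at..].find(*n).map(|i| at + i))
        .min()
}

/// Toy stand-in for the regex engine (hypothetical helper): tries to match
/// `(foo|bar)[a-z]+` starting exactly at `start` and returns the end offset.
fn confirm(haystack: &str, start: usize) -> Option<usize> {
    let rest = &haystack[start..];
    if !(rest.starts_with("foo") || rest.starts_with("bar")) {
        return None;
    }
    // `[a-z]+` needs at least one lowercase ASCII letter after the prefix.
    let tail = rest[3..]
        .bytes()
        .take_while(|b| b.is_ascii_lowercase())
        .count();
    if tail == 0 {
        None
    } else {
        Some(start + 3 + tail)
    }
}

/// Only positions reported by the prefilter are ever handed to `confirm`.
/// Assumes an ASCII haystack so that `start + 1` is a valid slice boundary.
fn find_with_prefilter(haystack: &str, needles: &[&str]) -> Option<(usize, usize)> {
    let mut at = 0;
    while at <= haystack.len() {
        // If the prefilter finds no further candidate, the search gives up.
        let start = prefilter_find(haystack, needles, at)?;
        if let Some(end) = confirm(haystack, start) {
            return Some((start, end));
        }
        // False positive: resume the scan just past the failed candidate.
        at = start + 1;
    }
    None
}
```

In this model, the needles `["foo", "bar"]` first yield a false positive at offset 0 (`foo1` fails confirmation) and then the real match at `5..11`. With `["foo", "car"]`, the position of `barfox` is never offered as a candidate, so no match is reported.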
pub fn visited_capacity(self, capacity: usize) -> Config
Set the visited capacity used to bound backtracking.
The visited capacity represents the amount of heap memory (in bytes) to
allocate toward tracking which parts of the backtracking search have
been done before. The heap memory needed for any particular search is
proportional to haystack.len() * nfa.states().len(), which can be
quite large. Therefore, the bounded backtracker is typically only able
to run on shorter haystacks.
For a given regex, increasing the visited capacity means that the
maximum haystack length that can be searched is increased. The
BoundedBacktracker::max_haystack_len method returns that maximum.
The default capacity is a reasonable but empirically chosen size.
Example
As with other regex engines, Unicode is what tends to make the bounded backtracker less useful by making the maximum haystack length quite small. If necessary, increasing the visited capacity using this routine will increase the maximum haystack length at the cost of using more memory.
Note though that the specific maximum values here are not an API guarantee. The default visited capacity is subject to change and not covered by semver.
use regex_automata::nfa::thompson::backtrack::BoundedBacktracker;

// Unicode inflates the size of the underlying NFA quite a bit, and
// thus means that the backtracker can only handle smaller haystacks,
// assuming that the visited capacity remains unchanged.
let re = BoundedBacktracker::new(r"\w+")?;
assert!(re.max_haystack_len() <= 7_000);
// But we can increase the visited capacity to handle bigger haystacks!
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().visited_capacity(1 << 20))
    .build(r"\w+")?;
assert!(re.max_haystack_len() >= 25_000);
assert!(re.max_haystack_len() <= 28_000);
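The trade-off between visited capacity and maximum haystack length can be approximated with a little arithmetic. The sketch below assumes one visited bit per (NFA state, haystack position) pair, which is consistent with the proportionality stated above but is not a documented guarantee. `approx_max_haystack_len` is a hypothetical helper, and the state count of 300 used in the note afterwards is an illustrative value, not the real size of the NFA for `\w+`.

```rust
/// Rough model (an assumption, not the library's documented formula):
/// one visited bit per (NFA state, haystack position) pair.
fn approx_max_haystack_len(visited_capacity_bytes: usize, nfa_states: usize) -> usize {
    // The capacity is configured in bytes; the model counts bits.
    let bits = visited_capacity_bytes.saturating_mul(8);
    // Number of haystack positions that fit; zero states yields 0 here
    // instead of dividing by zero.
    let positions = bits.checked_div(nfa_states).unwrap_or(0);
    // A haystack of length n has n + 1 positions (including the end).
    positions.saturating_sub(1)
}
```

Under this model, a hypothetical 300-state NFA with 256 KiB of capacity gives `approx_max_haystack_len(256 * 1024, 300) == 6989`, and quadrupling the capacity to `1 << 20` bytes gives `27961`. The estimate scales linearly with the capacity and inversely with the number of NFA states, which is why Unicode-heavy patterns shrink the limit so quickly.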
pub fn get_prefilter(&self) -> Option<&Prefilter>
Returns the prefilter set in this configuration, if one is set.
pub fn get_visited_capacity(&self) -> usize
Returns the configured visited capacity.
Note that the actual capacity used may be slightly bigger than the configured capacity.
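One plausible reason for the actual capacity being slightly bigger is rounding: if the visited set is stored in fixed-size blocks, the configured capacity has to be rounded up to a whole number of blocks. The sketch below assumes a block size of 64 bits. Both `BLOCK_BITS` and `rounded_capacity_bits` are hypothetical names for illustration and are not part of the regex_automata API.

```rust
/// Hypothetical block size, in bits, of the visited set's storage.
const BLOCK_BITS: usize = 64;

/// Rounds a capacity configured in bytes up to a whole number of blocks
/// and returns the resulting capacity in bits.
fn rounded_capacity_bits(configured_bytes: usize) -> usize {
    let bits = configured_bytes.saturating_mul(8);
    // Ceiling division without risking overflow in `bits + BLOCK_BITS - 1`.
    let blocks = bits / BLOCK_BITS + usize::from(bits % BLOCK_BITS != 0);
    blocks.saturating_mul(BLOCK_BITS)
}
```

With these assumptions, a configured capacity of 10 bytes (80 bits) occupies two blocks, so the usable capacity is 128 bits, while a capacity that is already a multiple of the block size, such as 16 bytes, is unchanged.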