Boost is a free source code library for C++. After downloading and unzipping, you need to run the bootstrap batch file or script and then run b2 --with-regex to compile Boost’s regex library. Then add the folder into which you unzipped Boost to the include path of your C++ compiler. Add the stage\lib subfolder of that folder to your linker’s library path. Then you can add #include <boost/regex.hpp> to your C++ code to make use of Boost regular expressions.
If you use C++Builder, you should download the Boost libraries for your specific version of C++Builder from Embarcadero. The version of Boost you get depends on your version of C++Builder and whether you’re targeting Win32 or Win64. The Win32 compiler in XE3 through XE8, and the classic Win32 compiler in C++Builder 10 Seattle through 10.1 Berlin are all stuck on Boost 1.39. The Win64 compiler in XE3 through XE6 uses Boost 1.50. The Win64 compiler in XE7 through 10.1 Berlin uses Boost 1.55. The new C++11 Win32 compiler in C++Builder 10 and later uses the same version of Boost as the Win64 compiler.
This website covers Boost 1.38, 1.39, and 1.42 through the latest 1.73. Boost 1.40 introduced many new regex features borrowed from Perl 5.10. But it also introduced some serious bugs that weren’t fixed until Boost 1.42. So we completely ignore Boost 1.40 and 1.41. We still cover Boost 1.38 and 1.39 (which have identical regex features) because the classic Win32 C++Builder compiler is stuck on this version. If you’re using another compiler, you should definitely use Boost 1.42 or later to avoid what are now old bugs. You should preferably use Boost 1.47 or later as this version changes certain behaviors involving backreferences that may change how some of your regexes behave if you later upgrade from pre-1.47 to post-1.47.
In practice, you’ll mostly use Boost’s ECMAScript grammar. It’s the default grammar and offers far more features than the other grammars. Whenever the tutorial on this website mentions Boost without mentioning any grammars, what is written applies to the ECMAScript grammar and may or may not apply to any of the other grammars. You’ll really only use the other grammars if you want to reuse existing regular expressions from old POSIX code or UNIX scripts.
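As a rough illustration of how the grammar choice changes the meaning of a pattern, the sketch below compiles the same pattern string once with the default grammar and once with the POSIX basic grammar. The `USE_BOOST_REGEX` macro and the function names are hypothetical, introduced only for this example. Without the macro the code builds against std::regex, which shares these class and flag names.

```cpp
#include <string>

// Hypothetical switch (not part of Boost): define USE_BOOST_REGEX to build
// against Boost, assuming Boost is already on your include and library paths.
#ifdef USE_BOOST_REGEX
#include <boost/regex.hpp>
namespace rx = boost;
#else
#include <regex>
namespace rx = std;
#endif

// Default (ECMAScript) grammar: \( and \) are escaped, literal parentheses.
inline bool matches_default(const std::string& subject) {
    static const rx::regex re("\\(ab\\)c");
    return rx::regex_match(subject, re);
}

// POSIX basic grammar: \( and \) delimit a capturing group instead.
inline bool matches_basic(const std::string& subject) {
    static const rx::regex re("\\(ab\\)c", rx::regex::basic);
    return rx::regex_match(subject, re);
}
```

With the default grammar the pattern matches the literal text `(ab)c`. With the basic grammar it matches `abc`. That is why old POSIX patterns generally need the matching grammar flag rather than the default.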
The Boost documentation likes to talk about being compatible with Perl and JavaScript and how boost::regex was standardized as std::regex in C++11. When we compare the Dinkumware implementation of std::regex (included with Visual Studio and C++Builder) with boost::regex, we find that the class and function templates are almost the same. Your C++ compiler will just as happily compile code using boost::regex as it does compiling the same code using std::regex. So all the code examples given in the std::regex topic on this website work just fine with Boost if you replace std with boost.
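A minimal sketch of that interchangeability, assuming a hypothetical `USE_BOOST_REGEX` macro that you define yourself when building against Boost: a namespace alias selects the library, and the rest of the code is written once. The helper function names are made up for this example.

```cpp
#include <string>

// Hypothetical switch (not part of Boost): define USE_BOOST_REGEX to build
// against Boost, assuming Boost is already on your include and library paths.
#ifdef USE_BOOST_REGEX
#include <boost/regex.hpp>
namespace rx = boost;
#else
#include <regex>
namespace rx = std;
#endif

// Returns the first run of ASCII digits in `text`, or an empty string if none.
inline std::string first_number(const std::string& text) {
    static const rx::regex digits("[0-9]+");
    rx::smatch m;
    if (rx::regex_search(text, m, digits)) {
        return m.str(0);  // group 0 is the overall match
    }
    return std::string();
}

// Replaces every run of whitespace with a single space.
inline std::string squeeze_spaces(const std::string& text) {
    static const rx::regex ws("\\s+");
    return rx::regex_replace(text, ws, std::string(" "));
}
```

For example, `first_number("order 42 of 7")` yields `"42"`. These patterns stick to syntax that both libraries are expected to handle the same way. Patterns that touch the differences listed in the table further down may not behave identically after the swap.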
But when you run your C++ application, it can make a big difference whether it is Dinkumware or Boost that is interpreting your regular expressions. Though both offer the same six grammars, their syntax and behavior are not the same between the two libraries. Boost defines regex_constants::perl which is not part of the C++11 standard. This is not actually an additional grammar but simply a synonym for ECMAScript and JavaScript. There are major differences in the regex flavors used by actual JavaScript and actual Perl. So it’s obvious that a library treating these as one flavor or grammar can’t be compatible with either. Boost’s ECMAScript grammar is a cross between the actual JavaScript and Perl flavors, with a bunch of Boost-specific features and peculiarities thrown in. Dinkumware’s ECMAScript grammar is closer to actual JavaScript, but still has significant behavioral differences. Dinkumware didn’t borrow any features from Perl that JavaScript doesn’t have.
The table below highlights the most important differences between the ECMAScript grammars in std::regex and Boost and actual JavaScript and Perl. Some are obvious differences in feature sets. But others are subtle differences in behavior that may bite you unexpectedly.
Feature | std::regex | Boost | JavaScript | Perl |
---|---|---|---|---|
Dot matches line breaks | never | default | never | option |
Anchors match at line breaks | always | default | option | option |
Line break characters | CR, LF | CR, LF, FF, NEL, LS, PS | CR, LF, LS, PS | LF |
Backreferences to non-participating groups | match empty string | fail since 1.47 | match empty string | fail |
Empty character class | fails to match | not possible | fails to match | not possible |
Free-spacing mode | no | YES | no | YES |
Mode modifiers | no | YES | no | YES |
Possessive quantifiers | no | YES | no | YES |
Named capture | no | .NET syntax | no | .NET & Python syntax |
Recursion | no | atomic | no | backtracking |
Subroutines | no | backtracking | no | backtracking |
Conditionals | no | YES | no | YES |
Atomic groups | no | YES | no | YES |
Atomic groups backtrack capturing groups | n/a | no | n/a | YES |
Start and end of word boundaries | no | YES | no | no |
Standard POSIX classes | YES | YES | no | YES |
Single letter POSIX classes | no | YES | no | no |
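Because some of these differences only show up at run time, it can help to probe the library you actually link against. The sketch below checks two rows of the table: whether the dot matches a line feed, and whether a standard POSIX class is accepted. The `USE_BOOST_REGEX` macro and the function names are hypothetical. Going by the table, the first probe should return false with std::regex and true with Boost’s default settings, while the second should return true with both.

```cpp
#include <string>

// Hypothetical switch (not part of Boost): define USE_BOOST_REGEX to build
// against Boost, assuming Boost is already on your include and library paths.
#ifdef USE_BOOST_REGEX
#include <boost/regex.hpp>
namespace rx = boost;
#else
#include <regex>
namespace rx = std;
#endif

// True if the dot in "a.b" is allowed to match the line feed in "a\nb".
inline bool dot_matches_line_feed() {
    static const rx::regex re("a.b");
    const std::string subject = "a\nb";
    return rx::regex_match(subject, re);
}

// True if the POSIX class [[:digit:]] is accepted and matches ASCII digits.
inline bool posix_digit_class_works() {
    static const rx::regex re("[[:digit:]]+");
    const std::string subject = "2021";
    return rx::regex_match(subject, re);
}
```

Calling probes like these from a unit test is one way to catch a behavior change if you later switch libraries or upgrade Boost.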
Page URL: https://www.regular-expressions.info/boost.html
Page last updated: 24 August 2021
Site last updated: 06 November 2024
Copyright © 2003-2024 Jan Goyvaerts. All rights reserved.