I wrote a semantic action for my Boost Spirit lexer that converts escape sequences in strings to the characters they stand for. It works perfectly, and I want to convert it to a Boost Phoenix expression, but I can't get that version to compile.
Here is what works:
// the semantic action
struct ConvertEscapes
{
    template <typename ItT, typename IdT, typename CtxT>
    void operator () (ItT& start, ItT& end, lex::pass_flags& matched, IdT& id, CtxT& ctx)
    {
        static boost::wregex escapeRgx(L"(\\\\r)|(\\\\n)|(\\\\t)|(\\\\\\\\)|(\\\\\")");
        static std::wstring escapeRepl = L"(?1\r)(?2\n)(?3\t)(?4\\\\)(?5\")";
        static std::wstring wval; // static b/c set_value doesn't seem to copy
        auto const& val = ctx.get_value();
        wval.assign(val.begin(), val.end());
        wval = boost::regex_replace(wval,
                                    escapeRgx,
                                    escapeRepl,
                                    boost::match_default | boost::format_all);
        ctx.set_value(wval);
    }
};
// the token declaration
lex::token_def<std::wstring, wchar_t> literal_str;
// the token definition
literal_str = L"\\\"([^\\\\\"]|(\\\\.))*\\\""; // string with escapes
// adding it to the lexer
this->self += literal_str [ ConvertEscapes() ];
This is what I tried to convert it:
this->self += literal_str
[
    lex::_val = boost::regex_replace(lex::_val /* this is the place I can't figure out */,
                                     boost::wregex(L"(\\\\r)|(\\\\n)|(\\\\t)|(\\\\\\\\)|(\\\\\")"),
                                     L"(?1\r)(?2\n)(?3\t)(?4\\\\)(?5\")",
                                     boost::match_default | boost::format_all)
];
A wstring can't be constructed from _val. _val also doesn't have begin() or end(), so how is it supposed to be used anyway?
std::wstring(lex::_start, lex::_end) fails too, because those arguments aren't recognized as iterators.
In this question, I found phoenix::construct<std::wstring>(lex::_start, lex::_end), but this also doesn't really result in a wstring.
How do I get either a string or a pair of wchar_t iterators for the current token?
I'm going to chant the oft-heard "Why"?
This time, for good reason.
In general, avoid semantic actions: Boost Spirit: "Semantic actions are evil"?.
Phoenix actors are needlessly more complex than the dedicated functor. They have a sweet spot (mainly simple assignment or built-in operations). But if the actor is non-trivial in any way, you'll see the complexity ramp up quickly, not just for the human but also for the compiler. This leads to
This specific case
Can't work. At all.
The problem is that you're mixing lazy/deferred actors with direct invocations. That can never work. The type of
phoenix::construct<std::wstring>(lex::_start, lex::_end) isn't supposed to be std::wstring. Of course. It is supposed to be a lazy actor¹ that can be used at some later time to create a std::wstring.
Now that we know that (and why) phoenix::construct<std::wstring>(lex::_start, lex::_end) is an actor type, it should become clear why it is completely bogus to call boost::regex_replace on it. You might as well pass any unrelated function object to boost::regex_replace and wonder why it does not compile.
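The difference between a deferred actor and a direct call can be illustrated without Phoenix at all. The following is only an analogy in plain C++14, not Phoenix code: lazy_construct, shout and lazy_shout are hypothetical names made up for this sketch, with shout standing in for an eager function such as boost::regex_replace.

```cpp
#include <string>

// Hypothetical stand-in for an eager function such as boost::regex_replace:
// it needs a real std::wstring at the moment it is called.
inline std::wstring shout(std::wstring const& s) { return s + L"!"; }

// Hypothetical stand-in for a lazy actor: no string is built here. The result
// is a callable that will construct the std::wstring when invoked later.
inline auto lazy_construct(wchar_t const* first, wchar_t const* last) {
    return [first, last] { return std::wstring(first, last); };
}

// To combine the two, the eager function has to be wrapped so that it becomes
// lazy as well. Calling shout(actor) directly would not compile, because the
// actor is a function object and not a std::wstring.
template <typename Actor>
auto lazy_shout(Actor actor) {
    return [actor] { return shout(actor()); };
}
```

Here lazy_shout(lazy_construct(first, last)) only describes the work. The string is built and passed to shout when the resulting object is finally invoked, which is the role the lexer plays for a Phoenix actor attached to a token.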
Summary:
You should probably just have the dedicated functor. You can of course Phoenix-adapt the regex functions you require, but all it does is shift the complexity tax for some syntactic sugar.
I'd always opt for the more naive approach, which is going to be more understandable to a seasoned C++ programmer and avoids the pitfalls that come with high-wire acts².
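To show how naive the functor approach can be kept, here is a minimal sketch that moves the escape handling into a plain function using only the standard library. The name unescape is hypothetical and not part of Spirit or Phoenix. It handles the same five escapes as the regex in the question (\r, \n, \t, \\ and \") and leaves anything else untouched.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the two-character escapes \r, \n, \t, \\ and \"
// with the single characters they denote. Unknown escapes and a trailing
// backslash are copied through unchanged.
inline std::wstring unescape(std::wstring const& in) {
    std::wstring out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == L'\\' && i + 1 < in.size()) {
            wchar_t replacement = 0;
            switch (in[i + 1]) {
                case L'r':  replacement = L'\r'; break;
                case L'n':  replacement = L'\n'; break;
                case L't':  replacement = L'\t'; break;
                case L'\\': replacement = L'\\'; break;
                case L'"':  replacement = L'"';  break;
                default:    break; // not one of the five known escapes
            }
            if (replacement != 0) {
                out += replacement;
                ++i; // skip the character that followed the backslash
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

With something like this, the body of ConvertEscapes::operator() would presumably only need to build a std::wstring from the token value, call unescape on it, and hand the result to ctx.set_value as in the question. No Phoenix is involved.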
Nevertheless, here's a pointer should you be curious:
http://www.boost.org/doc/libs/1_63_0/libs/phoenix/doc/html/phoenix/modules/function.html
¹ think composed function object that can be invoked at a later time
² the balance might tip if you were designing this as an EDSL for further configuration by non-experts, but then you will have the added responsibility of documenting your EDSL and the constraints in which it can be used
³ should we say, spirit-child of a brain?