Bison Stack semantic value - debugging

This is the lexical analyzer, written with Flex:
%{
#include <iostream>
#include <cstdio>
#define YY_DECL extern "C" int yylex()
#include "conv.tab.h"
using namespace std;
%}
eq [ \t]*=
%%
[ \t] ;
(?:POINT|LINE) { yylval.ename = strdup(yytext); return ENAME; }
x{eq} { yylval.xval = atof(yytext);
return XVAL; }
y{eq} { yylval.yval = atof(yytext);
return YVAL; }
. ;
%%
The other file is the Bison grammar file:
%{
#include <iostream>
#include <cstdio>
#include <stdio.h>
using namespace std;
extern "C" int yylex ();
extern "C" int yyparse (void);
extern "C" FILE *yyin;
extern int line_no;
void yyerror(const char *s);
%}
%union{
float xval;
float yval;
char *ename;
}
%token <ename> ENAME
%token XVAL
%token YVAL
%%
converter:
converter ENAME { cout << "entity = " << $2 << endl; }
| converter XVAL {// x -> xval = $2;
cout << "x value = " << endl; }
| converter YVAL {// y -> yval = $2;
cout << "y value = " << endl; }
| ENAME { cout << "entity = " << $1 << endl; }
| XVAL { cout << "xvalue " << endl; }
| YVAL { cout << "yvalue " << endl; }
%%
main() {
FILE *myfile = fopen("conv.aj", "r");
if (!myfile) {
cout << "I can't open file" << endl;
return -1;
}
yyin = myfile;
do{
yydebug = 1;
yyparse();
} while (!feof(yyin));
yydebug = 2;
}
void yyerror(const char *s) {
cout << "Parser error! Message: " << s << endl;
exit(-1);
}
Actually, I want to retrieve values from a file. I used the Bison debugger and found that those values are not being pushed onto the Bison stack. So basically I want to push those values onto the stack. My file looks like this:
POINT
x=38
y=47

Nothing in your lexical analyzer matches a number, so the 38 and 47 from the input will both be handled by your default rule (. ;), which causes them to be ignored. In your rules for XVAL and YVAL, you call atof on yytext, which will be x= (or y=); that is clearly not a number, so atof will return 0.
It's not clear to me what you mean by "those values are not able to push onto Bison Stack", but I think this problem has nothing to do with bison or its stack.
By the way:
There is no need to have two different members in your semantic type for xval and yval. The type is a union, not a struct, so having two members of the same type (float) is redundant.
flex doesn't do regex captures. So there is really no point avoiding a capture with (?:...); it just obscures your grammar. You might as well use:
POINT|LINE { yylval.ename = strdup(yytext); return ENAME; }
On the other hand, you might be better off defining two different token types, which would avoid the need for the strdup. (You don't seem to be freeing the duplicated string, so the strdup is also a memory leak.) Alternatively, you could use an enumerated value in your semantic type:
POINT { yylval.ename_enum=POINT; return ENAME; }
LINE { yylval.ename_enum=LINE; return ENAME; }
. ; is not really a good idea, especially during development, because it hides errors (such as the one you have). You can use %option nodefault to avoid flex's default rule, and then flex will present an error when an illegal character is detected.
Unless you're using really old versions of bison and flex, you can just compile the generated code as C++. There should be no need to use extern "C".

Related

Selecting which overload is used in C++11

In the following code, none of the arguments is const, so I can't understand why the second overload is called in all three of the following cases.
#include <iostream>
#include <algorithm>
using namespace std;
void ToLower( std::string& ioValue )
{
std::transform( ioValue.begin(), ioValue.end(), ioValue.begin(), ::tolower );
}
std::string ToLower( const std::string& ioValue )
{
std::string aValue = ioValue;
ToLower(aValue);
return aValue;
}
int main()
{
string test = "test";
cout<<"Hello World" << endl;
// case 1
cout << ToLower("test") << endl;
// case 2
cout << ToLower(static_cast<string>(test)) << endl;
// case 3
cout << ToLower(string(test)) << endl;
}
In all 3 cases you are creating a temporary std::string. This is an unnamed object, an rvalue. Rvalues aren't allowed to bind to non-const lvalue references (T&), so only the overload taking const std::string& ioValue is viable.
Note that the return type plays no part in this choice: overload resolution looks only at the arguments. It is true that std::cout << (void) << std::endl is not a valid set of operations while std::cout << (std::string) << std::endl is, but even if the first function returned a std::string&, a conforming compiler would still pick the second overload in all three cases, because a temporary cannot bind to std::string&.

How to correctly transfer the ownership of a shared_ptr?

I have the following code snippet:
// code snippet one:
#include <memory>
#include <iostream>
#include <queue>
struct A {
uint32_t val0 = 0xff;
~A() {
std::cout << "item gets freed" << std::endl;
}
};
typedef std::shared_ptr<A> A_PTR;
int main()
{
std::queue<A_PTR> Q;
Q.push(std::make_shared<A>());
auto && temp_PTR = Q.front();
std::cout << "first use count = " << temp_PTR.use_count() << std::endl;
Q.pop();
std::cout << "second use count = " << temp_PTR.use_count() <<std::endl;
return 0;
}
After running it, I got the following result:
first use count = 1
item gets freed
second use count = 0
Q1: Can anybody explain what the type of temp_PTR is after the third line of main is executed?
If I change that line to
A_PTR && temp_PTR = Q.front();
the compiler complains:
main.cpp: In function 'int main()':
main.cpp:26:32: error: cannot bind '__gnu_cxx::__alloc_traits<std::allocator<std::shared_ptr<A> > >::value_type {aka std::shared_ptr<A>}' lvalue to 'A_PTR&& {aka std::shared_ptr<A>&&}'
A_PTR && temp_PTR = Q.front();
Q2: I remember that the return value of a function should be an rvalue, but here the compiler seems to tell me: "hey, the return value of Queue.front() is an lvalue". Why is that?
For Q2, I just checked the C++ documentation: the return type of queue::front() is a reference, which means the call expression is an lvalue.
reference front();
const_reference front() const;
For Q3, A_PTR temp_PTR = std::move(Q.front()); works, and it is what I want.

ASIO handler arguments and boost::bind, compile time error

I am struggling with compile-time errors and, try as I might, I don't see what I am doing wrong or how my handler differs from the handler function signature set out in the documentation and examples. (I am using Boost 1.41 on Linux.)
Please help me understand the error! (included below as a snippet)
My application has objects whose methods are handlers for async_* functions. Below is the code snippet. The error is reported on the line labelled "line 58", where I use boost::bind.
class RPC {
public:
char recv_buffer[56];
void data_recv (void) {
socket.async_read_some (
boost::asio::buffer(recv_buffer),
boost::bind ( &RPC::on_data_recv, this, _1, _2 )
); // **<<==== this is line 58, that shows up in error listing**
global_stream_lock.lock();
std::cout << "[" << boost::this_thread::get_id()
<< "] data recvd" << std::endl;
global_stream_lock.unlock();
} // RPC::data_recv
void on_data_recv (boost::system::error_code& ec, std::size_t bytesRx) {
global_stream_lock.lock();
std::cout << "[" << boost::this_thread::get_id()
<< "] bytes rcvd: " << std::endl;
global_stream_lock.unlock();
data_recv(); // call function that waits for more data
} // RPC::on_data_recv
}; // RPC class def
There is a huge error output, but the relevant lines seem to be:
../src/besw.cpp:58: instantiated from here
/usr/include/boost/bind/bind.hpp:385: error: no match for call to ‘(boost::_mfi::mf2<void, RPC, boost::system::error_code&, long unsigned int>) (RPC*&, boost::asio::error::basic_errors&, int&)’
/usr/include/boost/bind/mem_fn_template.hpp:272: note: candidates are: R boost::_mfi::mf2<R, T, A1, A2>::operator()(T*, A1, A2) const [with R = void, T = RPC, A1 = boost::system::error_code&, A2 = long unsigned int]
/usr/include/boost/bind/mem_fn_template.hpp:291: note: R boost::_mfi::mf2<R, T, A1, A2>::operator()(T&, A1, A2) const [with R = void, T = RPC, A1 = boost::system::error_code&, A2 = long unsigned int]
make: *** [src/besw.o] Error 1
When I remove the placeholders (_1 and _2) and use a handler without arguments, it compiles and executes without errors. Here's the modified code snippet.
void data_recv (void) {
socket.async_read_some (
boost::asio::buffer(recv_buffer),
boost::bind ( &RPC::on_data_recv, this )
);
global_stream_lock.lock();
std::cout << "[" << boost::this_thread::get_id()
<< "] data recvd" << std::endl;
global_stream_lock.unlock();
} // RPC::data_recv
void on_data_recv (void) {
...
}
The error code cannot be taken by non-const reference. Take it by value or by const&:
void on_data_recv(boost::system::error_code/* ec */, size_t /*bytes_transferred*/) {
Also, consider using the Asio specific placeholders:
socket.async_read_some(boost::asio::buffer(recv_buffer),
boost::bind(&RPC::on_data_recv, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
Also use proper lock guards. We're in C++! It's easy to make things exception-safe, so why not?
Live On Coliru
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <iostream>
#include <boost/thread.hpp>
static boost::mutex global_stream_lock;
class RPC {
char recv_buffer[56];
public:
void data_recv() {
socket.async_read_some(boost::asio::buffer(recv_buffer),
boost::bind(&RPC::on_data_recv, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
boost::lock_guard<boost::mutex> lk(global_stream_lock); // released automatically at the end of the function
std::cout << "[" << boost::this_thread::get_id() << "] data recvd" << std::endl;
}
void on_data_recv(boost::system::error_code/* ec */, size_t /*bytes_transferred*/) {
{
boost::lock_guard<boost::mutex> lk(global_stream_lock);
std::cout << "[" << boost::this_thread::get_id() << "] bytes rcvd: " << std::endl;
}
data_recv(); // call function that waits for more data
}
boost::asio::io_service service;
boost::asio::ip::tcp::socket socket{service};
}; // RPC class def
int main() {}

Use of unicode predefined character classes in Boost Spirit

I am trying to use the letter character class from Unicode, i.e. \p{L}, with Boost Spirit, but I have had no luck so far. Below is an example where I am trying to use (on line 30) the \p{L} character class. When I replace line 30 with line 29 it works, but that is not the intended use, as I need any letter from Unicode in my example.
My use case is for UTF-8 only. At the end of the day, what I am trying to do here is subtract a Unicode range from all Unicode letters when using the Boost Spirit lexer.
PS
Of course, my example is trimmed down and may not make a lot of sense as a use case but I hope you get the idea.
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <iostream>
#include <fstream>
#include <chrono>
#include <vector>
using namespace boost;
using namespace boost::spirit;
using namespace std;
using namespace std::chrono;
std::vector<pair<string, string> > getTokenMacros() {
std::vector<pair<string, string> > tokenDefinitionsVector;
tokenDefinitionsVector.emplace_back("JAPANESE_HIRAGANA", "[\u3041-\u3096]");
tokenDefinitionsVector.emplace_back("JAPANESE_HIRAGANA1",
"[\u3099-\u309E]");
tokenDefinitionsVector.emplace_back("ASIAN_NWS", "{JAPANESE_HIRAGANA}|"
"{JAPANESE_HIRAGANA1}");
tokenDefinitionsVector.emplace_back("ASIAN_NWS_WORD", "{ASIAN_NWS}*");
//tokenDefinitionsVector.emplace_back("NON_ASIAN_LETTER", "[A-Za-z0-9]");
tokenDefinitionsVector.emplace_back("NON_ASIAN_LETTER", "[\\p{L}-[{ASIAN_NWS}]]");
tokenDefinitionsVector.emplace_back("WORD", "{NON_ASIAN_LETTER}+");
tokenDefinitionsVector.emplace_back("ANY", ".");
return tokenDefinitionsVector;
}
;
struct distance_func {
template<typename Iterator1, typename Iterator2>
struct result: boost::iterator_difference<Iterator1> {
};
template<typename Iterator1, typename Iterator2>
typename result<Iterator1, Iterator2>::type operator()(Iterator1& begin,
Iterator2& end) const {
return distance(begin, end);
}
};
boost::phoenix::function<distance_func> const distance_fctor = distance_func();
template<typename Lexer>
struct word_count_tokens: lex::lexer<Lexer> {
word_count_tokens() :
asianNwsWord("{ASIAN_NWS_WORD}", lex::min_token_id + 110), word(
"{WORD}", lex::min_token_id + 170), any("{ANY}",
lex::min_token_id + 3000) {
using lex::_start;
using lex::_end;
using boost::phoenix::ref;
std::vector<pair<string, string> > tokenMacros(getTokenMacros());
for (auto start = tokenMacros.begin(), end = tokenMacros.end();
start != end; start++) {
this->self.add_pattern(start->first, start->second);
}
this->self = asianNwsWord | word | any;
}
lex::token_def<> asianNwsWord, word, any;
};
int main(int argc, char* argv[]) {
typedef lex::lexertl::token<string::iterator> token_type;
typedef lex::lexertl::actor_lexer<token_type> lexer_type;
word_count_tokens<lexer_type> word_count_lexer;
// read the file into memory
ifstream sampleFile("/home/dan/Documents/wikiSample.txt");
string str = "abc efg ぁあ";
string::iterator first = str.begin();
string::iterator last = str.end();
lexer_type::iterator_type iter = word_count_lexer.begin(first, last);
lexer_type::iterator_type end = word_count_lexer.end();
typedef boost::iterator_range<string::iterator> iterator_range;
vector<iterator_range> parsed_tokens;
while (iter != end && token_is_valid(*iter)) {
cout << (iter->id() - lex::min_token_id) << " " << iter->value()
<< endl;
const iterator_range range = get<iterator_range>(iter->value());
parsed_tokens.push_back(range);
++iter;
}
if (iter != end) {
string rest(first, last);
cout << endl << "!!!!!!!!!" << endl << "Lexical analysis failed\n"
<< "stopped at: \"" << rest << "\"" << endl;
cout << "#" << (int) rest.at(0) << "#" << endl;
}
return 0;
}

How would I implement a forth-style reverse-polish notation parser in boost spirit?

I'm trying to implement a parser for an old forth-based grammar where most of the functions take the form of: "num" "num" "command" where command is a string of some kind.
For example:
0 1 HSFF
41 SENSOR ON
1 12.0 BH 4 LNON
As you can see, the grammar is [mostly] reverse polish notation, with some string of arguments preceding the command. The grammar is pseudo white-space dependent, in that:
0 1 HSFF 41 SENSOR ON
Is as valid as:
0 1 HSFF
41 SENSOR ON
(In other words '\n' is treated just as a space)
Extra whitespace is also skipped, so:
0 1 HSFF 41 SENSOR ON
Is 2 valid commands with a lot of unnecessary whitespace.
All of this seemed simple enough, so I started chugging away at implementing the grammar. Of course, things are never as simple as they seem, and I found that my parser fails on the very first character (in this case an int). So, boiling things down, I tried implementing a single rule:
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
qi::rule<Iterator> Cmd_TARGETSENSPAIRCMD =
qi::int_ >> (lit("TARGET") | lit("SENSOR") | lit("PAIR") )
>> (lit("ON") | lit("OFF") | lit("ERASE") );
std::string in("0 TARGET ERASE\n");
Iterator = in.begin();
bool success = qi::parse(in.begin(), in.end(), Cmd_TARGETSENSPAIRCMD, ascii::space);
This code block always returns false, indicating that parsing has failed.
As you can see, the rule is that an int must be followed by two literals, in this case indicating whether the command is for a target, sensor, or pair, identified by the int, to be turned on, off, or erased.
If I look at the iterator to see where the parsing has stopped, it shows that it has failed immediately on the int. So I changed the rule to simply be +qi::int_, which succeeds in parsing the int, but fails on the literals. Shortening the rule to simply qi::int_ >> lit("TARGET") also fails.
I think the problem may be in the whitespace skipper I'm using, but I have been unable to determine what I'm doing wrong.
Is there a way to tell spirit that all tokens are separated by whitespace, with the exception of quoted strings (which turn into labels in my grammar)?
I have phantasized a little for you.
The first step I usually take is to come up with an AST model:
namespace Ast
{
enum Command { NO_CMD, TARGET, SENSOR, PAIR };
enum Modifier { NO_MODIFIER, ON, OFF, ERASE };
struct ModifiedCommand
{
Command cmd = NO_CMD;
Modifier mod = NO_MODIFIER;
};
struct OtherCommand
{
std::string token;
OtherCommand(std::string token = "") : token(std::move(token))
{ }
};
typedef boost::variant<int, double> Operand;
typedef boost::variant<Operand, ModifiedCommand, OtherCommand> RpnMachineInstruction;
typedef std::vector<RpnMachineInstruction> RpnMachineProgram;
}
As you can see, I intend to distinguish integers and doubles for operand values, and I treat any "other" commands (like "HSFF") that weren't actively described in your grammar as free-form tokens (uppercase alphabetical).
Now, we map the rule definitions onto this:
RpnGrammar() : RpnGrammar::base_type(_start)
{
_start = *_instruction;
_instruction = _operand | _mod_command | _other_command;
_operand = _strict_double | qi::int_;
_mod_command = _command >> _modifier;
_other_command = qi::as_string [ +qi::char_("A-Z") ];
// helpers
_command.add("TARGET", Ast::TARGET)("SENSOR", Ast::SENSOR)("PAIR", Ast::PAIR);
_modifier.add("ON", Ast::ON)("OFF", Ast::OFF)("ERASE", Ast::ERASE);
}
The grammar parses the input into a list of instructions (Ast::RpnMachineProgram), where each instruction is either an operand or an operation (a command with modifier, or any other free-form command like "HSFF"). Here are the rule declarations:
qi::rule<It, Ast::RpnMachineProgram(), Skipper> _start;
qi::rule<It, Ast::RpnMachineInstruction(), Skipper> _instruction;
qi::rule<It, Ast::ModifiedCommand(), Skipper> _mod_command;
qi::rule<It, Ast::Operand(), Skipper> _operand;
// note: omitting the Skipper has the same effect as wrapping with `qi::lexeme`
qi::rule<It, Ast::OtherCommand()> _other_command;
qi::real_parser<double, boost::spirit::qi::strict_real_policies<double> > _strict_double;
qi::symbols<char, Ast::Command> _command;
qi::symbols<char, Ast::Modifier> _modifier;
You can see it parse the sample from the question:
Parse succeeded, 10 stack instructions
int:0 int:1 'HSFF'
int:41 SENSOR [ON]
int:1 double:12 'BH'
int:4 'LNON'
The output is created with a sample visitor that you could use as inspiration for an interpreter/executor.
See it Live On Coliru
Full Listing
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace Ast
{
enum Command { NO_CMD, TARGET, SENSOR, PAIR };
enum Modifier { NO_MODIFIER, ON, OFF, ERASE };
struct ModifiedCommand
{
Command cmd = NO_CMD;
Modifier mod = NO_MODIFIER;
};
struct OtherCommand
{
std::string token;
OtherCommand(std::string token = "") : token(std::move(token))
{ }
};
typedef boost::variant<int, double> Operand;
typedef boost::variant<Operand, ModifiedCommand, OtherCommand> RpnMachineInstruction;
typedef std::vector<RpnMachineInstruction> RpnMachineProgram;
// for printing, you can adapt this to execute the stack instead
struct Print : boost::static_visitor<std::ostream&>
{
Print(std::ostream& os) : os(os) {}
std::ostream& os;
std::ostream& operator()(Ast::Command cmd) const {
switch(cmd) {
case TARGET: return os << "TARGET" << " ";
case SENSOR: return os << "SENSOR" << " ";
case PAIR: return os << "PAIR" << " ";
case NO_CMD: return os << "NO_CMD" << " ";
default: return os << "#INVALID_COMMAND#" << " ";
}
}
std::ostream& operator()(Ast::Modifier mod) const {
switch(mod) {
case ON: return os << "[ON]" << " ";
case OFF: return os << "[OFF]" << " ";
case ERASE: return os << "[ERASE]" << " ";
case NO_MODIFIER: return os << "[NO_MODIFIER]" << " ";
default: return os << "#INVALID_MODIFIER#" << " ";
}
}
std::ostream& operator()(double d) const { return os << "double:" << d << " "; }
std::ostream& operator()(int i) const { return os << "int:" << i << " "; }
std::ostream& operator()(Ast::OtherCommand const& cmd) const {
return os << "'" << cmd.token << "'\n";
}
std::ostream& operator()(Ast::ModifiedCommand const& cmd) const {
(*this)(cmd.cmd);
(*this)(cmd.mod);
return os << "\n";
}
template <typename... TVariant>
std::ostream& operator()(boost::variant<TVariant...> const& v) const {
return boost::apply_visitor(*this, v);
}
};
}
BOOST_FUSION_ADAPT_STRUCT(Ast::ModifiedCommand, (Ast::Command, cmd)(Ast::Modifier, mod))
template <typename It, typename Skipper = qi::space_type>
struct RpnGrammar : qi::grammar<It, Ast::RpnMachineProgram(), Skipper>
{
RpnGrammar() : RpnGrammar::base_type(_start)
{
_command.add("TARGET", Ast::TARGET)("SENSOR", Ast::SENSOR)("PAIR", Ast::PAIR);
_modifier.add("ON", Ast::ON)("OFF", Ast::OFF)("ERASE", Ast::ERASE);
_start = *_instruction;
_instruction = _operand | _mod_command | _other_command;
_operand = _strict_double | qi::int_;
_mod_command = _command >> _modifier;
_other_command = qi::as_string [ +qi::char_("A-Z") ];
}
private:
qi::rule<It, Ast::RpnMachineProgram(), Skipper> _start;
qi::rule<It, Ast::RpnMachineInstruction(), Skipper> _instruction;
qi::rule<It, Ast::ModifiedCommand(), Skipper> _mod_command;
qi::rule<It, Ast::Operand(), Skipper> _operand;
// note: omitting the Skipper has the same effect as wrapping with `qi::lexeme`
qi::rule<It, Ast::OtherCommand()> _other_command;
qi::real_parser<double, boost::spirit::qi::strict_real_policies<double> > _strict_double;
qi::symbols<char, Ast::Command> _command;
qi::symbols<char, Ast::Modifier> _modifier;
};
int main()
{
std::ifstream ifs("input.txt");
typedef boost::spirit::istream_iterator It;
ifs.unsetf(std::ios::skipws);
RpnGrammar<It> grammar;
It f(ifs), l;
Ast::RpnMachineProgram program;
bool ok = qi::phrase_parse(f, l, grammar, qi::space, program);
if (ok)
{
std::cout << "Parse succeeded, " << program.size() << " stack instructions\n";
std::for_each(
program.begin(),
program.end(),
Ast::Print(std::cout));
}
else
{
std::cout << "Parse failed\n";
}
if (f != l)
{
std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
}

Resources