2008-07-30

BZW Spirit Parser -- Proof of Concept

I've been hacking away at spirit and some sort of BZW-parsed storage structure, and this is what I have so far:
  Example test BZW file:
    test.bzw
  Preliminary Parser source file:
    parser.cpp
    Things left out
  • Most BZW objects and parameters
  • Blocks within Blocks
  • Any sort of sane group management
  • Some method for pulling data out of the Parser object
  • Meaningful errors
  • etc.
I'm aware of the ugliness of having all that junk in one source file but this was merely a test which took way longer than I would have liked. However I learned much more about Boost, Spirit and C++ in general (templates, functors). Coming from a mostly C background, the magic of templates and functors is quite new and interesting to me. C++'s use of the & for references threw me for a loop when I was expecting a C-like result. All is well now, it would seem. I do wish the people at Boost/Spirit would clean up the error messages (hint: I forgot a const at the end of what is now line 144 of parser.cpp). I guess there's not a lot they can do, though. Silly templates.

1 comment:

Anonymous said...

I'm not a real fan of "using". There's really nothing wrong with fully qualifying stuff. For example, I don't know boost inside and out, so I can't tell what you are using from boost.

As a general rule, pass read-only arguments as const& (e.g., std::string const& arg) rather than by value. It avoids an extra copy that may be non-trivial for some data types. This rule does not apply to plain-old-data (POD), i.e., int/float/char*, because the size/cost of the copy is equal to or less than the size of the reference.

Line 19: There is no need to explicitly initialize a value that has a default constructor that does the same thing. That is, a default constructed std::string is identical to std::string(""). It's not wrong, but neither is it right ;-)

Line 24: What's this? You have a template function (meaning it can take two arguments of any type, then you construct a string passing it those two arguments. This is a recipe for disaster. I'd really encourage you to add comments describing what/why you are doing stuff (what is the Debug class supposed to do? Why did you create an "action" method (poorly named) instead of just putting the stream statement in-line)

You don't need to define a constructor/destructor/operator= if the compiler generated version will do the same thing. Line 43 is not needed.

Line 49: set identifier in the member initializer list, not the constructor body. This seems to be an issue in most of your classes.

I strongly suspect that 3 levels of nested classes is not really what you want/need.

Your AddXxx structures have an operator() that takes a single template argument and passes it to a function that requires a string. I really doubt you want/need these functions to be templates.

Classes/structure (even functor) names should be noun-based; method names should be verb-based. AddXxx is a poor class/structure name. Rather than AddParameter, consider ParameterAdder.

In general, I find it a bad idea to have iterators as class members. An iterator is like a loop index, except it should be kept very close to the iteration. Too many things can happen to invalidate an iterator, leading to difficult to diagnose run-time errors.

Line 239: apart from being insanely long and complex, I think you may have a scope issue here. The return from std::map::insert is a std::pair<bool, std::map::iterator>. Taking the address of this return value (or a member of the return value is a sure recipe for disaster. It may offend your terse coding style, but try this instead:

parsedBlocks.insert( pair<string, Block>(identifier, Block( *registeredBlocks_i->second) )) );
current = &parsedBlocks[identifier];

I can't comment on your boost::spirit stuff, because I have no clue what it does.

Because of all the in-line code, it's really hard to tell what state (members) each class has. I wonder if your parameter classes for "name" and "size" shouldn't be instantiated for each block they apply to, rather than being global.

I also can't see what happens after everything is parsed. Surely you want to create some real objects from this parsed nonsense, don't you?