2008-08-10

Closing Block Tags

I don't understand why define and face have their own special closing tags, enddef and endface respectively. I don't see what the benefit is over just using the regular end. Any hints?

2008-08-01

World to Parser to Objects to World

Several design/implementation issues/choices have been swaying me back and forth. I think the following is a fairly sane plan:
  1. World is instantiated, stream to BZW passed
  2. World instantiates Parser, registers anything necessary*
  3. World passes BZW stream, possibly reference to self, to Parser
  4. Parser runs BZW source through Spirit parser.
  5. Spirit parser instantiates World Objects (populating the World reference)
  6. Spirit sends World Objects individual line data, such as size or position
  7. Parsing complete, World populated
  8. Any remaining group/magic left is handled
  9. Populated World object returned
(*: This depends on whether or not the Parser will know about objects beforehand (hard coded), or if the World should inform the Parser of objects through an interface.)
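
A minimal sketch of the plan above, to make the hand-offs concrete. Everything in it is an assumption made for illustration: the class shapes, readLine, registerObject and loadWorld are hypothetical names, a plain line loop stands in for the Spirit parser, and a free function stands in for the World creating its own Parser. Step 8 (groups and other magic) is left out.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface: each World Object is handed its own parameter lines.
struct WorldObject {
    virtual ~WorldObject() = default;
    virtual void readLine(const std::string& line) = 0;
};

struct Box : WorldObject {
    std::vector<std::string> lines;  // raw parameter lines, kept unparsed here
    void readLine(const std::string& line) override { lines.push_back(line); }
};

class World {
public:
    void add(std::unique_ptr<WorldObject> object) { objects_.push_back(std::move(object)); }
    std::size_t objectCount() const { return objects_.size(); }
private:
    std::vector<std::unique_ptr<WorldObject>> objects_;
};

// Stand-in for the Spirit-based Parser: a simple line loop, not Spirit itself.
class Parser {
public:
    using Factory = std::unique_ptr<WorldObject> (*)();

    // The "World informs the Parser" variant of the footnote.
    void registerObject(const std::string& keyword, Factory make) { factories_[keyword] = make; }

    void parse(std::istream& bzw, World& world) const {
        std::unique_ptr<WorldObject> current;
        std::string line;
        while (std::getline(bzw, line)) {
            std::istringstream words(line);
            std::string first;
            if (!(words >> first)) continue;               // blank line
            if (!current) {
                auto found = factories_.find(first);
                if (found != factories_.end()) current = found->second();  // step 5
            } else if (first == "end") {
                world.add(std::move(current));             // leaves current empty
            } else {
                current->readLine(line);                   // step 6
            }
        }
    }
private:
    std::map<std::string, Factory> factories_;
};

std::unique_ptr<WorldObject> makeBox() { return std::unique_ptr<WorldObject>(new Box); }

World loadWorld(std::istream& bzw) {
    World world;                            // step 1
    Parser parser;                          // step 2
    parser.registerObject("box", makeBox);
    parser.parse(bzw, world);               // steps 3 to 7
    return world;                           // step 9
}
```

Calling loadWorld on a stream that holds two box blocks should give back a World with two objects. Unknown top-level keywords are silently skipped in this sketch, which a real parser would want to report.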

Letting the World Objects "parse" themselves (similar to the current implementation in BZFS' BZW reader) allows for some code reuse on the World Object end (much can be copied from BZFS' Custom*.read implementation), a less complicated Parser object (with improved cohesion and decoupling), and less Parser set-up, since the Parser needs to know less. Things such as duplicate entries would be managed by the World Object's parsing, which could take advantage of the standard library's string/stream extraction operators.
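
Here is a rough sketch of what "parsing themselves" could look like for one object, using stream extraction on a single line. The Box struct, its read method and the choice to reject duplicate entries are assumptions for illustration only; this is not the BZFS Custom*.read code.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical World Object that understands two of its own parameters.
struct Box {
    float position[3] = {0.0f, 0.0f, 0.0f};
    float size[3] = {1.0f, 1.0f, 1.0f};

    // Returns false for unknown, malformed, or duplicate lines.
    bool read(const std::string& line) {
        std::istringstream in(line);
        std::string key;
        if (!(in >> key)) return false;
        if (key == "position") return readTriple(in, position, havePosition_);
        if (key == "size") return readTriple(in, size, haveSize_);
        return false;
    }

private:
    bool havePosition_ = false;
    bool haveSize_ = false;

    static bool readTriple(std::istream& in, float (&target)[3], bool& seen) {
        if (seen) return false;                  // duplicate entry: keep the first
        float x, y, z;
        if (!(in >> x >> y >> z)) return false;  // fewer than three numbers
        target[0] = x;
        target[1] = y;
        target[2] = z;
        seen = true;
        return true;
    }
};
```

The Parser only has to hand over lines; the object decides what a duplicate or a short line means.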

2008-07-30

BZW Spirit Parser -- Proof of Concept

I've been hacking away at spirit and some sort of BZW-parsed storage structure, and this is what I have so far:
  Example test BZW file:
    test.bzw
  Preliminary Parser source file:
    parser.cpp
  Things left out:
  • Most BZW objects and parameters
  • Blocks within Blocks
  • Any sort of sane group management
  • Some method for pulling data out of the Parser object
  • Meaningful errors
  • etc.
I'm aware of the ugliness of having all that junk in one source file, but this was merely a test, and one that took way longer than I would have liked. Still, I learned much more about Boost, Spirit and C++ in general (templates, functors). Coming from a mostly C background, the magic of templates and functors is quite new and interesting to me. C++'s use of & for references threw me for a loop when I was expecting a C-like result. All is well now, it would seem. I do wish the people at Boost/Spirit would clean up the error messages (hint: I forgot a const at the end of what is now line 144 of parser.cpp). I guess there's not a lot they can do, though. Silly templates.

2008-07-11

Spirit

Please allow me to contradict myself. In my last post, I spoke of writing a parser from scratch. Things have inevitably changed. Now the plan is to use the Boost::Spirit parsing library to flexibly parse BZW files. It's a good idea to use Spirit because it's as flexible as one can be with a Backus-Naur Form based engine, and the majority of the work is already done for me. In addition, adding Spirit to BZFlag isn't all too deadly compared to other dependencies: Spirit is composed entirely of header files, utilizing all forms of templates I probably would prefer not to know about. The downside is that it's a lot of header files. Fortunately the valuable bare bones can be extracted from Boost via a supplied tool, but enough about that.

Unfortunately for the Spirit developers, they are restricted to using C++ operators. This brings about ambiguous situations such as
*real_p
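which matches (rather, parses and matches) zero or more real numbers. The issue is obvious: C++ has no postfix *, so the repetition star has to be written in front of its operand, where it reads like a pointer dereference. So throwing chunks of Spirit code amongst C++ code can be very messy indeed, not to mention the ugliness of C++ templates.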
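
To show where the prefix star comes from, here is a toy sketch of overloading unary operator* to mean "zero or more". This is not Spirit and not how Spirit is implemented; RealNumber, ZeroOrMore and match are hypothetical names invented for the example.

```cpp
#include <cstddef>
#include <cstdlib>

// Toy parser: matches one real number plus any spaces that follow it.
struct RealNumber {
    // Returns the number of characters consumed, or 0 if nothing matched.
    std::size_t match(const char* text) const {
        char* end = nullptr;
        std::strtod(text, &end);
        if (end == text) return 0;
        while (*end == ' ') ++end;
        return static_cast<std::size_t>(end - text);
    }
};

// Toy repetition wrapper: applies the inner parser until it stops matching.
template <typename P>
struct ZeroOrMore {
    P inner;
    // Consuming 0 characters is still a success here: zero repetitions.
    std::size_t match(const char* text) const {
        std::size_t total = 0;
        std::size_t consumed = inner.match(text);
        while (consumed != 0) {
            total += consumed;
            consumed = inner.match(text + total);
        }
        return total;
    }
};

// C++ only offers * as a prefix unary operator, hence *parser rather than parser*.
inline ZeroOrMore<RealNumber> operator*(const RealNumber& p) {
    return ZeroOrMore<RealNumber>{p};
}
```

With a RealNumber named real_p, the expression (*real_p).match("1.5 2 -3e2") consumes all ten characters of the input.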

It's a learning experience with Spirit, transposing BNF notation from paper to Spirit's odd-looking syntax. All of this helped resurface some old questions: If objects are to handle their own parsing, how should this be done? Should they all have a Spirit parser within, or should they go through some abstracted Parser, telling it which information the object is interested in and maybe providing a callback for the Parser to use when it finds that information? The latter seems like a good idea, and it could take advantage of Spirit's dynamic parser composition features. Plug-ins that define new Objects could simply document what sort of information they expect, and the Parser could handle them like every other object. No need for Spirit code to spaghetti into other people's source. All we need is a BNF document for the BZW format and we're off to the races.
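
The callback idea might look something like the sketch below. It is only an assumption about the shape of such an interface: PropertyParser, expect and feed are hypothetical names, and plain whitespace splitting stands in for whatever Spirit would do behind the abstraction.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstracted Parser: objects register interest, Spirit stays hidden.
class PropertyParser {
public:
    using Values = std::vector<std::string>;
    using Callback = std::function<void(const Values&)>;

    // An object says which identifier it cares about and how to receive it.
    void expect(const std::string& identifier, Callback callback) {
        callbacks_[identifier] = std::move(callback);
    }

    // Returns false when nobody registered interest in this line's identifier.
    bool feed(const std::string& line) const {
        std::istringstream in(line);
        std::string identifier;
        if (!(in >> identifier)) return false;
        auto found = callbacks_.find(identifier);
        if (found == callbacks_.end()) return false;
        Values values;
        for (std::string value; in >> value;) values.push_back(value);
        found->second(values);
        return true;
    }

private:
    std::map<std::string, Callback> callbacks_;
};
```

A plug-in object would call expect once per parameter it understands, and never see any parser internals.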

The only difficulty in writing BNF for the BZW format is that too many generalizations break the format, while none at all makes the parser very specific. So, the goal is to figure out which rules can be bent. For example, one such rule is the name field:
world
  name Simple World
  size 100.0
end
How should it be treated? How many "values" are there? If I treat it like a generic property (good examples are position and size, properties that have anywhere from one to many values associated with them), then it is left to the object to deal with all the values. For example, if an object's callback were handed a name like the one above, the object would have to pop all the values off a stack or queue and join them into a string to get the name back. There are alternative options for passing back the data that could solve this problem, but it's easier to solve it upfront in the parser. Should the object perhaps inform the parser exactly how many values it expects? Things to be decided. More to come soon.

Edit: I realize I may have been vague in the preceding paragraph. Allow me to clarify: By "values" I refer to space-separated character groups, like "Simple" and "World" and "100.0". The problem is differentiating between [<identifier> <value> <value>] and [<identifier> <magical value that continues until the end of line>]. Simple solution: Two different types of parameters.
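
A small sketch of the two parameter types, assuming the caller already knows which kind a given identifier is. ParamKind and parseParameter are hypothetical names, not part of any existing BZFlag or Spirit code.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical split between the two kinds of parameter described above.
enum class ParamKind {
    Values,     // <identifier> <value> <value> ...
    RestOfLine  // <identifier> <one value running to the end of the line>
};

// Returns the values that follow the identifier on one line.
std::vector<std::string> parseParameter(const std::string& line, ParamKind kind) {
    std::istringstream in(line);
    std::string identifier;
    std::vector<std::string> values;
    if (!(in >> identifier)) return values;  // blank line: nothing to report

    if (kind == ParamKind::RestOfLine) {
        std::string rest;
        std::getline(in >> std::ws, rest);   // skip the separator, keep inner spaces
        if (!rest.empty()) values.push_back(rest);
    } else {
        for (std::string value; in >> value;) values.push_back(value);
    }
    return values;
}
```

Parsed as RestOfLine, the name line from the example world yields the single value "Simple World"; parsed as Values, it yields the two values "Simple" and "World".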

2008-06-26

Parsing (the old-fashioned way)

The trend of my development so far has really been parsing. Normally for a parsing problem this complex I would turn to lex/yacc, but additional dependencies are not required, and lex/yacc can be problematic across platforms. The two parser implementations currently in bzfs and bzwb tackle the file very differently. The bzfs implementation does a very hard-coded, procedural run through the file, bloated with ifs and elses, and way too many strcasecmp's for my liking. On the other hand, bzwb parses in a fairly organized manner, breaking the parsing down into a fairly flexible set of functions reliant on a list of supported input, which could be modified fairly easily. This is more along my line of thinking. The only problem I have with it is that it involves reading and storing the file in memory for a while until parsing completes. So, with the spirit of a new library, why not write a new parser?

It's been a bit of a mind-bend so far trying to accommodate some strange features the BZW format currently has, and figuring out a set of rules that abstractly cover all possibilities isn't easy. I originally thought it would be best to supply the parser with the information it needs about what sorts of objects it can read, what sorts of parameters those objects contain, etc., before parsing the actual file. Then I thought it might be better to just read everything in, no matter how valid it might be, and then pull out the required information, complaining about bits that were unnecessary and thus, most likely, invalid. While this approach has the benefit of being very simple to implement, it's not very clean. It leaves a big mess of potential warnings to fire off when it's done, instead of while it's reading, and almost welcomes very messy BZW files, which is a bad idea. After much internal debate, I decided once again to go with my first plan, albeit slightly backwards from how I had originally planned it:
  1. Create and record all the required object information in Parser Object types.
  2. For each Parser Object, create the required parameters.
  3. Feed these objects into a new Parser instance.
  4. Run the Parser, which will use the provided rules
  5. Pull out all the required data for each object and do some magic.
I guess we'll see if it works out, shortly!
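
For what it's worth, here is a rough sketch of those five steps. The names ParserObject, RuleParser, add, run and data are assumptions for illustration, as is describing a parameter by nothing more than its expected value count. The point is that complaints are raised while reading, with line numbers, rather than afterwards.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Steps 1 and 2: a hypothetical description of one object and its parameters.
struct ParserObject {
    std::string name;
    std::map<std::string, std::size_t> parameters;  // parameter -> expected value count
};

class RuleParser {
public:
    void add(const ParserObject& object) { objects_[object.name] = object; }  // step 3

    // Step 4: reads the stream and returns the warnings raised along the way.
    std::vector<std::string> run(std::istream& in) {
        std::vector<std::string> warnings;
        const ParserObject* current = nullptr;
        bool skipping = false;  // inside a block whose object is unknown
        std::string line;
        for (std::size_t number = 1; std::getline(in, line); ++number) {
            std::istringstream words(line);
            std::string first;
            if (!(words >> first)) continue;  // blank line
            std::vector<std::string> values;
            for (std::string value; words >> value;) values.push_back(value);
            const std::string where = "line " + std::to_string(number) + ": ";

            if (first == "end") {
                current = nullptr;
                skipping = false;
            } else if (skipping) {
                continue;
            } else if (!current) {
                auto found = objects_.find(first);
                if (found != objects_.end()) {
                    current = &found->second;
                } else {
                    warnings.push_back(where + "unknown object " + first);
                    skipping = true;
                }
            } else {
                auto found = current->parameters.find(first);
                if (found == current->parameters.end())
                    warnings.push_back(where + "unknown parameter " + first);
                else if (found->second != values.size())
                    warnings.push_back(where + "wrong number of values for " + first);
                else
                    data_[current->name + "." + first] = values;
            }
        }
        return warnings;
    }

    // Step 5: the collected values, keyed as "object.parameter".
    const std::map<std::string, std::vector<std::string>>& data() const { return data_; }

private:
    std::map<std::string, ParserObject> objects_;
    std::map<std::string, std::vector<std::string>> data_;
};
```

This sketch keeps only the last block of each object type, since the keys carry no instance number; a real version would need somewhere to put each instance.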

2008-06-25

Hello, World!

So, I have finally created this page, after much delay.