CURRENT ROADMAP:
I've cleaned up the Grammar some: I separated preprocessor directives from the rest of the grammar (which allows #if to be intermixed anywhere), made the explanations MUCH shorter, and compacted the conventions into a small table.
I need to clean up the syntax classes section a bit, and then the example walkthrough at the bottom (which is WAY out of date).
After that, I will bring the Overview up to date, cut the explanations down significantly, and try to provide more relevant examples.
So first a bit of maintenance, and then the coding. I hope to get that started before the new year.
My current thoughts on coding:
* I will get rid of the Meta tokens used for indicating errors, line numbers, and files. Instead, each phase of the compilation pipeline will take an object that receives error messages, file info, etc., so that none of that ever dirties up the tokenized code (see the first sketch after this list).
* I will still have the Token class return a generic "end of line" or "end of input" token as they come, but as signals rather than as marks to be "kept" within the token stream (also shown in the first sketch below).
* Preprocessing will act as a filter between the Tokenization and Syntax-Parsing layers: it will use the Tokenizer to get tokens for preprocessor directives as needed, but skip tokenization altogether when it needs to skip over code looking for an #endif or an #else (though it must still recognize strings and comments so it can step over them safely). It will keep track of line numbers and file info (either explicitly or via some interface passed in), and otherwise just pass tokens straight from the Tokenizer to the Parser (see the second sketch below).
* The parser will act as if it is only ever working with one file. However, to keep namespace and "using" instructions from appearing more than once, or anywhere but the top of a file, the preprocessor will initiate a separate parsing context for each file. From an API standpoint, the Parser can be invoked on multiple files, and the results can be combined (see the third sketch below).
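To make the first two points concrete, here is a minimal sketch in Java. All of the names (ErrorSink, TokenType, isSignal, etc.) are hypothetical, just to illustrate the shape: errors go to a sink object instead of becoming Meta tokens, and end-of-line/end-of-input come back as signal tokens that the consumer reacts to but never keeps.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical names throughout; a sketch of the shape, not the real API.
    interface ErrorSink {
        void error(String file, int line, String message);
    }

    enum TokenType { WORD, END_OF_LINE, END_OF_INPUT }

    final class Token {
        final TokenType type;
        final String text;
        Token(TokenType type, String text) { this.type = type; this.text = text; }
        // End-of-line / end-of-input are signals: callers react to them
        // (e.g. to advance line counts) but never store them in the stream.
        boolean isSignal() { return type != TokenType.WORD; }
    }

    final class Tokenizer {
        private final String src, file;
        private final ErrorSink errors;  // errors go here, never into the stream
        private int pos = 0, line = 1;

        Tokenizer(String src, String file, ErrorSink errors) {
            this.src = src; this.file = file; this.errors = errors;
        }

        Token next() {
            while (pos < src.length() && src.charAt(pos) == ' ') pos++;
            if (pos >= src.length()) return new Token(TokenType.END_OF_INPUT, "");
            char c = src.charAt(pos);
            if (c == '\n') { pos++; line++; return new Token(TokenType.END_OF_LINE, "\n"); }
            if (Character.isLetter(c)) {
                int start = pos;
                while (pos < src.length() && Character.isLetter(src.charAt(pos))) pos++;
                return new Token(TokenType.WORD, src.substring(start, pos));
            }
            // A stray character is reported out-of-band, not turned into a Meta token.
            errors.error(file, line, "unexpected character '" + c + "'");
            pos++;
            return next();
        }
    }

    public class Demo {
        public static void main(String[] args) {
            ErrorSink sink = (f, l, m) -> System.err.println(f + ":" + l + ": " + m);
            Tokenizer t = new Tokenizer("hello % world\nbye", "demo.src", sink);
            List<Token> kept = new ArrayList<>();
            for (Token tok = t.next(); ; tok = t.next()) {
                if (tok.isSignal()) {                        // react to signals, don't keep them
                    if (tok.type == TokenType.END_OF_INPUT) break;
                    continue;                                // END_OF_LINE: just move on
                }
                kept.add(tok);
            }
            kept.forEach(tok -> System.out.println(tok.text)); // hello, world, bye
        }
    }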
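For the preprocessor-as-filter idea, a rough second sketch, again with made-up names. It works line-by-line for simplicity where the real thing would sit on the token stream, but it shows the point: #if/#else/#endif are evaluated and consumed by the filter, nesting is tracked on a stack, and while skipping an inactive region nothing is tokenized beyond noticing comments (and, in the real version, strings) so that a "#endif" inside one is not mistaken for a directive.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.Set;

    // Hypothetical sketch: the filter decides which lines reach the Tokenizer.
    final class Preprocessor {
        private final Set<String> defined;
        private final Deque<Boolean> active = new ArrayDeque<>(); // one entry per open #if
        private boolean inBlockComment = false;

        Preprocessor(Set<String> defined) { this.defined = defined; }

        /** Returns true if this line should be passed through to the Tokenizer. */
        boolean filter(String line) {
            String t = line.trim();
            // While skipping, track /* ... */ so directives inside comments are ignored.
            if (inBlockComment) {
                if (t.contains("*/")) inBlockComment = false;
                return false;
            }
            if (!isActive() && t.contains("/*") && !t.contains("*/")) {
                inBlockComment = true;
                return false;
            }
            if (t.startsWith("#if ")) {
                active.push(isActive() && defined.contains(t.substring(4).trim()));
                return false;
            }
            if (t.equals("#else")) {
                boolean was = active.pop();
                active.push(!was && isActive());
                return false;
            }
            if (t.equals("#endif")) {
                active.pop();
                return false;
            }
            return isActive();
        }

        private boolean isActive() {
            return active.stream().allMatch(b -> b); // active only if every open #if is
        }
    }

    public class PreprocessorDemo {
        public static void main(String[] args) {
            Preprocessor pp = new Preprocessor(Set.of("DEBUG"));
            String[] src = {
                "alpha",
                "#if DEBUG", "beta", "#else", "gamma", "#endif",
                "#if RELEASE", "/* #endif in a comment", "is ignored */", "delta", "#endif",
                "omega"
            };
            for (String line : src)
                if (pp.filter(line)) System.out.println(line);
            // prints: alpha, beta, omega
        }
    }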
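And for the one-file-at-a-time parser, a third toy sketch of the contract: each parseFile call is its own context, so "namespace" can be required once at the top without the parser ever knowing other files exist, and a driver combines the per-file results afterward.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch: the Parser pretends each file is the whole world.
    final class ParseResult {
        final String file, namespace;
        final List<String> declarations;
        ParseResult(String file, String namespace, List<String> declarations) {
            this.file = file; this.namespace = namespace; this.declarations = declarations;
        }
    }

    final class Parser {
        /** One invocation == one file == one fresh parsing context. */
        ParseResult parseFile(String file, List<String> lines) {
            String namespace = null;
            List<String> decls = new ArrayList<>();
            for (String line : lines) {
                if (line.startsWith("namespace ")) {
                    if (namespace != null || !decls.isEmpty())
                        throw new IllegalStateException(file + ": namespace must appear once, at the top");
                    namespace = line.substring("namespace ".length()).trim();
                } else {
                    decls.add(line.trim());
                }
            }
            return new ParseResult(file, namespace, decls);
        }
    }

    public class DriverDemo {
        public static void main(String[] args) {
            Parser parser = new Parser();
            // The driver invokes the parser once per file, then merges the
            // results, e.g. grouping declarations by namespace.
            List<ParseResult> results = List.of(
                parser.parseFile("a.src", List.of("namespace App", "class Foo")),
                parser.parseFile("b.src", List.of("namespace App", "class Bar")));
            Map<String, List<String>> byNamespace = new HashMap<>();
            for (ParseResult r : results)
                byNamespace.computeIfAbsent(r.namespace, k -> new ArrayList<>())
                           .addAll(r.declarations);
            System.out.println(byNamespace); // {App=[class Foo, class Bar]}
        }
    }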
I don't think that part will take very long; I see it as a precursor to churning out the syntax-parsing code, which will spit out a syntax tree. Hopefully that will go fast too, since everything is already spelled out in the grammar. At that point, I'll have something that can "do stuff" with code, and even be interfaced with at an API level.