OPIA is an Object-Oriented but low-level language that I am designing for z80 models. The entire language is currently in an experimental/hypothetical state, so much of what is on the following site might contradict this post (I am still revising it), but you can read it for reference: http://dancookplusplus.blogspot.com/
I've been reforming the language for some time because I've made it too complex while trying to also keep it simple (and I've taken some advice from Kllrnohj about the interpreted elements), but I think I've found some middle ground and made some good improvements since my last "announcement" of it here. I am posting here both to inform, and to hopefully receive some feedback.
Basically, I am removing almost all interpreted aspects of the language in favor of having the compiler use data-flow analysis to optimize code (but having an operator to ENSURE that certain removal-optimizations occur; I will explain), and I have removed the ability to store interpreted late-evaluated expressions in favor of now providing closures, function-pointers, and anonymous functions. Now comes the part where I elaborate on these features:
Function pointers
----------------------
A function-pointer datatype is declared with a list of parameter-types and an optional return-type in square brackets, separated with the => ("returns") operator. I borrowed the syntax from a proposal for the Java 7 update, but chose the "[...]" syntax over their "{...}" syntax because it would be easier to parse (since array indexing must always follow a variable). Examples:
Code:
[byte, char => bool] f1; // f1 can point to a "bool function(byte, char)"
[byte, char] f2; // f2 can point to a "void function(byte, char)"
[=>bool] f3; // f3 can point to a "bool function( )"
[ ] f4; // f4 can point to a "void function( )"
funcPtr = [byte,byte]func; // using a type-cast to specify WHICH version of func to point to
These are stored as simple pointers, with compile-time checks to ensure consistency (the language will not allow a non-function pointer to be "invoked" as a function). A function pointer which does not specify a target return-type (as in f2 and f4 above) can point to any function with compatible argument types (if the return type is not void, then it is simply ignored). Ambiguous function names would mostly be resolved by context (i.e. the type of the pointer), but for extra clarity, a type-cast can be used to specify which function of the given name to use (as in funcPtr above).
Anonymous Functions
---------------------------
Anonymous functions (or "function literals" if you prefer) are declared by inserting argument names into a function-pointer type and following it with a function body. An optional function-name (as in "factorial" below) allows for recursion, but is immediately forgotten afterward (i.e. usable only within the anonymous function-body). Any of the following could be coded where a function-pointer is expected (the result being that the function is stored elsewhere in the program, and a pointer to it is inserted in its place):
Code:
[char a, byte b => bool] { return (a < b); }
[byte n] { return n*2; }
[=>char] { char c; /* compute c */ return c; }
[ ] { /* do something */ }
[byte n=>byte] factorial { return (n < 3) ? n : n * factorial(n-1); }
Functions as Closures
---------------------------
A closure is a function that is declared inside of another function and contains references to the entities (variables) within the outer function. This allows a function to pass a closure to another function to designate an action to be performed upon its local entities. Closures were originally introduced to OPIA to replace a messy system of late-evaluated expressions (which a function essentially is anyway). Support for this feature brings up some complications:
References to external variables are stored statically within closures, because all OPIA variables are stored statically (i.e. at fixed, predetermined addresses; the compiler uses recursion-detection to push and pop items from the stack when necessary). The resulting quirk is that when a closure is invoked, its "free variables" (as they are called) always affect the current (or last) invocation of the function or context to which they belong. OPIA does not intend to actually "free" such variables from their source contexts, because closures are only meant to be used during the life of the context in which they are declared (this could be gotten around, but I'll call that "dirty coding"). Having free variables persist outside of a function would require each "instance" of a closure to have an associated pointer to a table containing modifiable copies of the free variables -- which requires dynamic allocation, which I'd rather not have happen each time a closure is used. Perhaps a special declaration (e.g. "virtual") could allow for this, but I am currently not planning on supporting that (would it be worth it?)
Interpreted Elements (Replaced with optimization checks)
-------------------------------------------------------------------------
OPIA originated with the idea that one could declare variables, flow-control constructs, and functions which were entirely interpreted by the compiler, as a means to precompute values or to dynamically modify a program based on factors that should not be part of the actual program. The nastiness of trying to mix interpreted and compiled elements is THE reason that OPIA development has gone in so many circles. I've now remodeled such that ALL code is targeted at the runtime program environment (as it should be), but that entities can be marked with the $ operator to ensure that they are optimized (lifted) out of the program. This operates on the following principle(s):
The compiler goes beyond obvious optimizations (e.g. simplifying expressions) by using data-flow analysis to trace values from variable to variable so as to predict the contents of variables throughout the program. This information is used (where possible) to circumvent variables, either by knowing the value of a variable, or by knowing which variables ought to hold the same values (and deeper into the pipeline, which registers hold the same values and what values or variables might be associated with each). These optimizations remove the need for language features to be used on the GROUNDS that they are "more efficient" (e.g. when to use constant-reference parameters in C++). For example, if a function is inlined, the compiler can detect which variables are redundant and refer directly to their source information (other variables, or exact values). ... At that point, coding methodology becomes a matter of NEED or PREFERENCE, without having to worry about which is "better" (e.g. declaring a class-member as "static" because "static variables/methods are more efficient" -- the compiler will detect when non-static members do not rely on instance information).
Anyway, with all that said, I am still providing $ as the "interpret" operator, but all it does is ensure that certain optimizations occur. For example, prefixing a datatype with $ in a variable declaration will cause a compilation error if the associated variable cannot be lifted out of the program (it also hints to the compiler to circumvent that variable when it has the choice). The $ operator also replaces the "inline" keyword for functions, and can be applied either to a function declaration or to a call to any function (to inline a function whether or not that is its typical behavior). Of course, the compiler will still detect when a function ought to be inlined anyway, but the $ operator is a guarantee that it will be. The $ operator can also be applied to flow-control constructs to cause If's to be stripped away (leaving only the "true" code) and loops to be unrolled. The $$ operator (doubling it) makes it recursive, such that a construct and all the constructs it contains are to be "interpreted". Functions and control-flow constructs can be marked with $$, but not variables, where it would be redundant: even if it meant that the variable must reduce to an exact value rather than to another variable, either the context of use already restricts this enough, or it doesn't -- in which case the optimization was pointless anyway.
To sum up, the $ and $$ operators simply ensure that inlining and lifting optimizations happen. These optimizations are done anyway (except for inlining and loop unrolling, which are only done automatically when they are determinably "better"), but this guarantees them (and provides hints). All the while, everything about the language is still targeted at JUST the runtime environment; there is no goofy "interpreted layer" ABOVE the program to deal with.