APPENDICES
Error and warning messages
When the compiler finds an error in a file, it outputs a message giving, in this order:
the name of the file
the line number where the compiler detected the error, between parentheses, directly behind the filename
the error class (“error”, “fatal error” or “warning”)
an error number
a descriptive error message
For example:
demo.p(3) : error 001: expected token: ";", but found "{"
Note: the line number given by the compiler may specify a position behind the actual error, since the compiler cannot always establish an error before having analyzed the complete expression.
After termination, the return code of the compiler is:
0 no errors —there may be warnings, though
1 errors found
2 reserved
3 aborted by user
These return codes may be checked within batch processors (such as the “make” utility).
• Error categories
Errors are separated into three classes:
Type | Description |
---|---|
Errors | Describe situations where the compiler is unable to generate appropriate code. Error messages are numbered from 1 to 99. |
Fatal errors | Fatal errors describe errors from which the compiler cannot recover. Parsing is aborted. Fatal error messages are numbered from 100 to 199. |
Warnings | Warnings are displayed for unintended compiler assumptions and common mistakes. Warning messages are numbered from 200 to 299. |
• Errors
Number | Description |
---|---|
001 | expected token: token, but found token |
A required token is omitted. | |
002 | only a single statement (or expression) can follow each “case” |
Every case in a switch statement can hold exactly one statement. To put multiple statements in a case, enclose these statements between braces (which creates a compound statement). | |
003 | declaration of a local variable must appear in a compound block |
The declaration of a local variable must appear between braces (“{...}”) at the active scope level. | |
When the parser flags this error, a variable declaration appears as the only statement of a function or the only statement below an if, else, for, while or do statement. Note that, since local variables are accessible only from (or below) the scope that their declaration appears in, having a variable declaration as the only statement at any scope is useless. | |
004 | function name is not implemented |
There is no implementation for the designated function. The function may have been “forwardly” declared —or prototyped— but the full function definition including a statement, or statement block, is missing. | |
005 | function may not have arguments |
The function main() is the program entry point. It may not have arguments. | |
006 | must be assigned to an array |
String literals or arrays must be assigned to an array. This error message may also indicate a missing index (or indices) at the array on the right side of the “=” sign. | |
007 | operator cannot be redefined |
Only a select set of operators may be redefined; this operator is not one of them. See page 86 for details. | |
008 | must be a constant expression; assumed zero |
The size of arrays and the parameters of most directives must be constant values. | |
009 | invalid array size (negative or zero) |
The number of elements of an array must always be 1 or more. | |
010 | illegal function or declaration |
The compiler expects a declaration of a global variable or of a function at the current location, but it cannot interpret it as such. | |
011 | invalid outside functions |
The instruction or statement is invalid at a global level. Local labels and (compound) statements are only valid if used within functions. | |
012 | invalid function call, not a valid address |
The symbol is not a function. | |
013 | no entry point (no public functions) |
The file does not contain a main function or any public function. The compiled file thereby does not have a starting point for the execution. | |
014 | invalid statement; not in switch |
The statements case and default are only valid inside a switch statement. | |
015 | “default” must be the last clause in switch statement |
pawn requires the default clause to be the last clause in a switch statement. | |
016 | multiple defaults in “switch” |
Each switch statement may only have one default clause. | |
017 | undefined symbol symbol |
The symbol (variable, constant or function) is not declared. | |
018 | initialization data exceeds declared size |
An array with an explicit size is initialized, but the number of initializers exceeds the number of elements specified. For example, in “arr[3]={1,2,3,4};” the array is specified to have three elements, but there are four initializers. | |
019 | not a label: name |
A goto statement branches to a symbol that is not a label. | |
020 | invalid symbol name |
A symbol may start with a letter, an underscore or an “at” sign (“@”) and may be followed by a series of letters, digits, underscore characters and “@” characters. | |
021 | symbol already defined: identifier |
The symbol was already defined at the current level. | |
022 | must be lvalue (non-constant) |
The symbol that is altered (incremented, decremented, assigned a value, etc.) must be a variable that can be modified (this kind of variable is called an lvalue). Functions, string literals, arrays and constants are not lvalues. Variables declared with the “const” attribute are not lvalues either. | |
023 | array assignment must be simple assignment |
When assigning one array to another, you cannot combine an arithmetic operation with the assignment (e.g., you cannot use the “+=” operator). | |
024 | “break” or “continue” is out of context |
The statements break and continue are only valid inside the context of a loop (a do, for or while statement). Unlike the languages C/C++ and Java, break does not jump out of a switch statement. | |
025 | function heading differs from prototype |
The number of arguments given at a previous declaration of the function does not match the number of arguments given at the current declaration. | |
026 | no matching “#if...” |
The directive #else or #endif was encountered, but no matching #if directive was found. | |
027 | invalid character constant |
One likely cause for this error is the occurrence of an unknown escape sequence, like “\x”. Putting multiple characters between single quotes, as in 'abc', also issues this error message. A third cause for this error is a situation where a character constant was expected, but none (or a non-character expression) was provided. | |
028 | invalid subscript (not an array or too many subscripts): identifier |
The subscript operators “[” and “]” are only valid with arrays. The number of square bracket pairs may not exceed the number of dimensions of the array. | |
029 | invalid expression, assumed zero |
The compiler could not interpret the expression. | |
030 | compound statement not closed at the end of file |
An unexpected end of file occurred. One or more compound statements are still unfinished (i.e. the closing brace “}” has not been found). | |
031 | unknown directive |
The character “#” appears first on a line, but no valid directive was specified. | |
032 | array index out of bounds |
The array index is larger than the highest valid entry of the array. | |
033 | array must be indexed (variable name) |
An array as a whole cannot be used in an expression; you must indicate an element of the array between square brackets. | |
034 | argument does not have a default value (argument index) |
You can only use the argument placeholder when the function definition specifies a default value for the argument. | |
035 | argument type mismatch (argument index) |
The argument that you pass is different from the argument that the function expects, and the compiler cannot convert the passed-in argument to the required type. For example, you cannot pass the literal value “1” as an argument when the function expects an array or a reference. | |
036 | empty statement |
The line contains a semicolon that is not preceded by an expression. pawn does not support a semicolon as an empty statement; use an empty compound block instead. | |
037 | invalid string (possibly non-terminated string) |
A string was not well-formed; for example, the final quote that ends a string is missing, or the filename for the #include directive was not enclosed in double quotes or angle brackets. | |
038 | extra characters on line |
There were trailing characters on a line that contained a directive (a directive starts with a # symbol, see page 117). | |
039 | constant symbol has no size |
A variable has a size (measured in a number of cells), a constant has no size. That is, you cannot use a (symbolic) constant with the sizeof operator, for example. | |
040 | duplicate “case” label (value value) |
A preceding “case label” in the list of the switch statement evaluates to the same value. | |
041 | invalid ellipsis, array size is not known |
You used a syntax like “arr[] = { 1, ... };”, which is invalid, because the compiler cannot deduce the size of the array from the declaration. | |
042 | invalid combination of class specifiers |
A function or variable is denoted as both “public” and “native”, which is unsupported. Other combinations may also be unsupported; for example, a function cannot be both “public” and “stock” (a variable may be declared both “public” and “stock”). | |
043 | character constant exceeds range for packed string |
Usually an attempt to store a Unicode character in a packed string where a packed character is 8 bits. | |
044 | mixing named and positional parameters |
You must either use named parameters or positional parameters for all parameters of the function. | |
045 | too many function arguments |
The maximum number of function arguments is currently limited to 64. | |
046 | unknown array size (variable name) |
For array assignment, the size of both arrays must be explicitly defined, also if they are passed as function arguments. | |
047 | array sizes do not match, or destination array is too small |
For array assignment, the arrays on the left and the right side of the assignment operator must have the same number of dimensions. In addition: | |
- for multi-dimensional arrays, both arrays must have the same size; | |
- for arrays with a single dimension, the array on the left side of the assignment operator must have a size that is equal to or bigger than the one on the right side. | |
When passing arrays to a function argument, these rules also hold for the array that is passed to the function (in the function call) versus the array declared in the function definition. | |
When a function returns an array, all return statements must specify an array with the same size and dimensions. | |
048 | array dimensions do not match |
For an array assignment, the dimensions of the arrays on both sides of the “=” sign must match; when passing arrays to a function argument, the arrays passed to the function (in the function call) must match with the definition of the function arguments. | |
When a function returns an array, all return statements must specify an array with the same size and dimensions. | |
049 | invalid line continuation |
A line continuation character (a backslash at the end of a line) is at an invalid position, for example at the end of a file or in a single line comment. | |
050 | invalid range |
A numeric range with the syntax “n1 .. n2”, where n1 and n2 are numeric constants, is invalid. Either one of the values is not a valid number, or n1 is not smaller than n2. | |
051 | invalid subscript, use “[ ]” operators on major dimensions |
You can use the “array character index” operator (braces: “{ }”) only for the last dimension. For other dimensions, you must use the cell index operator (square brackets: “[ ]”). | |
052 | multi-dimensional arrays must be fully initialized |
If an array with more than one dimension is initialized at its declaration, then there must be equally many literal vectors/subarrays at the right of the equal sign (“=”) as specified for the major dimension(s) of the array. | |
053 | exceeding maximum number of dimensions |
The current implementation of the pawn compiler only supports arrays with one or two dimensions. | |
054 | unmatched closing brace |
A closing brace (“}”) was found without matching opening brace (“{”). | |
055 | start of function body without function header |
An opening brace (“{”) was found outside the scope of a function. This may be caused by a semicolon at the end of a preceding function header. | |
056 | local variables and function arguments cannot be public |
A local variable or a function argument starts with the character “@”, which is invalid. | |
057 | Unfinished expression before compiler directive |
Compiler directives may only occur between statements, not inside a statement. This error typically occurs when an expression statement is split over multiple lines and a compiler directive appears between the start and the end of the expression. This is not supported. | |
058 | duplicate argument; same argument is passed twice |
In the function call, the same argument appears twice, possibly through a mixture of named and positional parameters. | |
059 | function argument may not have a default value (variable name) |
All arguments of public functions must be passed explicitly. Public functions are typically called from the host application, which has no knowledge of the default parameter values. Arguments of user defined operators are implied from the expression and cannot be inferred from the default value of an argument. | |
060 | multiple “#else” directives between “#if ... #endif” |
Two or more #else directives appear in the body between the matching #if and #endif. | |
061 | “#elseif” directive follows an “#else” directive |
All #elseif directives must appear before the #else directive. This error may also indicate that an #endif directive for a higher level is missing. | |
062 | number of operands does not fit the operator |
When redefining an operator, the number of operands that the operator has (1 for unary operators and 2 for binary operators) must be equal to the number of arguments of the operator function. | |
063 | operator requires that the function result has a “bool” tag |
Logical and relational operators are defined as having a result that is either true (1) or false (0) and having a “bool” tag. A user defined operator should adhere to this definition. | |
064 | cannot change predefined operators |
One cannot define operators to work on untagged values, for example, because pawn already defines this operation. | |
065 | function argument may only have a single tag (argument number) |
In a user defined operator, a function argument may not have multiple tags. | |
066 | function argument may not be a reference argument or an array (argument number) |
In a user defined operator, all arguments must be cells (non-arrays) that are passed “by value”. | |
067 | variable cannot be both a reference and an array (variable name) |
A function argument may be denoted as a “reference” or as an array, but not as both. | |
068 | invalid rational number precision in #pragma |
The precision was negative or too high. For floating point rational numbers, the precision specification should be omitted. | |
069 | rational number format already defined |
This #pragma conflicts with an earlier #pragma that specified a different format. | |
070 | rational number support was not enabled |
A rational literal number was encountered, but the format for rational numbers was not specified. | |
071 | user-defined operator must be declared before use (function name) |
Like a variable, a user-defined operator must be declared before its first use. This message indicates that prior to the declaration of the user-defined operator, an instance where the operator was used on operands with the same tags occurred. This may either indicate that the program tries to make mixed use of the default operator and a user-defined operator (which is unsupported), or that the user-defined operator must be “forwardly declared”. | |
072 | “sizeof” operator is invalid on “function” symbols |
You used something like “sizeof MyCounter” where the symbol “MyCounter” is not a variable, but a function. You cannot request the size of a function. | |
073 | function argument must be an array (argument name) |
The function argument is a constant or a simple variable, but the function requires that you pass an array. | |
074 | #define pattern must start with an alphabetic character |
Any pattern for the #define directive must start with a letter, an underscore (“_”) or an “@”-character. The pattern is the first word that follows the #define keyword. | |
075 | input line too long (after substitutions) |
Either the source file contains a very long line, or text substitutions make a line that was initially of acceptable length grow beyond its bounds. This may be caused by a text substitution that causes recursive substitution (the pattern matching a portion of the replacement text, so that this part of the replacement text is also matched and replaced, and so forth). | |
076 | syntax error in the expression, or invalid function call |
The expression statement was not recognized as a valid statement (so it is a “syntax error”). From the part of the string that was parsed, it looks as if the source line contains a function call in a “procedure call” syntax (omitting the parentheses), but the function result is used (assigned to a variable, passed as a parameter, or used in an expression). | |
077 | malformed UTF-8 encoding, or corrupted file: filename |
The file starts with a UTF-8 signature, but it contains encodings that are invalid UTF-8. If the source file was created by an editor or converter that supports UTF-8, the UTF-8 support is non-conforming. | |
078 | function uses both “return” and “return <value>” |
The function returns both with and without a return value. The function should be consistent in always returning with a function result, or in never returning a function result. | |
079 | inconsistent return types (array & non-array) |
The function returns both values and arrays, which is not allowed. If a function returns an array, all return statements must specify an array (of the same size and dimensions). | |
080 | unknown symbol, or not a constant symbol (symbol name) |
Where a constant value was expected, an unknown symbol or a non-constant symbol (variable) was found. | |
081 | cannot take a tag as a default value for an indexed array parameter (symbol name) |
The tagof operator was used on an array parameter where the array also had an index. This is unsupported. | |
082 | user-defined operators and native functions may not have states |
Only standard and public functions may have states. | |
083 | a function or variable may only belong to a single automaton (symbol name) |
There are multiple automatons in the state declaration for the indicated function or variable, which is not supported. In the case of a function: all instances of the function must belong to the same automaton. In the case of a variable: it is allowed to have several variables with the same name belonging to different automatons, but only in separate declarations —these are distinct variables. | |
084 | state conflict: one of the states is already assigned to another implementation (symbol name) |
The specified state appears in the state specifier of two implementations of the same function. | |
085 | no states are defined for symbol name |
When this error occurs on a function, this function has a fall-back implementation, but no other states. If the error refers to a variable, this variable does not have a list of states between the < and > characters. Use a state-less function or variable instead. | |
086 | unknown automaton name |
The “state” statement refers to an unknown automaton. | |
087 | unknown state name for automaton name |
The “state” statement refers to an unknown state (for the specified automaton). | |
088 | public variables and local variables may not have states (symbol name) |
Only standard (global) variables may have a list of states (and an automaton) at the end of a declaration. | |
089 | state variables may not be initialized (symbol name) |
Variables with a state list attached may not have initializers. State variables should always be explicitly initialized, as their initial value is indeterminate. | |
090 | public functions may not return arrays (symbol name) |
A public function may not return an array. Returning arrays is allowed only for normal functions. |
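As a rough illustration of a few of the errors above (002, 003, 015 and 036), the following sketch shows forms that, according to the descriptions, the compiler accepts. It assumes that the host makes printf available through an include file named console; the variable names are hypothetical, and the code has not been checked against a specific compiler version.

```pawn
#include <console>      // assumption: declares printf for this host

main()
{
    new count = 0;

    switch (count)
    {
        // Error 002: more than one statement under a case needs braces.
        case 0:
        {
            printf("zero\n");
            count++;
        }
        // Error 015: the default clause comes last.
        default:
            printf("other\n");
    }

    // Error 003: a declaration may not be the only statement below "if",
    // so it is placed in a compound block together with its use.
    if (count > 0)
    {
        new doubled = count * 2;
        printf("%d\n", doubled);
    }

    // Error 036: a lone semicolon is not an empty statement;
    // an empty compound block is used instead.
    while (count-- > 0)
    {
    }
}
```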
• Fatal Errors
Number | Description |
---|---|
100 | cannot read from file: filename |
The compiler cannot find the specified file or does not have access to it. | |
101 | cannot write to file: filename |
The compiler cannot write to the specified output file, probably caused by insufficient disk space or restricted access rights (the file could be read-only, for example). | |
102 | table overflow: table name |
An internal table in the pawn parser is too small to hold the required data. Some tables are dynamically growable, which means that there was insufficient memory to resize the table. The “table name” is one of the following: | |
“staging buffer”: the staging buffer holds the code generated for an expression before it is passed to the peephole optimizer. The staging buffer grows dynamically, so an overflow of the staging buffer basically is an “out of memory” error. | |
“loop table”: the loop table is a stack used with nested do, for, and while statements. The table allows nesting of these statements up to 24 levels. | |
“literal table”: this table keeps the literal constants (numbers, strings) that are used in expressions and as initializers for arrays. The literal table grows dynamically, so an overflow of the literal table basically is an “out of memory” error. | |
“compiler stack”: the compiler uses a stack to store temporary information it needs while parsing. An overflow of this stack is probably caused by deeply nested (or recursive) file inclusion. The compiler stack grows dynamically, so an overflow of the compiler stack basically is an “out of memory” error. | |
“option table”: in case that there are more options on the command line or in the response file than the compiler can cope with. | |
103 | insufficient memory |
General “out of memory” error. | |
104 | invalid assembler instruction symbol |
An invalid opcode in an #emit directive. | |
105 | numeric overflow, exceeding capacity |
A numeric constant, notably a dimension of an array, is too large for the compiler to handle. For example, when compiled as a 16-bit application, the compiler cannot handle arrays with more than 32767 elements. | |
106 | compiled script exceeds the maximum memory size (number bytes) |
The memory size for the abstract machine that is needed to run the script exceeds the value set with #pragma amxlimit. This means that the script is too large to be supported by the host. You might try reducing the script’s memory requirements by: | |
- setting a smaller stack/heap area —see #pragma dynamic at page 121; | |
- using packed strings instead of unpacked strings —see pages 99 and 137; | |
- putting repeated code in separate functions; | |
- putting repeated data (strings) in global variables; | |
- trying to find more compact algorithms to perform the same task. | |
107 | too many error/warning messages on one line |
A single line that causes several error/warning messages is often an indication that the pawn parser is unable to “recover” from an earlier error. In this situation, the parser is unlikely to make any sense of the source code that follows —producing only (more) inappropriate error messages. Therefore, compilation is halted. | |
108 | codepage mapping file not found |
The file for the codepage translation that was specified with the -c compiler option or the #pragma codepage directive could not be loaded. | |
109 | invalid path: path name |
A path, for example for include files or codepage files, is invalid. | |
110 | assertion failed: expression |
Compile-time assertion failed. | |
111 | user error: message |
The parser fell on an #error directive. |
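Fatal errors 110 and 111 are the two in this table that a script can raise on purpose. The sketch below is a minimal, hedged example: MAX_ITEMS is a hypothetical constant, and the check on the predefined constant cellbits is only one plausible use of #error.

```pawn
const MAX_ITEMS = 16;   // hypothetical constant

// If this condition were false, compilation would stop with
// fatal error 110 ("assertion failed").
#assert MAX_ITEMS <= 32

#if cellbits < 32
// Reaching this directive stops compilation with
// fatal error 111 ("user error") and the text below.
#error This script assumes cells of at least 32 bits
#endif

main()
{
}
```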
• Warnings
Number | Description |
---|---|
200 | symbol is truncated to number characters |
The symbol is longer than the maximum symbol length. The maximum length of a symbol depends on whether the symbol is native, public or neither. Truncation may cause different symbol names to become equal, which may cause error 021 or warning 219. | |
201 | redefinition of constant/macro (symbol name) |
The symbol was previously defined to a different value, or the text substitution macro that starts with the prefix name was redefined with a different substitution text. | |
202 | number of arguments does not match definition |
At a function call, the number of arguments passed to the function (actual arguments) differs from the number of formal arguments declared in the function heading. To declare functions with variable argument lists, use an ellipsis (...) behind the last known argument in the function heading; for example: print(formatstring,...); (see page 80). | |
203 | symbol is never used: identifier |
A symbol is defined but never used. Public functions are excluded from the symbol usage check (since these may be called from the outside). | |
204 | symbol is assigned a value that is never used: identifier |
A value is assigned to a symbol, but the contents of the symbol are never accessed. | |
205 | redundant code: constant expression is zero |
Where a conditional expression was expected, a constant expression with the value zero was found, e.g. “while (0)” or “if (0)”. | |
The conditional code below the test is never executed, and it is therefore redundant. | |
206 | redundant test: constant expression is non-zero |
Where a conditional expression was expected, a constant expression with a non-zero value was found, e.g. if (1). The test is redundant, because the conditional code is always executed. | |
207 | unknown “#pragma” |
The compiler ignores the pragma. The #pragma directives may change between compilers of different vendors and between different versions of a compiler of the same vendor. | |
208 | function with tag result used before definition, forcing reparse |
When a function is “used” (invoked) before being declared, and that function returns a value with a tag name, the parser must make an extra pass over the source code, because the presence of the tag name may change the interpretation of operators (in the presence of user-defined operators). You can speed up the parsing/compilation process by declaring the relevant functions before using them. | |
209 | function should return a value |
The function does not have a return statement, or it does not have an expression behind the return statement, but the function’s result is used in an expression. | |
210 | possible use of symbol before initialization: identifier |
A local (uninitialized) variable appears to be read before a value is assigned to it. The compiler cannot determine the actual order of reading from and storing into variables and bases its assumption of the execution order on the physical appearance order of statements and expressions in the source file. | |
211 | possibly unintended assignment |
Where a conditional expression was expected, the assignment operator (=) was found instead of the equality operator (==). As this is a frequent mistake, the compiler issues a warning. To avoid this message, put parentheses around the expression, e.g. if ( (a=2) ). | |
212 | possibly unintended bitwise operation |
Where a conditional expression was expected, a bitwise operator (& or |) was found instead of a Boolean operator (&& or ||). In situations where a bitwise operation seems unlikely, the compiler issues this warning. To avoid this message, put parentheses around the expression. | |
213 | tag mismatch |
A tag mismatch occurs when: | |
- assigning to a tagged variable a value that is untagged or that has a different tag | |
- the expressions on either side of a binary operator have different tags | |
- in a function call, passing an argument that is untagged or that has a different tag than what the function argument was defined with | |
- indexing an array which requires a tagged index with no tag or a wrong tag name | |
214 | possibly a “const” array argument was intended: identifier |
Arrays are always passed by reference. If a function does not modify the array argument, however, the compiler can sometimes generate more compact and quicker code if the array argument is specifically marked as “const”. | |
215 | expression has no effect |
The result of the expression is apparently not stored in a variable or used in a test. The expression or expression statement is therefore redundant. | |
216 | nested comment |
PAWN does not support nested comments. | |
217 | loose indentation |
Statements at the same logical level do not start in the same column; that is, the indents of the statements are different. Although pawn is a free format language, loose indentation frequently hides a logical error in the control flow. | |
The compiler can also incorrectly assume loose indentation if the tab size with which you indented the source code differs from the assumed size, see #pragma tabsize on page 122 or the compiler option -t on page 169. | |
218 | old style prototypes used with optional semicolon |
When using “optional semicolons”, it is preferred to declare forward functions explicitly with the forward keyword rather than with a terminating semicolon. | |
219 | local variable identifier shadows a symbol at a preceding level |
A local variable has the same name as a global variable, a function, a function argument, or a local variable at a lower precedence level. This is called “shadowing”, as the new local variable makes the previously defined function or variable inaccessible. | |
Note: if there are also error messages further on in the script about missing variables (with these same names) or brace level problems, it could well be that the shadowing warnings are due to these syntactical and semantic errors. Fix the errors first before looking at the shadowing warnings. | |
220 | expression with tag override must appear between parentheses |
In a case statement and in expressions in the conditional operator (“ ? : ”), any expression that has a tag override should be enclosed between parentheses, to prevent the colon from being misinterpreted as a separator of the case statement or as part of the conditional operator. | |
221 | label name identifier shadows tag name |
A code label (for the goto instruction) has the same name as a previously defined tag. This may indicate a faultily applied tag override; a typical case is an attempt to apply a tag override on the variable on the left of the = operator in an assignment statement. | |
222 | number of digits exceeds rational number precision |
A literal rational number has more decimals in its fractional part than the precision of a rational number supports. The remaining decimals are ignored. | |
223 | redundant “sizeof”: argument size is always 1 (symbol name) |
A function argument has as its default value the size of another argument of the same function. The “sizeof” default value is only useful when the size of the referred argument is unspecified in the declaration of the function; i.e., if the referred argument is an array. | |
224 | indeterminate array size in “sizeof” expression (symbol name) |
The operand of the sizeof operator is an array with an unspecified size. That is, the size of the variable cannot be determined at compile time. If used in an “if” instruction, consider a conditionally compiled section, replacing if by #if. | |
225 | unreachable code |
The indicated code will never run, because an instruction before (above) it causes a jump out of the function, out of a loop or elsewhere. Look for return, break, continue and goto instructions above the indicated line. | |
226 | a variable is assigned to itself (symbol name) |
There is a statement like “x = x” in the code. The parser checks for self assignments after performing any text and constant substitutions, so the left and right sides of an assignment may appear to be different at first sight. For example, if the symbol “TWO” is a constant with the value 2, then “var[TWO] = var[2]” is also a self-assignment. | |
Self-assignments are, of course, redundant, and they may hide an error (assignment to the wrong variable, error in declaring constants). | |
Note that the pawn parser is limited to performing “static checks” only. In this case it means that it can only compare array assignments for self-assignment with constant array indices. | |
227 | more initiallers than enum fields |
An array whose size is declared with an enum symbol contains more values/fields as initiallers than the enumeration defines. | |
228 | length of initialler exceeds size of the enum field |
The size of an array is declared with an enum symbol, and the relevant enumeration field has a size. The initialler in the array contains more values than the size of the enumeration field allows. | |
229 | index tag mismatch (symbol name) |
When indexing an array, the expression used as the index has a different tag than the one in the declaration of the array. See pages 29 and 68 for an explanation and examples. | |
230 | no implementation for state name in function name , no fall-back |
A function is lacking an implementation for the indicated state. The compiler cannot (statically) check whether the function will ever be called in that state, and therefore it issues this warning. When the function would be called for the state for which no implementation exists, the abstract machine aborts with a run time error. | |
See page 83 on how to specify a fall-back function, and page 44 for a description and an example. | |
231 | state specification on forward declaration is ignored |
A state specification is redundant on forward declarations. The function signature must be equal for all states. Only the implementations of the function are state-specific. | |
232 | compaction buffer overflow |
Compact encoding may in some particular cases result in files that would actually be bigger than the non-compact encoding. The abstract machine cannot handle this, as it unpacks the P-code “in place”. When the compiler detects this situation, it re-builds the file with compact encoding switched off. To avoid this warning, force building the file with plain (“non-compact”) encoding (see page 120). | |
233 | state variable name shadows a global variable |
The state variable has the same name as a global variable (without state specifiers). This means that the global variable is inaccessible for a function with one of the same states as those of the variable. | |
234 | function is deprecated (symbol name) |
The script uses a function which is marked as “deprecated”. The host application can mark (native) functions as deprecated when better alternatives for the function are available or if the function may not be supported in future versions of the host application. | |
235 | call to undeclared public function (symbol name) |
The script defines a public function, but no forward declaration of this function is present. Possibly the function name was written incorrectly. The requirement for forward declarations of public functions guards against a common error. | |
236 | unknown parameter in substitution (incorrect #define pattern) |
A #define pattern contains a parameter in the replacement (e.g. “%1”) that is not in the match pattern. See page 93 for the preprocessor syntax. |
The compiler
Many applications that embed the PAWN scripting language use the stand-alone compiler that comes with the PAWN toolkit. The PAWN compiler is a command-line utility, meaning that you must run it from a “console window”, a terminal/shell, or a “DOS box” (depending on what your operating system calls it).
• Usage
Assuming that the command-line PAWN compiler is called “pawncc” (Unix/Linux) or “pawncc.exe” (DOS/Windows), the command line syntax is:
pawncc <filename> [more filenames...] [options]
The input file name is any legal filename. If no extension is given, “.pawn” or “.p” is assumed. The compiler creates an output file with, by default, the same name as the input file and the extension “.amx”.
After switching to the directory with the sample programs, the command:
pawncc hello
should compile the very first “hello world” example (page 5). Should, because the command implies that:
the operating system can locate the “pawncc” program —you may need to add it to the search path;
the PAWN compiler is able to determine its own location in the file system so that it can locate the include files —a few operating systems do not support this and require that you use the -i option (see below).
• Input file
The input file for the PAWN compiler, the “source code” file for the script/program, must be a plain text file. All reserved words and all symbol names (names for variables, functions, symbolic constants, tags, . . . ) must use the ASCII character set. Literal strings, i.e. text between quotes, may be in extended ASCII, such as one of the sets standardized in the ISO 8859 norm (ISO 8859-1 is the well known “Latin 1” set).
The PAWN compiler also supports UTF-8 encoded text files, which are practical
in an environment based on Unicode or UCS-4. The PAWN compiler only
recognizes UTF-8 encoded characters inside unpacked strings and character
constants. The compiler interprets the syntax rules for UTF-8 files
strictly; non-conforming UTF-8 files are not recognized. The input file may have, but
does not require, a “Byte Order Mark” signature; the compiler
recognizes the UTF-8 format based on the file’s content.
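To illustrate what recognizing UTF-8 "based on the file's content" with strict rules could look like, here is a minimal sketch in C. It is not taken from the PAWN compiler sources: the function name `is_strict_utf8` is hypothetical, and the rules applied are those of the current UTF-8 definition (sequences of at most four bytes, no overlong forms, no surrogates), which is an assumption. In particular, it rejects the longer sequences that would be needed for full 31-bit UCS-4 values, so the compiler's own check may accept or reject different input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the buffer holds only
 * well-formed UTF-8 sequences. Plain ASCII text also passes. */
static bool is_strict_utf8(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char lead = s[i++];
        unsigned long code;   /* decoded code point */
        unsigned long min;    /* smallest value allowed for this length */
        int extra;            /* number of continuation bytes expected */

        if (lead < 0x80)
            continue;                          /* single-byte (ASCII) */

        if ((lead & 0xE0) == 0xC0) {
            extra = 1; code = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; code = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; code = lead & 0x07; min = 0x10000;
        } else {
            return false;       /* stray continuation or invalid lead byte */
        }

        while (extra-- > 0) {
            if (i >= len || (s[i] & 0xC0) != 0x80)
                return false;   /* truncated or malformed sequence */
            code = (code << 6) | (s[i++] & 0x3F);
        }

        if (code < min)
            return false;       /* overlong encoding */
        if (code >= 0xD800 && code <= 0xDFFF)
            return false;       /* UTF-16 surrogate range */
        if (code > 0x10FFFF)
            return false;       /* beyond the Unicode range */
    }
    return true;
}
```

A file in ISO 8859-1 that contains accented letters typically fails such a check, because a byte like 0xE9 announces a multi-byte sequence that the following bytes do not complete. A leading Byte Order Mark (the bytes EF BB BF) is itself a valid sequence and passes.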
• Options
Options start with a dash (“-”) or, on Microsoft Windows and DOS, with a
forward slash (“/”). In other words, all platforms accept an option written
as “-a” (see below for the purpose of this option) and the
DOS/Windows platforms accept “/a” as an alternative way to write “-a”.
All options should be separated by at least one space.
Many options accept a value —which is sometimes mandatory. A value may be separated from the option letter by a colon or an equal sign (a “:” and a “=” respectively), or the value may be glued to the option letter. Three equivalent options to set the debug level to two are thus:
-d2
-d:2
-d=2
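The three spellings can be reduced to the same value with very little code. The sketch below is only an illustration and is not taken from the PAWN compiler; the helper name `option_value` is hypothetical, and it handles only single-letter options that start with a dash (not the “/” form, and not two-letter options such as -XD).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given "-d2", "-d:2" or "-d=2", return a pointer
 * to the value part ("2"). The result is an empty string when no value
 * follows the option letter. */
static const char *option_value(const char *arg)
{
    assert(arg != NULL);
    assert(arg[0] == '-' && arg[1] != '\0');  /* dash plus option letter */

    const char *p = arg + 2;                  /* skip the dash and the letter */
    if (*p == ':' || *p == '=')               /* optional separator */
        p++;
    return p;
}
```

With this helper, `option_value("-d2")`, `option_value("-d:2")` and `option_value("-d=2")` all point at the string "2".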
The options are:
Option | Description |
---|---|
-a | Assembler: generate a text file with the pseudo-assembler code for the PAWN abstract machine, instead of binary code. |
-C+/- | Compact encoding of the binary file, which reduces the size of the output file typically to less than half the original size. Use -C+ to enable it and -C- to revert to “plain” encoding. The option -C (without + or − suffix) toggles the current setting. |
-cname | Codepage: set the codepage for translating the source file from extended ASCII to Unicode/UCS-4. The default is no translation. The name parameter can specify a full path to a “mapping file” or just the identifier of the codepage; in the latter case, the compiler prefixes the identifier with the letters “cp”, appends the extension “.txt” and loads the mapping file from a system directory. |
-Dpath | Directory: the “active” directory, where the compiler should search for its input files and store its output files. |
This option is not supported on every platform. To verify whether the PAWN compiler supports this option, run the compiler without any option or filename on the command line. The compiler will then list its usage syntax and all available options in alphabetical order. If the -D switch is absent, the option is not available. | |
-dlevel | Debug level: 0 = none, 1 = bounds checking and assertions only, 2 = full symbolic information, 3 = full symbolic information and optimizations disabled (same as the combination -d2 and -O0). |
When the debug level is 2 or 3, the PAWN compiler also prints the estimated amount of stack/heap space required for the program. | |
-efilename | Error file: set the name of the file into which the compiler must write any warning and error messages; when set, there is no output to the screen. |
-Hvalue | “HWND” (Microsoft Windows version only): the compiler can optionally post a message to the specified window handle upon completion of the P-code generation. Host applications that invoke the PAWN compiler can wait for the arrival of this message or signal the user of the completion of the compile. |
The message number that is sent to the window is created with the Microsoft Windows SDK function RegisterWindowMessage using the name “PawnNotify”. The wParam of the message holds the compiler return code: 0 = success, 1 = warnings, 2 = errors (plus possibly warnings), 3 = compilation aborted by the user. | |
-ipathname | Include path: set the path where the compiler can find the include files. This option may appear multiple times at the command line, to allow you to set several include paths. |
-l | Listing: perform only the file reading and preprocessing steps; for example, to verify the effect of the text substitution macros and the conditionally compiled/skipped sections. |
-Olevel | Optimization level: 0 = no optimizations; 1 = JIT compatible optimizations only (JIT = “Just In Time” compiler, a high performance abstract machine); 2 = full optimizations. |
-ofilename | Output file: set the name and path of the binary output file. |
-pfilename | Prefix file: the name of the “prefix file”, this is a file that is parsed before the input file (as a kind of implicit “include file”). If used, this option overrides the default include file “default.inc”. The -p option on its own (without a filename) disables the processing of any implicit include file. |
-rfilename | Report: enable the creation of the report and optionally set the filename to which the extracted documentation and a crossreference report will be written. |
The report is in “XML” format. The filename parameter is optional; if not specified, the report file has the same name as the input file with the extension “.XML”. | |
-Svalue | Stack size: the size of the stack and the heap in cells. |
-svalue | Skip count: the number of lines to skip in the input file before starting to compile; for example, to skip a “header” in the source file which is not in a valid PAWN syntax. |
-tvalue | Tab size: the number of space characters to use for a tab character. When set to zero (i.e. option -t0) the compiler will no longer issue warning 217 (loose indentation). |
-vvalue | Verbose: display informational messages during the compilation. The value can be 0 (zero) for “quiet” compile, 1 (one) for the normal output and 2 for a code/data/stack usage report. |
-wvalue+/- | Warning control: the warning number following the “-w” is enabled or disabled, depending on whether a “+” or a “-” follows the number. When a “+” or “-” is absent, the warning status is toggled. For example, -w225- disables the warning for “unreachable code”, -w225+ enables it and -w225 toggles between enabled/disabled. |
Only warnings can be disabled (errors and fatal errors cannot be disabled). By default, all warnings are enabled. | |
-Xvalue | Limit for the abstract machine: the maximum memory requirements that a compiled script may have, in bytes. This value is useful for (embedded) environments where the maximum size of a script is bound to a hard upper limit. If there is no setting for the amount of RAM for the data and stack, this refers to the total memory requirements; if the amount of RAM is explicitly set, this value only gives the amount of memory needed for the code and the static data. |
-XDvalue | RAM limit for the abstract machine: the maximum memory requirements for data and stack that a compiled script may have, in bytes. This value is useful for (embedded) environments where the maximum data size of a script is bound to a hard upper limit. Especially in the case where the PAWN script runs from ROM, the sizes for the code and data sections both need to be set. |
-\ | Control characters start with “\” (for the sake of similarity with C, C++ and Java). |
-^ | Control characters start with “^” (for compatibility with earlier versions of PAWN). |
-;+/- | With -;+ every statement is required to end with a semicolon; with -;-, semicolons are optional to end a statement if the statement is the last on the line. The option -; (without + or − suffix) toggles the current setting. |
sym=value | Define the constant “sym” with the given (numeric) value; the value is optional. |
@filename | Read (more) options from the specified “response file”. |
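The three forms of the -w option in the table above (enable with “+”, disable with “-”, toggle when neither is present) amount to a small state update per warning number. The following C sketch shows that logic under stated assumptions: the names `apply_warning_option` and `is_warning_enabled` are hypothetical, the range 200 to 299 is the warning range given in the error categories, and none of this is taken from the actual compiler sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define FIRST_WARNING 200
#define LAST_WARNING  299

/* Zero-initialized, so every warning starts out enabled (the default). */
static bool warning_disabled[LAST_WARNING - FIRST_WARNING + 1];

/* Apply the text that follows "-w", for example "225-", "225+" or "225".
 * Returns false when the text is not a warning number with an optional
 * "+" or "-" suffix. */
static bool apply_warning_option(const char *value)
{
    assert(value != NULL);

    char *end;
    long number = strtol(value, &end, 10);
    if (end == value || number < FIRST_WARNING || number > LAST_WARNING)
        return false;             /* errors and fatal errors cannot be disabled */

    bool *disabled = &warning_disabled[number - FIRST_WARNING];
    if (*end == '+')
        *disabled = false;        /* explicitly enable */
    else if (*end == '-')
        *disabled = true;         /* explicitly disable */
    else if (*end == '\0')
        *disabled = !*disabled;   /* no suffix: toggle */
    else
        return false;             /* unexpected trailing character */
    return true;
}

static bool is_warning_enabled(int number)
{
    assert(number >= FIRST_WARNING && number <= LAST_WARNING);
    return !warning_disabled[number - FIRST_WARNING];
}
```

Because the toggle form depends on the current state, a build script that must end up in a known state is probably better served by the explicit “+” and “-” forms.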
• Response file
To support operating systems with a limited command line length (e.g., Microsoft DOS), the PAWN compiler supports “response files”. A response file is a text file that contains the options that you would otherwise put at the command line. With the command:
pawncc @opts.txt prog.pawn
the PAWN compiler compiles the file “prog.pawn” using the options that are listed in the response file “opts.txt”.
• Configuration file
On platforms that support it (currently Microsoft DOS, Microsoft Windows and Linux), the compiler reads the options in a “configuration file” on startup.
The configuration file must have the name “pawn.cfg” and it must reside in the same directory as the compiler executable program.
In a sense, the configuration file is an implicit response file. Options specified on the command line may overrule those in the configuration file.
Rationale
The first issue in the presentation of a new computer language should be: why a new language at all?
Indeed, I did look at several existing languages before I designed my own. Many little languages were aimed at scripting the command shell (TCL, Perl, Python). Other languages were not designed as extension languages, and put the burden of embedding solely on the host application.
As I initially attempted to use Java as an extension language (rather than build my own, as I have done now), the differences between PAWN and Java are illustrative of the almost reciprocal design goals of both languages. For example, Java promotes distributed computing where “packages” reside on diverse machines, whereas PAWN is designed so that the compiled applets can be easily stored in a compound file together with other data. Java is furthermore designed to be architecture neutral and application independent; inversely, PAWN is designed to be tightly coupled with an application. Native functions are a taboo to some extent in Java (at least, they are considered “impure”), whereas native functions are “the reason to be” for PAWN. From the viewpoint of PAWN, the intended use of Java is upside down: native functions are seen as an auxiliary library that the application (written in Java) uses; in PAWN, native functions are part of “the application” and the PAWN program itself is a set of auxiliary functions that the application uses.
A language for scripting applications: PAWN is targeted as an extension language, meant to write application-specific macros or subprograms with. PAWN is not the appropriate language for implementing business applications or operating systems in. PAWN is designed to be easily integrated with, and embedded in, other systems/applications.
As an extension language, PAWN programs typically manipulate objects of the host application. In an animation system, PAWN scripts deal with sprites, events and time intervals; in a communication application, PAWN scripts handle packets and connections. I assume that the host application will make (a subset of) its resources and functionality available via functions, handles, magic cookies. . . in a similar way that a contemporary operating system provides an interface to processes written in C/C++ —e.g., the Win32 API (“handles everywhere”) or GNU/Linux’ “glibc”. To that end, PAWN has a simple and efficient interface to the “native” functions of the host application. A PAWN script manipulates data objects in the host application through function calls, but it cannot access the data of the host application directly.
The first and foremost criterions for the PAWN language were execution speed and reliability. Reliability in the sense that a PAWN program should not be able to crash the application or tool in which it is embedded —at least, not easily. Although this limits the capabilities of the language significantly, the advantages are twofold:
the application vendor can rest assured that its application will not crash due to user additions or macros,
the user is free to experiment with the language with no (or little) risk of damaging the application files.
Speed is essential: PAWN programs would probably run in an abstract machine, and abstract machines are notoriously slow. I had to make a language that has low overhead and a language for which a fast abstract machine can be written. Speed should also be reliable, in the sense that a PAWN script should not slow down over time or have an occasional performance hiccup. Consequently, PAWN excludes any required “background process”, such as garbage collection, and the core of the abstract machine does not implicitly allocate any system or application resources while it runs. That is, PAWN does not allocate memory or open files, not without the help of a native function that the script calls explicitly.
As Dennis Ritchie said, by intent the C language confines itself to facilities that can be mapped relatively efficiently and directly to machine instructions. The same is true for PAWN, and this is also a partial explanation of why PAWN looks so much like C. Even though PAWN runs on an abstract machine, the goal is to keep that abstract machine small and quick. PAWN is used in tiny embedded systems with RAM sizes of 32 kiB or less, as well as in high-performance games that need every processor cycle for their graphics engine and game-play. In both environments, heavy-weight scripting support is difficult to swallow.
A brief analysis showed that the instruction decoding logic for an abstract machine would quickly become the bottleneck in the performance of the abstract machine. To keep the decoding simple, each opcode should have the same size (excluding operands), and the opcode should fully specify the instruction (including the addressing methods, size of the operands, etc.). That meant that for each operation on a variable, the abstract machine needed a separate opcode for every combination of variable type, storage class and access method (direct, or dereferenced). For even three types (int, char and unsigned int), two storage classes (global and local) and three access methods (direct, indirect or indexed), a total of 18 opcodes (3 × 2 × 3) are needed to simply fetch the value of a variable.
At the same time, to keep the abstract machine small and manageable, I set the goal at approximately 100 instructions.∗ With 18 opcodes to load a variable in a register, 18 more to store a register into a variable, another 18 to get the address of a variable, etc. . . I was quickly exceeding my self-imposed limit of a hundred opcodes.
The languages bob and rexx inspired me to design a typeless language. This saved me a lot of opcodes. At the same time, the language could no longer be called a “subset of C”. I was changing the language. Why, then, not go a foot further in changing the language? This is where a few more design guidelines came into play:
give the programmer a general purpose tool, not a special purpose solution
avoid error prone language constructs; promote error checking
be pragmatic
A general purpose tool: PAWN is targeted as an extension language, without specifying exactly what it will extend. Typically, the application or the tool that uses PAWN for its extension language will provide many optimized routines or commands to operate on its native objects, be it text, database records or animated sprites. The extension language exists to permit the user to do what the application developer forgot, or decided not to include. Rather than providing a comprehensive library of functions to sort data, match regular expressions, or draw cubic Bézier splines, PAWN should supply a (general purpose) means to use, extend and combine the specific (“native”) functions that an application provides.
PAWN lacks a comprehensive standard library. By intent, PAWN also lacks features like pointers, dynamic memory allocation, direct access to the operating system or to the hardware, that are needed to remain competitive in the field of general purpose application or system programming. You cannot build linked lists or dynamic tree data structures in PAWN, and neither can you access any memory beyond the boundaries of the abstract machine. That is not to say that a PAWN program can never use dynamic, sorted symbol tables, or change a parameter in the operating system; it can do that, but it needs to do so by calling a “native” function that an application provides to the abstract machine.
∗ 136 Opcodes are defined at this writing, plus 20 “macro” opcodes. To exploit performance gains by forcing proper alignment of memory words (essential on ARM microprocessors), the current abstract machine uses 32-bit opcodes. There is no technical limit on the number of opcodes, but in the interest of a small footprint, the number of opcodes should be restricted.
In other words, if an application chooses to implement the well known peek and poke functions (from BASIC) in the abstract machine, a PAWN program can access any byte in memory, insofar as the operating system permits this. Likewise, an application can provide native functions that insert, delete or search symbols in a table and allow several operations on them. The proposed core functions getproperty and setproperty are an example of native functions that build a linked list in the background.
Promote error checking: As you may have noticed, one of the foremost design criterions of the C language, “trust the programmer”, is absent from my list of design criterions. Users of script languages may not be experienced programmers; and even if they are, PAWN will probably not be their primary language. Most PAWN programmers will keep learning the language as they go, and will even after years not have become experts. Enough reason, hence, to replace error prone elements from the C language (pointers) with safer, albeit less general, constructs (references).† References are copied from C++. They are nothing else than pointers in disguise, but they are restricted in various, mostly useful, ways. Turn to a C++ book to find more justification for references.
I find it sad that many, even modern, programming languages have so little built-in, or easy to use, support for confirming that programs do as the programmer intended. I am not referring to theoretical correctness (which is too costly to achieve for anything bigger than toy programs), but to practical, easy to use, verification mechanisms as a help to the programmer.
PAWN provides both compile time and execution time assertions to use for preconditions, postconditions and invariants.
The typing mechanism that most programming languages use is also an automatic “catcher” of a whole class of bugs. By virtue of being a typeless language, PAWN lacked these error checking abilities. This was clearly a weakness, and I created the “tag” mechanism as an equivalent for verifying function parameter passing, array indexing and other operations.
† You should see this remark in the context of my earlier assertion that many PAWN programmers will be novice programmers. In my (teaching) experience, novice programmers make many pointer errors, as opposed to experienced C/C++ programmers.
The quality of the tools (the compiler and the abstract machine) also has a great impact on the robustness of code, whatever the language. Although this is only very loosely related to the design of the language, I set out to build the tools such that they promote error checking. The warning system of PAWN goes a step beyond simply reporting where the parser fails to interpret the data according to the language grammar. On several occasions, the compiler runs checks that are completely unrelated to generating code and that are implemented specifically to catch possible errors. Likewise, the “debugger hook” is designed right into the abstract machine; it is not an add-on implemented as an afterthought.
Be pragmatic: The object-oriented programming paradigm has not entirely lived up to its promise, in my opinion. On the one hand, OOP solves many tasks in an easier or cleaner way, due to the added abstraction layer. On the other hand, contemporary object-oriented languages leave you struggling with the language as much as with the task at hand. Object-oriented languages are attractive mainly because of the comprehensive class libraries that they come with, but leaning on a standard library goes against one of the design goals for PAWN. Object-oriented programming is not a solution for a non-expert programmer with little patience for artificial complexity. The criterion “be pragmatic” is a reminder to seek solutions, not elegance.
• Practical design criterions
The fact that PAWN looks so much like C cannot be a coincidence, and it isn’t. PAWN started as a C dialect and stayed that way, because C has a proven track record. The changes from C were mostly born out of necessity after rubbing out the features of C that I did not want in a scripting language: no pointers and no “typing” system.
PAWN, being a typeless language, needed a different means to declare variables.
In the course of modifying this, I also dropped the C requirement
that all variables should be declared at the top of a compound statement. PAWN is a
little more like C⁺⁺ in this respect.
C language functions can pass “output values” via pointer arguments. The
standard function scanf, for example, stores the values or strings that it reads
from the console into its arguments. You can design a function in C so that
it optionally returns a value through a pointer argument; if the caller of the
function does not care for the return value, it passes NULL as the pointer
value. The standard function strtol is an example of a function that does this. This
technique frequently saves you from declaring and passing dummy variables.
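The C idiom described here can be sketched in a few lines. In the example below, `divide` is a hypothetical function written for illustration only; `parse_decimal` is likewise a hypothetical wrapper that shows the standard function strtol being called with NULL for its end-pointer argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical function: returns the quotient of a / b and, only when
 * the caller passes a non-NULL pointer, stores the remainder as well. */
static int divide(int a, int b, int *remainder)
{
    assert(b != 0);
    if (remainder != NULL)
        *remainder = a % b;     /* optional "output value" */
    return a / b;
}

/* strtol follows the same convention: passing NULL as the second
 * argument tells it that the caller does not need the end position. */
static long parse_decimal(const char *text)
{
    assert(text != NULL);
    return strtol(text, NULL, 10);
}
```

A caller that wants both results writes `divide(17, 5, &rest)`; a caller that only needs the quotient writes `divide(17, 5, NULL)` and does not have to declare a dummy variable.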
PAWN replaces pointers with references, but references cannot be NULL. Thus, PAWN needed a different technique to “drop” the values that a function returns via references. Its solution is the use of an “argument placeholder” that is written as an underscore character (“_”); Prolog programmers will recognize it as a similar feature in that language. The argument placeholder reserves a temporary anonymous data object (a “cell” or an array of cells) that is automatically destroyed after the function call.
The temporary cell for the argument placeholder should still have a value, because the function may see a reference parameter as input/output. Therefore, a function must specify for each passed-by-reference argument what value it will have upon entry when the caller passes the placeholder instead of an actual argument. By extension, I also added default values for arguments that are “passed-by-value”. The feature to optionally remove all arguments with default values from the right was copied from C++.
When speaking of BCPL and B, Dennis Ritchie said that C was invented in part to provide a plausible way of dealing with character strings when one begins with a word-oriented language. PAWN provides two options for working with strings, packed and unpacked strings. In an unpacked string, every character
fits in a cell. The overhead for a typical 32-bit implementation is large: one
character would take four bytes. Packed strings store up to four characters
in one cell, at the cost of being significantly more difficult
to handle if you could only access full cells. Modern BCPL implementations provide two array
indexing methods: one to get a word from an array and one to get a character
from an array. PAWN copies this concept, although the syntax differs from that
of BCPL. The packed string feature also led to the new operator char.
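The storage difference between the two string kinds can be made concrete with a small C sketch. It assumes a 32-bit cell and 8-bit characters, and it assumes that the first character of each group of four goes into the most significant byte of the cell; the names `cell`, `pack_string` and `packed_char` are chosen for this illustration and the code is not taken from the PAWN toolkit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t cell;            /* assumption: one cell is 32 bits */
#define CHARS_PER_CELL 4

/* Pack a zero-terminated 8-bit string into cells, four characters per
 * cell, first character in the most significant byte (assumption).
 * Returns the number of cells used, including the cell that holds the
 * terminating zero, or 0 when dest is too small. */
static size_t pack_string(cell *dest, size_t maxcells, const char *src)
{
    assert(dest != NULL && src != NULL);

    for (size_t i = 0; ; i++) {
        size_t index = i / CHARS_PER_CELL;
        unsigned shift = 8u * (unsigned)(CHARS_PER_CELL - 1 - i % CHARS_PER_CELL);

        if (index >= maxcells)
            return 0;
        if (i % CHARS_PER_CELL == 0)
            dest[index] = 0;                  /* start a fresh cell */
        dest[index] |= (cell)(unsigned char)src[i] << shift;
        if (src[i] == '\0')
            return index + 1;
    }
}

/* Character access: fetch character i from a packed string. */
static unsigned char packed_char(const cell *packed, size_t i)
{
    unsigned shift = 8u * (unsigned)(CHARS_PER_CELL - 1 - i % CHARS_PER_CELL);
    return (unsigned char)(packed[i / CHARS_PER_CELL] >> shift);
}
```

Under these assumptions the string "hello" plus its terminator occupies two cells when packed, against six cells when every character gets a cell of its own. The separate `packed_char` accessor plays the role of the second indexing method: without it, a program could only read whole cells and would have to do the shifting and masking itself.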
Unicode applications often have to deal with two character sets: 8-bit for legacy file formats and standardized transfer formats (like many of the Internet protocols) and the 16-bit Unicode character set (or the 31-bit UCS-4 character set). Although the PAWN compiler has an option that makes characters 16-bit (so only two characters fit in a 32-bit cell), it is usually more convenient to store single-byte character strings in packed strings and multi-byte strings in unpacked strings. This turns a weakness in PAWN, the need to distinguish packed strings from unpacked strings, into a strength: PAWN can make that distinction quite easily. And instead of needing two implementations for every function that deals with strings (an ASCII version and a Unicode version; look at the Win32 API, or even the standard C library), PAWN enables functions to handle both packed and unpacked strings with ease.
Notwithstanding the above mentioned changes, plus those in the chapter “Pitfalls: differences from C” (page 134), I have tried to keep PAWN close to C. A final point, which is unrelated to language design, but important nonetheless, is the license: PAWN is distributed under a liberal license allowing you to use and/or adapt the code with a minimum of restrictions (see appendix D).
License
The software toolkit “PAWN” (the compiler, the abstract machine and the documentation) is copyright © 1997–2006 by ITB CompuPhase. The Intel assembler implementation of the abstract machine and the just-in-time compiler (specifically the files amxexec.asm, amxjitr.asm and amxjits.asm) are copyright © 1998–2003 Marc Peter. The file amxjitsn.asm is translated from amxjits.asm and is partially © 2004 G.W.M. Vissers. The file amxexecn.asm is translated from amxexec.asm and is partially © 2004–2005 ITB CompuPhase.
PAWN is distributed under the “zLib/libpng” license, which is reproduced below:
This software is provided “as-is”, without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.
Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:
1 The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2 Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
3 This notice may not be removed or altered from any source distribution.
The zLib/libpng license has been approved by the “Open Source Initiative” organization.