APPENDICES

Error and warning messages

When the compiler finds an error in a file, it outputs a message giving, in this order:

the name of the file
the line number where the compiler detected the error, between parentheses, directly after the filename
the error class (“error”, “fatal error” or “warning”)
an error number
a descriptive error message

For example:

demo.p(3) : error 001: expected token: ";", but found "{"

Note: the line number given by the compiler may specify a position behind the actual error, since the compiler cannot always establish an error before having analyzed the complete expression.
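
As an illustration of the message layout described above, the following Python sketch splits a diagnostic line into its five parts. It assumes that every message follows the exact layout of the example line; the names `Diagnostic` and `parse_diagnostic` are hypothetical and not part of the pawn toolchain.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    """The five parts of a compiler message, in the documented order."""
    filename: str
    line: int
    error_class: str
    number: int
    message: str


# Assumed layout, taken from the example line:
#   <file>(<line>) : <class> <number>: <message>
_PATTERN = re.compile(
    r"^(?P<filename>.+?)\((?P<line>\d+)\)\s*:\s*"
    r"(?P<error_class>fatal error|error|warning)\s+"
    r"(?P<number>\d+):\s*(?P<message>.*)$"
)


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Return the parsed message, or None if the line has another layout."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return Diagnostic(
        filename=match["filename"],
        line=int(match["line"]),
        error_class=match["error_class"],
        number=int(match["number"]),
        message=match["message"],
    )


if __name__ == "__main__":
    print(parse_diagnostic('demo.p(3) : error 001: expected token: ";", but found "{"'))
```

Lines that do not match the assumed layout yield `None`, so a build script can pass them through unchanged.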

After termination, the return code of the compiler is:

0   no errors —there may be warnings, though
1   errors found
2   reserved
3   aborted by user

These return codes may be checked within batch processors (such as the “make” utility).
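
A build script can act on these return codes. The sketch below is one possible way to do so in Python; the helper names are hypothetical, and because the name and options of the compiler executable depend on the installation, the command is passed in by the caller rather than assumed.

```python
import subprocess
import sys
from typing import List, Tuple

# Return codes as listed above.
RETURN_CODES = {
    0: "no errors (there may be warnings)",
    1: "errors found",
    2: "reserved",
    3: "aborted by user",
}


def describe_return_code(code: int) -> str:
    """Translate a compiler return code into the documented meaning."""
    return RETURN_CODES.get(code, f"undocumented return code {code}")


def run_and_describe(command: List[str]) -> Tuple[int, str]:
    """Run a command and return its return code together with its meaning."""
    completed = subprocess.run(command, check=False)
    return completed.returncode, describe_return_code(completed.returncode)


if __name__ == "__main__":
    # Stand-in for a real compiler invocation: a child process that exits with code 1.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_and_describe(stand_in))
```

In a real build, `stand_in` would be replaced by the actual compiler command line.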

• Error categories

Errors are separated into three classes:

Type	Description
Errors	Describe situations where the compiler is unable to generate appropriate code. Error messages are numbered from 1 to 99.
Fatal errors	Fatal errors describe errors from which the compiler cannot recover. Parsing is aborted. Fatal error messages are numbered from 100 to 199.
Warnings	Warnings are displayed for unintended compiler assumptions and common mistakes. Warning messages are numbered from 200 to 299.
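
Because the three classes occupy separate number ranges, the class of a message can be derived from its number alone. A minimal Python sketch of that mapping follows; the function name is hypothetical.

```python
def classify_message_number(number: int) -> str:
    """Map a message number to its class, using the ranges in the table above."""
    if 1 <= number <= 99:
        return "error"
    if 100 <= number <= 199:
        return "fatal error"
    if 200 <= number <= 299:
        return "warning"
    raise ValueError(f"message number {number} is outside the documented ranges")


if __name__ == "__main__":
    for n in (1, 107, 217):
        print(n, classify_message_number(n))
```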

• Errors

Number	Description
001	expected token: token, but found token
	A required token is omitted.
002	only a single statement (or expression) can follow each “case”
	Every case in a switch statement can hold exactly one statement. To put multiple statements in a case, enclose these statements between braces (which creates a compound statement).
003	declaration of a local variable must appear in a compound block
	The declaration of a local variable must appear between braces (“{ ... }”) at the active scope level.
	When the parser flags this error, a variable declaration appears as the only statement of a function or the only statement below an if, else, for, while or do statement. Note that, since local variables are accessible only from (or below) the scope that their declaration appears in, having a variable declaration as the only statement at any scope is useless.
004	function name is not implemented
	There is no implementation for the designated function. The function may have been “forwardly” declared —or prototyped— but the full function definition including a statement, or statement block, is missing.
005	function may not have arguments
	The function main() is the program entry point. It may not have arguments.
006	must be assigned to an array
	String literals or arrays must be assigned to an array. This error message may also indicate a missing index (or indices) at the array on the right side of the “=” sign.
007	operator cannot be redefined
	Only a select set of operators may be redefined; this operator is not one of them. See page 86 for details.
008	must be a constant expression; assumed zero
	The size of arrays and the parameters of most directives must be constant values.
009	invalid array size (negative or zero)
	The number of elements of an array must always be 1 or more.
010	illegal function or declaration
	The compiler expects a declaration of a global variable or of a function at the current location, but it cannot interpret it as such.
011	invalid outside functions
	The instruction or statement is invalid at a global level. Local labels and (compound) statements are only valid if used within functions.
012	invalid function call, not a valid address
	The symbol is not a function.
013	no entry point (no public functions)
	The file does not contain a main function or any public function. The compiled file thereby does not have a starting point for the execution.
014	invalid statement; not in switch
	The statements case and default are only valid inside a switch statement.
015	“default” must be the last clause in switch statement
	pawn requires the default clause to be the last clause in a switch statement.
016	multiple defaults in “switch”
	Each switch statement may only have one default clause.
017	undefined symbol symbol
	The symbol (variable, constant or function) is not declared.
018	initialization data exceeds declared size
	An array with an explicit size is initialized, but the number of initializers exceeds the number of elements specified. For example, in “arr[3]={1,2,3,4};” the array is specified to have three elements, but there are four initializers.
019	not a label: name
	A goto statement branches to a symbol that is not a label.
020	invalid symbol name
	A symbol may start with a letter, an underscore or an “at” sign (“@”) and may be followed by a series of letters, digits, underscore characters and “@” characters.
021	symbol already defined: identifier
	The symbol was already defined at the current level.
022	must be lvalue (non-constant)
	The symbol that is altered (incremented, decremented, assigned a value, etc.) must be a variable that can be modified (this kind of variable is called an lvalue). Functions, string literals, arrays and constants are not lvalues. Variables declared with the “const” attribute are not lvalues either.
023	array assignment must be simple assignment
	When assigning one array to another, you cannot combine an arithmetic operation with the assignment (e.g., you cannot use the “+=” operator).
024	“break” or “continue” is out of context
	The statements break and continue are only valid inside the context of a loop (a do, for or while statement). Unlike the languages C/C++ and Java, break does not jump out of a switch statement.
025	function heading differs from prototype
	The number of arguments given at a previous declaration of the function does not match the number of arguments given at the current declaration.
026	no matching “#if...”
	The directive #else or #endif was encountered, but no matching #if directive was found.
027	invalid character constant
	One likely cause for this error is the occurrence of an unknown escape sequence, like “\x”. Putting multiple characters between single quotes, as in 'abc', also issues this error message. A third cause for this error is a situation where a character constant was expected, but none (or a non-character expression) was provided.
028	invalid subscript (not an array or too many subscripts): identifier
	The subscript operators “[” and “]” are only valid with arrays. The number of square bracket pairs may not exceed the number of dimensions of the array.
029	invalid expression, assumed zero
	The compiler could not interpret the expression.
030	compound statement not closed at the end of file
	An unexpected end of file occurred. One or more compound statements are still unfinished (i.e. the closing brace “}” has not been found).
031	unknown directive
	The character “#” is the first character on a line, but no valid directive was specified.
032	array index out of bounds
	The array index is larger than the highest valid entry of the array.
033	array must be indexed (variable name)
	An array as a whole cannot be used in an expression; you must indicate an element of the array between square brackets.
034	argument does not have a default value (argument index)
	You can only use the argument placeholder when the function definition specifies a default value for the argument.
035	argument type mismatch (argument index)
	The argument that you pass is different from the argument that the function expects, and the compiler cannot convert the passed-in argument to the required type. For example, you cannot pass the literal value “1” as an argument when the function expects an array or a reference.
036	empty statement
	The line contains a semicolon that is not preceded by an expression. pawn does not support a semicolon as an empty statement; use an empty compound block instead.
037	invalid string (possibly non-terminated string)
	A string was not well-formed; for example, the final quote that ends a string is missing, or the filename for the #include directive was not enclosed in double quotes or angle brackets.
038	extra characters on line
	There were trailing characters on a line that contained a directive (a directive starts with a # symbol, see page 117).
039	constant symbol has no size
	A variable has a size (measured in a number of cells); a constant has no size. That is, you cannot use a (symbolic) constant with the sizeof operator, for example.
040	duplicate “case” label (value value)
	A preceding “case label” in the list of the switch statement evaluates to the same value.
041	invalid ellipsis, array size is not known
	You used a syntax like “arr[] = { 1, ... };”, which is invalid, because the compiler cannot deduce the size of the array from the declaration.
042	invalid combination of class specifiers
	A function or variable is denoted as both “public” and “native”, which is unsupported. Other combinations may also be unsupported; for example, a function cannot be both “public” and “stock” (a variable may be declared both “public” and “stock”).
043	character constant exceeds range for packed string
	Usually an attempt to store a Unicode character in a packed string where a packed character is 8 bits.
044	mixing named and positional parameters
	You must either use named parameters or positional parameters for all parameters of the function.
045	too many function arguments
	The maximum number of function arguments is currently limited to 64.
046	unknown array size (variable name)
	For array assignment, the size of both arrays must be explicitly defined, even if they are passed as function arguments.
047	array sizes do not match, or destination array is too small
	For array assignment, the arrays on the left and the right side of the assignment operator must have the same number of dimensions. In addition:
	- for multi-dimensional arrays, both arrays must have the same size;
	- for arrays with a single dimension, the array on the left side of the assignment operator must have a size that is equal to or greater than the size of the one on the right side.
	When passing arrays to a function argument, these rules also hold for the array that is passed to the function (in the function call) versus the array declared in the function definition.
	When a function returns an array, all return statements must specify an array with the same size and dimensions.
048	array dimensions do not match
	For an array assignment, the dimensions of the arrays on both sides of the “=” sign must match; when passing arrays to a function argument, the arrays passed to the function (in the function call) must match with the definition of the function arguments.
	When a function returns an array, all return statements must specify an array with the same size and dimensions.
049	invalid line continuation
	A line continuation character (a backslash at the end of a line) is at an invalid position, for example at the end of a file or in a single line comment.
050	invalid range
	A numeric range with the syntax “n1 .. n2”, where n1 and n2 are numeric constants, is invalid. Either one of the values is not a valid number, or n1 is not smaller than n2.
051	invalid subscript, use “[ ]” operators on major dimensions
	You can use the “array character index” operator (braces: “{ }”) only for the last dimension. For other dimensions, you must use the cell index operator (square brackets: “[ ]”).
052	multi-dimensional arrays must be fully initialized
	If an array with more than one dimension is initialized at its declaration, then there must be equally many literal vectors/subarrays at the right of the equal sign (“=”) as specified for the major dimension(s) of the array.
053	exceeding maximum number of dimensions
	The current implementation of the pawn compiler only supports arrays with one or two dimensions.
054	unmatched closing brace
	A closing brace (“}”) was found without matching opening brace (“{”).
055	start of function body without function header
	An opening brace (“{”) was found outside the scope of a function. This may be caused by a semicolon at the end of a preceding function header.
056	local variables and function arguments cannot be public
	A local variable or a function argument starts with the character “@”, which is invalid.
057	Unfinished expression before compiler directive
	Compiler directives may only occur between statements, not inside a statement. This error typically occurs when an expression statement is split over multiple lines and a compiler directive appears between the start and the end of the expression. This is not supported.
058	duplicate argument; same argument is passed twice
	In the function call, the same argument appears twice, possibly through a mixture of named and positional parameters.
059	function argument may not have a default value (variable name)
	All arguments of public functions must be passed explicitly. Public functions are typically called from the host application, which has no knowledge of the default parameter values. Arguments of user defined operators are implied from the expression and cannot be inferred from the default value of an argument.
060	multiple “#else” directives between “#if ... #endif”
	Two or more #else directives appear in the body between the matching #if and #endif.
061	“#elseif” directive follows an “#else” directive
	All #elseif directives must appear before the #else directive. This error may also indicate that an #endif directive for a higher level is missing.
062	number of operands does not fit the operator
	When redefining an operator, the number of operands that the operator has (1 for unary operators and 2 for binary operators) must be equal to the number of arguments of the operator function.
063	operator requires that the function result has a “bool” tag
	Logical and relational operators are defined as having a result that is either true (1) or false (0) and having a “bool” tag. A user defined operator should adhere to this definition.
064	cannot change predefined operators
	One cannot define operators to work on untagged values, for example, because pawn already defines this operation.
065	function argument may only have a single tag (argument number)
	In a user defined operator, a function argument may not have multiple tags.
066	function argument may not be a reference argument or an array (argument number)
	In a user defined operator, all arguments must be cells (non-arrays) that are passed “by value”.
067	variable cannot be both a reference and an array (variable name)
	A function argument may be denoted as a “reference” or as an array, but not as both.
068	invalid rational number precision in #pragma
	The precision was negative or too high. For floating point rational numbers, the precision specification should be omitted.
069	rational number format already defined
	This #pragma conflicts with an earlier #pragma that specified a different format.
070	rational number support was not enabled
	A rational literal number was encountered, but the format for rational numbers was not specified.
071	user-defined operator must be declared before use (function name)
	Like a variable, a user-defined operator must be declared before its first use. This message indicates that prior to the declaration of the user-defined operator, an instance where the operator was used on operands with the same tags occurred. This may either indicate that the program tries to make mixed use of the default operator and a user-defined operator (which is unsupported), or that the user-defined operator must be “forwardly declared”.
072	“sizeof ” operator is invalid on “function” symbols
	You used something like “sizeof MyCounter” where the symbol “MyCounter” is not a variable, but a function. You cannot request the size of a function.
073	function argument must be an array (argument name)
	The function argument is a constant or a simple variable, but the function requires that you pass an array.
074	#define pattern must start with an alphabetic character
	Any pattern for the #define directive must start with a letter, an underscore (“_”) or an “@”-character. The pattern is the first word that follows the #define keyword.
075	input line too long (after substitutions)
	Either the source file contains a very long line, or text substitutions make a line that was initially of acceptable length grow beyond its bounds. This may be caused by a text substitution that causes recursive substitution (the pattern matching a portion of the replacement text, so that this part of the replacement text is also matched and replaced, and so forth).
076	syntax error in the expression, or invalid function call
	The expression statement was not recognized as a valid statement (so it is a “syntax error”). From the part of the string that was parsed, it looks as if the source line contains a function call in a “procedure call” syntax (omitting the parentheses), but the function result is used: assigned to a variable, passed as a parameter, used in an expression, and so on.
077	malformed UTF-8 encoding, or corrupted file: filename
	The file starts with a UTF-8 signature, but it contains encodings that are invalid UTF-8. If the source file was created by an editor or converter that supports UTF-8, the UTF-8 support is non-conforming.
078	function uses both “return” and “return <value>”
	The function returns both with and without a return value. The function should be consistent in always returning with a function result, or in never returning a function result.
079	inconsistent return types (array & non-array)
	The function returns both values and arrays, which is not allowed. If a function returns an array, all return statements must specify an array (of the same size and dimensions).
080	unknown symbol, or not a constant symbol (symbol name)
	Where a constant value was expected, an unknown symbol or a non-constant symbol (variable) was found.
081	cannot take a tag as a default value for an indexed array parameter (symbol name)
	The tagof operator was used on an array parameter where the array also had an index. This is unsupported.
082	user-defined operators and native functions may not have states
	Only standard and public functions may have states.
083	a function or variable may only belong to a single automaton (symbol name)
	There are multiple automatons in the state declaration for the indicated function or variable, which is not supported. In the case of a function: all instances of the function must belong to the same automaton. In the case of a variable: it is allowed to have several variables with the same name belonging to different automatons, but only in separate declarations —these are distinct variables.
084	state conflict: one of the states is already assigned to another implementation (symbol name)
	The specified state appears in the state specifier of two implementations of the same function.
085	no states are defined for symbol name
	When this error occurs on a function, this function has a fall-back implementation, but no other states. If the error refers to a variable, this variable does not have a list of states between the < and > characters. Use a state-less function or variable instead.
086	unknown automaton name
	The “state” statement refers to an unknown automaton.
087	unknown state name for automaton name
	The “state” statement refers to an unknown state (for the specified automaton).
088	public variables and local variables may not have states (symbol name)
	Only standard (global) variables may have a list of states (and an automaton) at the end of a declaration.
089	state variables may not be initialized (symbol name)
	Variables with a state list attached may not have initializers. State variables should always be explicitly initialized, as their initial value is indeterminate.
090	public functions may not return arrays (symbol name)
	A public function may not return an array. Returning arrays is allowed only for normal functions.
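
The naming rule given under error 020 can be written as a regular expression. The Python sketch below is an approximation under stated assumptions: “letter” is taken to mean an ASCII letter, and the check ignores the maximum symbol length (see warning 200) and reserved words. The function name is hypothetical.

```python
import re

# First character: letter, underscore or "@".
# Following characters: letters, digits, underscores or "@".
# Assumption: "letter" means an ASCII letter.
_SYMBOL_NAME = re.compile(r"[A-Za-z_@][A-Za-z0-9_@]*")


def is_valid_symbol_name(name: str) -> bool:
    """Check a name against the rule described for error 020."""
    return _SYMBOL_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("main", "@start", "_count1", "1abc", "a-b"):
        print(candidate, is_valid_symbol_name(candidate))
```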

• Fatal Errors

Number	Description
100	cannot read from file: filename
	The compiler cannot find the specified file or does not have access to it.
101	cannot write to file: filename
	The compiler cannot write to the specified output file, probably caused by insufficient disk space or restricted access rights (the file could be read-only, for example).
102	table overflow: table name
	An internal table in the pawn parser is too small to hold the required data. Some tables are dynamically growable, which means that there was insufficient memory to resize the table. The “table name” is one of the following:
	“staging buffer”: the staging buffer holds the code generated for an expression before it is passed to the peephole optimizer. The staging buffer grows dynamically, so an overflow of the staging buffer basically is an “out of memory” error.
	“loop table”: the loop table is a stack used with nested do, for, and while statements. The table allows nesting of these statements up to 24 levels.
	“literal table”: this table keeps the literal constants (numbers, strings) that are used in expressions and as initializers for arrays. The literal table grows dynamically, so an overflow of the literal table basically is an “out of memory” error.
	“compiler stack”: the compiler uses a stack to store temporary information it needs while parsing. An overflow of this stack is probably caused by deeply nested (or recursive) file inclusion. The compiler stack grows dynamically, so an overflow of the compiler stack basically is an “out of memory” error.
	“option table”: in case that there are more options on the command line or in the response file than the compiler can cope with.
103	insufficient memory
	General “out of memory” error.
104	invalid assembler instruction symbol
	An invalid opcode in an #emit directive.
105	numeric overflow, exceeding capacity
	A numeric constant, notably a dimension of an array, is too large for the compiler to handle. For example, when compiled as a 16-bit application, the compiler cannot handle arrays with more than 32767 elements.
106	compiled script exceeds the maximum memory size (number bytes)
	The memory size for the abstract machine that is needed to run the script exceeds the value set with #pragma amxlimit. This means that the script is too large to be supported by the host. You might try reducing the script’s memory requirements by:
	- setting a smaller stack/heap area —see #pragma dynamic at page 121;
	- using packed strings instead of unpacked strings —see pages 99 and 137;
	- putting repeated code in separate functions;
	- putting repeated data (strings) in global variables;
	- trying to find more compact algorithms to perform the same task.
107	too many error/warning messages on one line
	A single line that causes several error/warning messages is often an indication that the pawn parser is unable to “recover” from an earlier error. In this situation, the parser is unlikely to make any sense of the source code that follows —producing only (more) inappropriate error messages. Therefore, compilation is halted.
108	codepage mapping file not found
	The file for the codepage translation that was specified with the -c compiler option or the #pragma codepage directive could not be loaded.
109	invalid path: path name
	A path, for example for include files or codepage files, is invalid.
110	assertion failed: expression
	Compile-time assertion failed.
111	user error: message
	The parser fell on an #error directive.

• Warnings

Number	Description
200	symbol is truncated to number characters
	The symbol is longer than the maximum symbol length. The maximum length of a symbol depends on whether the symbol is native, public or neither. Truncation may cause different symbol names to become equal, which may cause error 021 or warning 219.
201	redefinition of constant/macro (symbol name)
	The symbol was previously defined to a different value, or the text substitution macro that starts with the prefix name was redefined with a different substitution text.
202	number of arguments does not match definition
	At a function call, the number of arguments passed to the function (actual arguments) differs from the number of formal arguments declared in the function heading. To declare functions with variable argument lists, use an ellipsis (...) behind the last known argument in the function heading; for example: print(formatstring,...); (see page 80).
203	symbol is never used: identifier
	A symbol is defined but never used. Public functions are excluded from the symbol usage check (since these may be called from the outside).
204	symbol is assigned a value that is never used: identifier
	A value is assigned to a symbol, but the contents of the symbol are never accessed.
205	redundant code: constant expression is zero
	Where a conditional expression was expected, a constant expression with the value zero was found, e.g. “while (0)” or “if (0)”.
	The conditional code below the test is never executed, and it is therefore redundant.
206	redundant test: constant expression is non-zero
	Where a conditional expression was expected, a constant expression with a non-zero value was found, e.g. if (1). The test is redundant, because the conditional code is always executed.
207	unknown “#pragma”
	The compiler ignores the pragma. The #pragma directives may change between compilers of different vendors and between different versions of a compiler of the same vendor.
208	function with tag result used before definition, forcing reparse
	When a function is “used” (invoked) before being declared, and that function returns a value with a tag name, the parser must make an extra pass over the source code, because the presence of the tag name may change the interpretation of operators (in the presence of user-defined operators). You can speed up the parsing/compilation process by declaring the relevant functions before using them.
209	function should return a value
	The function does not have a return statement, or it does not have an expression behind the return statement, but the function’s result is used in an expression.
210	possible use of symbol before initialization: identifier
	A local (uninitialized) variable appears to be read before a value is assigned to it. The compiler cannot determine the actual order of reading from and storing into variables and bases its assumption of the execution order on the physical appearance order of statements and expressions in the source file.
211	possibly unintended assignment
	Where a conditional expression was expected, the assignment operator (=) was found instead of the equality operator (==). As this is a frequent mistake, the compiler issues a warning. To avoid this message, put parentheses around the expression, e.g. if ( (a=2) ).
212	possibly unintended bitwise operation
	Where a conditional expression was expected, a bitwise operator (& or |) was found instead of a Boolean operator (&& or ||). In situations where a bitwise operation seems unlikely, the compiler issues this warning. To avoid this message, put parentheses around the expression.
213	tag mismatch
	A tag mismatch occurs when:
	- assigning to a tagged variable a value that is untagged or that has a different tag
	- the expressions on either side of a binary operator have different tags
	- in a function call, passing an argument that is untagged or that has a different tag than what the function argument was defined with
	- indexing an array which requires a tagged index with no tag or a wrong tag name
214	possibly a “const” array argument was intended: identifier
	Arrays are always passed by reference. If a function does not modify the array argument, however, the compiler can sometimes generate more compact and quicker code if the array argument is specifically marked as “const”.
215	expression has no effect
	The result of the expression is apparently not stored in a variable or used in a test. The expression or expression statement is therefore redundant.
216	nested comment
	PAWN does not support nested comments.
217	loose indentation
	Statements at the same logical level do not start in the same column; that is, the indents of the statements are different. Although pawn is a free format language, loose indentation frequently hides a logical error in the control flow.
	The compiler can also incorrectly assume loose indentation if the tab size with which you indented the source code differs from the assumed size, see #pragma tabsize on page 122 or the compiler option -t on page 169.
218	old style prototypes used with optional semicolon
	When using “optional semicolons”, it is preferred to declare forward functions explicitly with the forward keyword rather than with a terminating semicolon.
219	local variable identifier shadows a symbol at a preceding level
	A local variable has the same name as a global variable, a function, a function argument, or a local variable at a lower precedence level. This is called “shadowing”, as the new local variable makes the previously defined function or variable inaccessible.
	Note: if there are also error messages further on in the script about missing variables (with these same names) or brace level problems, it could well be that the shadowing warnings are due to these syntactic and semantic errors. Fix the errors first before looking at the shadowing warnings.
220	expression with tag override must appear between parentheses
	In a case statement and in expressions in the conditional operator (“ ? : ”), any expression that has a tag override should be enclosed between parentheses, to prevent the colon from being misinterpreted as a separator of the case statement or as part of the conditional operator.
221	label name identifier shadows tag name
	A code label (for the goto instruction) has the same name as a previously defined tag. This may indicate a faultily applied tag override; a typical case is an attempt to apply a tag override on the variable on the left of the = operator in an assignment statement.
222	number of digits exceeds rational number precision
	A literal rational number has more decimals in its fractional part than the precision of a rational number supports. The remaining decimals are ignored.
223	redundant “sizeof ”: argument size is always 1 (symbol name)
	A function argument has as its default value the size of another argument of the same function. The “sizeof” default value is only useful when the size of the referred argument is unspecified in the declaration of the function; i.e., if the referred argument is an array.
224	indeterminate array size in “sizeof ” expression (symbol name)
	The operand of the sizeof operator is an array with an unspecified size. That is, the size of the variable cannot be determined at compile time. If used in an “if” instruction, consider a conditionally compiled section, replacing if by #if.
225	unreachable code
	The indicated code will never run, because an instruction before (above) it causes a jump out of the function, out of a loop or elsewhere. Look for return, break, continue and goto instructions above the indicated line.
226	a variable is assigned to itself (symbol name)
	There is a statement like “x = x” in the code. The parser checks for self assignments after performing any text and constant substitutions, so the left and right sides of an assignment may appear to be different at first sight. For example, if the symbol “TWO” is a constant with the value 2, then “var[TWO] = var[2]” is also a self-assignment.
	Self-assignments are, of course, redundant, and they may hide an error (assignment to the wrong variable, error in declaring constants).
	Note that the pawn parser is limited to performing “static checks” only. In this case it means that it can only compare array assignments for self-assignment with constant array indices.
227	more initiallers than enum fields
	An array whose size is declared with an enum symbol contains more values/fields as initializers than the enumeration defines.
228	length of initialler exceeds size of the enum field
	The size of an array is declared with an enum symbol, and the relevant enumeration field has a size. The initializer in the array contains more values than the size of the enumeration field allows.
229	index tag mismatch (symbol name)
	When indexing an array, the expression used as the index has a different tag than the one in the declaration of the array. See pages 29 and 68 for an explanation and examples.
230	no implementation for state name in function name, no fall-back
	A function is lacking an implementation for the indicated state. The compiler cannot (statically) check whether the function will ever be called in that state, and therefore it issues this warning. When the function would be called for the state for which no implementation exists, the abstract machine aborts with a run time error.
	See page 83 on how to specify a fall-back function, and page 44 for a description and an example.
231	state specification on forward declaration is ignored
	A state specification is redundant on forward declarations. The function signature must be equal for all states. Only the implementations of the function are state-specific.
232	compaction buffer overflow
	Compact encoding may in some particular cases result in files that would actually be bigger than the non-compact encoding. The abstract machine cannot handle this, as it unpacks the P-code “in place”. When the compiler detects this situation, it re-builds the file with compact encoding switched off. To avoid this warning, force building the file with plain (“non-compact”) encoding (see page 120).
233	state variable name shadows a global variable
	The state variable has the same name as a global variable (without state specifiers). This means that the global variable is inaccessible for a function with one of the same states as those of the variable.
234	function is deprecated (symbol name)
	The script uses a function which is marked as “deprecated”. The host application can mark (native) functions as deprecated when better alternatives for the function are available or if the function may not be supported in future versions of the host application.
235	call to undeclared public function (symbol name)
	The script defines a public function, but no forward declaration of this function is present. Possibly the function name was written incorrectly. The requirement for forward declarations of public functions guards against a common error.
236	unknown parameter in substitution (incorrect #define pattern)
	A #define pattern contains a parameter in the replacement (e.g. “%1”) that does not appear in the match pattern. See page 93 for the preprocessor syntax.

The compiler

Many applications that embed the PAWN scripting language use the stand-alone compiler that comes with the PAWN toolkit. The PAWN compiler is a command-line utility, meaning that you must run it from a “console window”, a terminal/shell, or a “DOS box” (depending on what your operating system calls it).

• Usage

Assuming that the command-line PAWN compiler is called “pawncc” (Unix/Linux) or “pawncc.exe” (DOS/Windows), the command line syntax is:

pawncc <filename> [more filenames...] [options]

The input file name is any legal filename. If no extension is given, “.pawn” or “.p” is assumed. The compiler creates an output file with, by default, the same name as the input file and the extension “.amx”.

After switching to the directory with the sample programs, the command:

pawncc hello

should compile the very first “hello world” example (page 5). Should, because the command implies that:

the operating system can locate the “pawncc” program —you may need to add it to the search path;
the PAWN compiler is able to determine its own location in the file system so that it can locate the include files —a few operating systems do not support this and require that you use the -i option (see below).

• Input file

The input file for the PAWN compiler, the “source code” file for the script/program, must be a plain text file. All reserved words and all symbol names (names for variables, functions, symbolic constants, tags, ...) must use the ASCII character set. Literal strings, i.e. text between quotes, may be in extended ASCII, such as one of the sets standardized in the ISO 8859 norm; ISO 8859-1 is the well-known “Latin 1” set.

The PAWN compiler also supports UTF-8 encoded text files, which are practical in an environment based on Unicode or UCS-4. The PAWN compiler only recognizes UTF-8 encoded characters inside unpacked strings and character constants. The compiler interprets the syntax rules for UTF-8 files strictly; non-conforming UTF-8 files are not recognized. The input file may have, but does not require, a “Byte Order Mark” signature; the compiler recognizes the UTF-8 format based on the file’s content.
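
To illustrate what recognizing UTF-8 from the file's content can look like, here is a minimal Python sketch. It is not part of the PAWN toolkit, the helper name looks_like_utf8 is hypothetical, and the compiler's actual detection algorithm is not described in this text; the sketch only checks that the bytes are strictly valid UTF-8, with or without a Byte Order Mark.

```python
import codecs


def looks_like_utf8(data: bytes) -> bool:
    """Return True if `data` is strictly valid UTF-8.

    A leading Byte Order Mark is accepted but not required.
    Hypothetical helper: it approximates content-based detection
    and is not the PAWN compiler's own algorithm.
    """
    # Drop the optional UTF-8 signature before validating.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        # Strict decoding rejects any non-conforming byte sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

Plain ASCII text is also valid UTF-8, so a check like this cannot tell the two apart; that is harmless, because both decode to the same characters.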

• Options

Options start with a dash (“-”) or, on Microsoft Windows and DOS, with a forward slash (“/”). In other words, all platforms accept an option written as “-a” (see below for the purpose of this option) and the DOS/Windows platforms accept “/a” as an alternative way to write “-a”.

All options should be separated by at least one space.

Many options accept a value —which is sometimes mandatory. A value may be separated from the option letter by a colon or an equal sign (a “:” and a “=” respectively), or the value may be glued to the option letter. Three equivalent options to set the debug level to two are thus:

-d2
-d:2
-d=2
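
The three spellings can be normalized with a few lines of code. The Python sketch below is only an illustration for tools that wrap the compiler: the function name split_option is hypothetical, and it handles single-letter options only (a two-letter option such as -XD would need extra handling).

```python
def split_option(arg: str) -> tuple[str, str]:
    """Split "-d2", "-d:2" or "-d=2" into ("d", "2").

    Hypothetical helper for illustration; it covers single-letter
    options only and returns an empty value when none is given.
    """
    # Options start with "-" everywhere, or with "/" on DOS/Windows.
    if len(arg) < 2 or arg[0] not in "-/":
        raise ValueError(f"not an option: {arg!r}")
    letter, rest = arg[1], arg[2:]
    # An optional ":" or "=" may separate the letter from its value.
    if rest[:1] in (":", "="):
        rest = rest[1:]
    return letter, rest
```

With this helper, split_option("-d2"), split_option("-d:2") and split_option("-d=2") all yield the same pair.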

The options are:

Option	Description
-a	Assembler: generate a text file with the pseudo-assembler code for the PAWN abstract machine, instead of binary code.
-C+/-	Compact encoding of the binary file, which reduces the size of the output file typically to less than half the original size. Use -C+ to enable it and -C- to revert to “plain” encoding. The option -C (without + or − suffix) toggles the current setting.
-cname	Codepage: set the codepage for translating the source file from extended ascii to Unicode/UCS-4. The default is no translation. The name parameter can specify a full path to a “mapping file” or just the identifier of the codepage —in the latter case, the compiler prefixes the identifier with the letters “cp”, appends the extension “.txt” and loads the mapping file from a system directory.
-Dpath	Directory: the “active” directory, where the compiler should search for its input files and store its output files.
	This option is not supported on every platform. To verify whether the PAWN compiler supports this option, run the compiler without any option or filename on the command line. The compiler will then list its usage syntax and all available options in alphabetical order. If the -D switch is absent, the option is not available.
-dlevel	Debug level: 0 = none, 1 = bounds checking and assertions only, 2 = full symbolic information, 3 = full symbolic information and optimizations disabled (same as the combination -d2 and -O0).
	When the debug level is 2 or 3, the PAWN compiler also prints the estimated amount of stack/heap space required for the program.
-efilename	Error file: set the name of the file into which the compiler must write any warning and error messages; when set, there is no output to the screen.
-Hvalue	“HWND” (Microsoft Windows version only): the compiler can optionally post a message to the specified window handle upon completion of the P-code generation. Host applications that invoke the PAWN compiler can wait for the arrival of this message or signal the user of the completion of the compile.
	The message number that is sent to the window is created with the Microsoft Windows SDK function RegisterWindowMessage using the name “PawnNotify”. The wParam of the message holds the compiler return code: 0 = success, 1 = warnings, 2 = errors (plus possibly warnings), 3 = compilation aborted by the user.
-ipathname	Include path: set the path where the compiler can find the include files. This option may appear multiple times at the command line, to allow you to set several include paths.
-l	Listing: perform only the file reading and preprocessing steps; for example, to verify the effect of the text substitution macros and the conditionally compiled/skipped sections.
-Olevel	Optimization level: 0 = no optimizations; 1 = JIT compatible optimizations only (JIT = “Just In Time” compiler, a high performance abstract machine); 2 = full optimizations.
-ofilename	Output file: set the name and path of the binary output file.
-pfilename	Prefix file: the name of the “prefix file”, a file that is parsed before the input file (as a kind of implicit “include file”). If used, this option overrides the default include file “default.inc”. The -p option on its own (without a filename) disables the processing of any implicit include file.
-rfilename	Report: enable the creation of the report and optionally set the filename to which the extracted documentation and a cross-reference report will be written.
	The report is in “XML” format. The filename parameter is optional; if not specified, the report file has the same name as the input file with the extension “.XML”.
-Svalue	Stack size: the size of the stack and the heap in cells.
-svalue	Skip count: the number of lines to skip in the input file before starting to compile; for example, to skip a “header” in the source file which is not in a valid PAWN syntax.
-tvalue	Tab size: the number of space characters to use for a tab character. When set to zero (i.e. option -t0) the compiler will no longer issue warning 217 (loose indentation).
-vvalue	Verbose: display informational messages during the compilation. The value can be 0 (zero) for “quiet” compile, 1 (one) for the normal output and 2 for a code/data/stack usage report.
-wvalue+/-	Warning control: the warning number following the “-w” is enabled or disabled, depending on whether a “+” or a “-” follows the number. When a “+” or “-” is absent, the warning status is toggled. For example, -w225- disables the warning for “unreachable code”, -w225+ enables it and -w225 toggles between enabled/disabled.
	Only warnings can be disabled (errors and fatal errors cannot be disabled). By default, all warnings are enabled.
-Xvalue	Limit for the abstract machine: the maximum memory requirements that a compiled script may have, in bytes. This value is useful for (embedded) environments where the maximum size of a script is bound to a hard upper limit. If there is no setting for the amount of RAM for the data and stack, this refers to the total memory requirements; if the amount of RAM is explicitly set, this value only gives the amount of memory needed for the code and the static data.
-XDvalue	RAM limit for the abstract machine: the maximum memory requirements for data and stack that a compiled script may have, in bytes. This value is useful for (embedded) environments where the maximum data size of a script is bound to a hard upper limit. Especially in the case where the PAWN script runs from ROM, the sizes for the code and data sections both need to be set.
-\	Control characters start with “\” (for the sake of similarity with C, C++ and Java).
-^	Control characters start with “^” (for compatibility with earlier versions of PAWN).
-;+/-	With -;+ every statement is required to end with a semicolon; with -;-, semicolons are optional to end a statement if the statement is the last on the line. The option -; (without + or − suffix) toggles the current setting.
sym=value	Define the constant “sym” with the given (numeric) value; the value is optional.
@filename	Read (more) options from the specified “response file”.
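
The enable/disable/toggle behavior of the -w option can be modeled as operations on a set. The Python sketch below is an illustration only: the function name apply_warning_option and the idea of tracking a set of disabled warnings are assumptions, not a description of the compiler's internals. It starts from the documented default that all warnings are enabled.

```python
def apply_warning_option(disabled: set[int], arg: str) -> set[int]:
    """Apply one "-w<number>[+|-]" option to a set of disabled warnings.

    Hypothetical model: "+" enables the warning, "-" disables it,
    and no suffix toggles its current status.
    """
    if not arg.startswith("-w"):
        raise ValueError(f"not a warning option: {arg!r}")
    body = arg[2:]
    # The optional trailing "+" or "-" selects enable or disable.
    suffix = body[-1:] if body[-1:] in ("+", "-") else ""
    number = int(body[: len(body) - len(suffix)])
    result = set(disabled)
    if suffix == "+":
        result.discard(number)   # enable: remove from the disabled set
    elif suffix == "-":
        result.add(number)       # disable: add to the disabled set
    else:
        result ^= {number}       # toggle the current status
    return result
```

Starting from an empty set (everything enabled), applying "-w225-" disables warning 225, and applying "-w225" afterwards toggles it back on.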

• Response file

To support operating systems with a limited command line length (e.g., Microsoft DOS), the PAWN compiler supports “response files”. A response file is a text file that contains the options that you would otherwise put at the command line. With the command:

pawncc @opts.txt prog.pawn

the PAWN compiler compiles the file “prog.pawn” using the options that are listed in the response file “opts.txt”.
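
A build script can generate the response file before invoking the compiler. The Python sketch below is a hedged illustration: the function name build_command is hypothetical, and writing one option per line is an assumption about the file layout (the text above only says that the file contains the options). The sketch builds the argument list but does not run the compiler.

```python
from pathlib import Path


def build_command(response_file: Path, options: list[str], source: str) -> list[str]:
    """Write `options` to `response_file` and return the command line.

    Assumption: one option per line is an acceptable layout for a
    response file. The compiler name "pawncc" follows the text above.
    """
    response_file.write_text("\n".join(options) + "\n", encoding="ascii")
    # "@file" tells the compiler to read options from the response file.
    return ["pawncc", f"@{response_file}", source]
```

The returned list can be passed to subprocess.run, after which the return code of the compiler can be checked as described at the start of this appendix.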

• Configuration file

On platforms that support it (currently Microsoft DOS, Microsoft Windows and Linux), the compiler reads the options in a “configuration file” on startup.

The configuration file must have the name “pawn.cfg” and it must reside in the same directory as the compiler executable program.

In a sense, the configuration file is an implicit response file. Options specified on the command line may overrule those in the configuration file.

Rationale

The first issue in the presentation of a new computer language should be: why a new language at all?

Indeed, I did look at several existing languages before I designed my own. Many little languages were aimed at scripting the command shell (TCL, Perl, Python). Other languages were not designed as extension languages, and put the burden of embedding solely on the host application.

As I initially attempted to use Java as an extension language (rather than build my own, as I have done now), the differences between PAWN and Java are illustrative of the almost reciprocal design goals of both languages. For example, Java promotes distributed computing where “packages” reside on diverse machines, whereas PAWN is designed so that the compiled applets can be easily stored in a compound file together with other data. Java is furthermore designed to be architecture neutral and application independent; inversely, PAWN is designed to be tightly coupled with an application. Native functions are taboo to some extent in Java (at least, they are considered “impure”), whereas native functions are “the reason to be” for PAWN. From the viewpoint of PAWN, the intended use of Java is upside down: native functions are seen as an auxiliary library that the application (in Java) uses; in PAWN, native functions are part of “the application” and the PAWN program itself is a set of auxiliary functions that the application uses.

A language for scripting applications: PAWN is targeted as an extension language, meant to write application-specific macros or subprograms with. PAWN is not the appropriate language for implementing business applications or operating systems in. PAWN is designed to be easily integrated with, and embedded in, other systems/applications.

As an extension language, PAWN programs typically manipulate objects of the host application. In an animation system, PAWN scripts deal with sprites, events and time intervals; in a communication application, PAWN scripts handle packets and connections. I assume that the host application will make (a subset of) its resources and functionality available via functions, handles, magic cookies and so on, in much the same way that a contemporary operating system provides an interface to processes written in C/C++; for example, the Win32 API (“handles everywhere”) or GNU/Linux’s “glibc”. To that end, PAWN has a simple and efficient interface to the “native” functions of the host application. A PAWN script manipulates data objects in the host application through function calls, but it cannot access the data of the host application directly.

The first and foremost criteria for the PAWN language were execution speed and reliability. Reliability in the sense that a PAWN program should not be able to crash the application or tool in which it is embedded (at least, not easily). Although this limits the capabilities of the language significantly, the advantages are twofold:

the application vendor can rest assured that its application will not crash due to user additions or macros,
the user is free to experiment with the language with no (or little) risk of damaging the application files.

Speed is essential: PAWN programs would probably run in an abstract machine, and abstract machines are notoriously slow. I had to make a language that has low overhead and a language for which a fast abstract machine can be written. Speed should also be reliable, in the sense that a PAWN script should not slow down over time or have an occasional performance hiccup. Consequently, PAWN excludes any required “background process”, such as garbage collection, and the core of the abstract machine does not implicitly allocate any system or application resources while it runs. That is, PAWN does not allocate memory or open files, at least not without the help of a native function that the script calls explicitly.

As Dennis Ritchie said, by intent the C language confines itself to facilities that can be mapped relatively efficiently and directly to machine instructions. The same is true for PAWN, and this is also a partial explanation of why PAWN looks so much like C. Even though PAWN runs on an abstract machine, the goal is to keep that abstract machine small and quick. PAWN is used in tiny embedded systems with RAM sizes of 32 kiB or less, as well as in high-performance games that need every processor cycle for their graphics engine and game-play. In both environments, heavy-weight scripting support is difficult to swallow.

A brief analysis showed that the instruction decoding logic for an abstract machine would quickly become the bottleneck in the performance of the abstract machine. To keep the decoding simple, each opcode should have the same size (excluding operands), and the opcode should fully specify the instruction (including the addressing methods, size of the operands, etc.). That meant that for each operation on a variable, the abstract machine needed a separate opcode for every combination of variable type, storage class and access method (direct, or dereferenced). For even three types (int, char and unsigned int), two storage classes (global and local) and three access methods (direct, indirect or indexed), a total of 18 opcodes (3 × 2 × 3) are needed to simply fetch the value of a variable.
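
The count of 18 follows from enumerating every combination. The short Python sketch below does exactly that; the function name fetch_variants is hypothetical and only serves the illustration, while the three lists are taken from the paragraph above.

```python
from itertools import product

# The three dimensions named in the text above.
TYPES = ("int", "char", "unsigned int")
STORAGE_CLASSES = ("global", "local")
ACCESS_METHODS = ("direct", "indirect", "indexed")


def fetch_variants() -> list[tuple[str, str, str]]:
    """List every (type, storage class, access method) combination
    that would need its own opcode just to fetch a variable."""
    return list(product(TYPES, STORAGE_CLASSES, ACCESS_METHODS))


if __name__ == "__main__":
    # 3 types x 2 storage classes x 3 access methods
    print(len(fetch_variants()))
```

Every further operation on a variable multiplies the opcode count by the same factor, which is why a typed design runs into an opcode budget so quickly.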

At the same time, to keep the abstract machine small and manageable, I set the goal at approximately 100 instructions.∗ With 18 opcodes to load a variable in a register, 18 more to store a register into a variable, another 18 to get the address of a variable, etc., I was quickly exceeding my self-imposed limit of a hundred opcodes.

The languages BOB and REXX inspired me to design a typeless language. This saved me a lot of opcodes. At the same time, the language could no longer be called a “subset of C”. I was changing the language. Why, then, not go a step further in changing the language? This is where a few more design guidelines came into play:

give the programmer a general purpose tool, not a special purpose solution
avoid error prone language constructs; promote error checking
be pragmatic

A general purpose tool: PAWN is targeted as an extension language, without specifying exactly what it will extend. Typically, the application or the tool that uses PAWN for its extension language will provide many optimized routines or commands to operate on its native objects, be it text, database records or animated sprites. The extension language exists to permit the user to do what the application developer forgot, or decided not to include. Rather than providing a comprehensive library of functions to sort data, match regular expressions, or draw cubic Bézier splines, PAWN should supply a (general purpose) means to use, extend and combine the specific (“native”) functions that an application provides.

PAWN lacks a comprehensive standard library. By intent, PAWN also lacks features like pointers, dynamic memory allocation, direct access to the operating system or to the hardware, that are needed to remain competitive in the field of general purpose application or system programming. You cannot build linked lists or dynamic tree data structures in PAWN, and neither can you access any memory beyond the boundaries of the abstract machine. That is not to say that a PAWN program can never use dynamic, sorted symbol tables, or change a parameter in the operating system; it can do that, but it needs to do so by calling a “native” function that an application provides to the abstract machine.

∗ 136 Opcodes are defined at this writing, plus 20 “macro” opcodes. To exploit performance gains by forcing proper alignment of memory words (essential on ARM microprocessors), the current abstract machine uses 32-bit opcodes. There is no technical limit on the number of opcodes, but in the interest of a small footprint, the number of opcodes should be restricted.

In other words, if an application chooses to implement the well known peek and poke functions (from BASIC) in the abstract machine, a PAWN program can access any byte in memory, insofar as the operating system permits this. Likewise, an application can provide native functions that insert, delete or search symbols in a table and allow several operations on them. The proposed core functions getproperty and setproperty are an example of native functions that build a linked list in the background.

Promote error checking: As you may have noticed, one of the foremost design criteria of the C language, “trust the programmer”, is absent from my list of design criteria. Users of script languages may not be experienced programmers; and even if they are, PAWN will probably not be their primary language. Most PAWN programmers will keep learning the language as they go, and will even after years not have become experts. Enough reason, hence, to replace error prone elements from the C language (pointers) with safer, albeit less general, constructs (references).† References are copied from C++. They are nothing else than pointers in disguise, but they are restricted in various, mostly useful, ways. Turn to a C++ book to find more justification for references.

I find it sad that many, even modern, programming languages have so little built-in, or easy to use, support for confirming that programs do as the programmer intended. I am not referring to theoretical correctness (which is too costly to achieve for anything bigger than toy programs), but practical, easy to use, verification mechanisms as a help to the programmer. PAWN provides both compile time and execution time assertions to use for preconditions, postconditions and invariants.

The typing mechanism that most programming languages use is also an automatic “catcher” of a whole class of bugs. By virtue of being a typeless language, PAWN lacked these error checking abilities. This was clearly a weakness, and I created the “tag” mechanism as an equivalent for verifying function parameter passing, array indexing and other operations.

† You should see this remark in the context of my earlier assertion that many PAWN programmers will be novice programmers. In my (teaching) experience, novice programmers make many pointer errors, as opposed to experienced C/C++ programmers.

The quality of the tools (the compiler and the abstract machine) also has a great impact on the robustness of code, whatever the language. Although this is only very loosely related to the design of the language, I set out to build the tools such that they promote error checking. The warning system of PAWN goes a step beyond simply reporting where the parser fails to interpret the data according to the language grammar. On several occasions, the compiler runs checks that are completely unrelated to generating code and that are implemented specifically to catch possible errors. Likewise, the “debugger hook” is designed right into the abstract machine; it is not an add-on implemented as an afterthought.

Be pragmatic: The object-oriented programming paradigm has not entirely lived up to its promise, in my opinion. On the one hand, OOP solves many tasks in an easier or cleaner way, due to the added abstraction layer. On the other hand, contemporary object-oriented languages leave you struggling with the language as much as with the task at hand. Object-oriented languages are attractive mainly because of the comprehensive class libraries that they come with, but leaning on a standard library goes against one of the design goals for PAWN. Object-oriented programming is not a solution for a non-expert programmer with little patience for artificial complexity. The criterion “be pragmatic” is a reminder to seek solutions, not elegance.

• Practical design criteria

The fact that PAWN looks so much like C cannot be a coincidence, and it isn’t. PAWN started as a C dialect and stayed that way, because C has a proven track record. The changes from C were mostly born out of necessity after rubbing out the features of C that I did not want in a scripting language: no pointers and no “typing” system.

PAWN, being a typeless language, needed a different means to declare variables. In the course of modifying this, I also dropped the C requirement that all variables should be declared at the top of a compound statement. PAWN is a little more like C++ in this respect.

C language functions can pass “output values” via pointer arguments. The standard function scanf, for example, stores the values or strings that it reads from the console into its arguments. You can design a function in C so that it optionally returns a value through a pointer argument; if the caller of the function does not care for the return value, it passes NULL as the pointer value. The standard function strtol is an example of a function that does this. This technique frequently saves you from declaring and passing dummy variables. PAWN replaces pointers with references, but references cannot be NULL. Thus, PAWN needed a different technique to “drop” the values that a function returns via references. Its solution is the use of an “argument placeholder” that is written as an underscore character (“_”); Prolog programmers will recognize it as a similar feature in that language. The argument placeholder reserves a temporary anonymous data object (a “cell” or an array of cells) that is automatically destroyed after the function call.

The temporary cell for the argument placeholder should still have a value, because the function may see a reference parameter as input/output. Therefore, a function must specify for each passed-by-reference argument what value it will have upon entry when the caller passes the placeholder instead of an actual argument. By extension, I also added default values for arguments that are “passed-by-value”. The feature to optionally remove all arguments with default values from the right was copied from C++.

When speaking of BCPL and B, Dennis Ritchie said that C was invented in part to provide a plausible way of dealing with character strings when one begins with a word-oriented language. PAWN provides two options for working with strings, packed and unpacked strings. In an unpacked string, every character fits in a cell. The overhead for a typical 32-bit implementation is large: one character would take four bytes. Packed strings store up to four characters in one cell, at the cost of being significantly more difficult to handle if you could only access full cells. Modern BCPL implementations provide two array indexing methods: one to get a word from an array and one to get a character from an array. PAWN copies this concept, although the syntax differs from that of BCPL. The packed string feature also led to the new operator char.
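
The difference in storage cost can be made concrete with a small model. The Python sketch below is an illustration under stated assumptions, not a description of the abstract machine: it assumes 32-bit cells, single-byte characters, a zero terminator, and that the first character of a packed string occupies the most significant byte of its cell. The function names are hypothetical.

```python
CELL_BYTES = 4  # assumption: a 32-bit cell


def unpack_string(text: str) -> list[int]:
    """Unpacked layout: one character per cell, plus a zero terminator."""
    return [ord(ch) for ch in text] + [0]


def pack_string(text: str) -> list[int]:
    """Packed layout: up to four single-byte characters per cell.

    Assumptions: zero-terminated, and the first character sits in
    the most significant byte of each cell.
    """
    data = text.encode("latin-1") + b"\x00"
    # Pad with zero bytes up to a whole number of cells.
    data += b"\x00" * (-len(data) % CELL_BYTES)
    return [
        int.from_bytes(data[i:i + CELL_BYTES], "big")
        for i in range(0, len(data), CELL_BYTES)
    ]
```

Under these assumptions the four-character string "abcd" needs five cells unpacked but only two cells packed, which shows both the saving and why character access into a packed string needs its own indexing method.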

Unicode applications often have to deal with two character sets: 8-bit for legacy file formats and standardized transfer formats (like many of the Internet protocols) and the 16-bit Unicode character set (or the 31-bit UCS-4 character set). Although the PAWN compiler has an option that makes characters 16-bit (so only two characters fit in a 32-bit cell), it is usually more convenient to store single-byte character strings in packed strings and multi-byte strings in unpacked strings. This turns a weakness in PAWN (the need to distinguish packed strings from unpacked strings) into a strength: PAWN can make that distinction quite easily. And instead of needing two implementations for every function that deals with strings (an ASCII version and a Unicode version; look at the Win32 API, or even the standard C library), PAWN enables functions to handle both packed and unpacked strings with ease.

Notwithstanding the above mentioned changes, plus those in the chapter “Pitfalls: differences from C” (page 134), I have tried to keep PAWN close to C. A final point, which is unrelated to language design, but important nonetheless, is the license: PAWN is distributed under a liberal license allowing you to use and/or adapt the code with a minimum of restrictions (see appendix D).

License

The software toolkit “PAWN” (the compiler, the abstract machine and the documentation) is copyright © 1997–2006 by ITB CompuPhase. The Intel assembler implementation of the abstract machine and the just-in-time compiler (specifically the files amxexec.asm, amxjitr.asm and amxjits.asm) are copyright © 1998–2003 Marc Peter. The file amxjitsn.asm is translated from amxjits.asm and is partially © 2004 G.W.M. Vissers. The file amxexecn.asm is translated from amxexec.asm and is partially © 2004–2005 ITB CompuPhase.

PAWN is distributed under the “zLib/libpng” license, which is reproduced below:

This software is provided “as-is”, without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.

Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1 The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.

2 Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.

3 This notice may not be removed or altered from any source distribution.

The zLib/libpng license has been approved by the “Open Source Initiative” organization.
