As in most assemblers, each NASM source line contains (unless it is a macro, a preprocessor directive or an assembler directive: see chapter 4 and chapter 6) some combination of the four fields
label: instruction operands ; comment
As usual, most of these fields are optional; the presence or absence of any combination of a label, an instruction and a comment is allowed. Of course, the operand field is either required or forbidden by the presence and nature of the instruction field.
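The allowed combinations can be sketched as follows. This is only an illustration; the label names `start` and `done` are made up for the example.

```nasm
start:                          ; a label and a comment, no instruction
        xor     ax,ax           ; instruction, operands and comment
        nop                     ; an instruction that takes no operands
done:   ret                     ; label, instruction and comment
                                ; a comment on its own
```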
NASM uses backslash (\) as the line continuation character; if a line ends with backslash, the next line is considered to be a part of the backslash-ended line.
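As a small sketch of line continuation, the three physical lines below should be read by NASM as the single logical line `db 'hello, ', 'world', 13, 10`:

```nasm
        ; one logical DB line spread over three physical lines
        db      'hello, ', \
                'world', \
                13, 10
```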
NASM places no restrictions on white space within a line: labels may
have white space before them, or instructions may have no space before
them, or anything. The colon after a label is also optional. (Note that
this means that if you intend to code 
Valid characters in labels are letters, numbers,
The instruction field may contain any machine instruction: Pentium and
P6 instructions, FPU instructions, MMX instructions and even undocumented
instructions are all supported. The instruction may be prefixed by
An instruction is not required to use a prefix: prefixes such as
In addition to actual machine instructions, NASM also supports a number of pseudo-instructions, described in section 3.2.
Instruction operands may take a number of forms: they can be registers,
described simply by the register name (e.g. 
For x87 floating-point instructions, NASM accepts a wide range of syntaxes: you can use two-operand forms like MASM supports, or you can use NASM's native single-operand forms in most cases. For example, you can code:
        fadd    st1             ; this sets st0 := st0 + st1 
        fadd    st0,st1         ; so does this 
        fadd    st1,st0         ; this sets st1 := st1 + st0 
        fadd    to st1          ; so does this
Almost any x87 floating-point instruction that references memory must
use one of the prefixes 
Pseudo-instructions are things which, though not real x86 machine
instructions, are used in the instruction field anyway because that's the
most convenient place to put them. The current pseudo-instructions are
DB 
      db    0x55                ; just the byte 0x55 
      db    0x55,0x56,0x57      ; three bytes in succession 
      db    'a',0x55            ; character constants are OK 
      db    'hello',13,10,'$'   ; so are string constants 
      dw    0x1234              ; 0x34 0x12 
      dw    'a'                 ; 0x61 0x00 (it's just a number) 
      dw    'ab'                ; 0x61 0x62 (character constant) 
      dw    'abc'               ; 0x61 0x62 0x63 0x00 (string) 
      dd    0x12345678          ; 0x78 0x56 0x34 0x12 
      dd    1.234567e20         ; floating-point constant 
      dq    0x123456789abcdef0  ; eight byte constant 
      dq    1.234567e20         ; double-precision float 
      dt    1.234567e20         ; extended-precision float
RESB 
For example:
buffer:         resb    64              ; reserve 64 bytes 
wordvar:        resw    1               ; reserve a word 
realarray       resq    10              ; array of ten reals 
ymmval:         resy    1               ; one YMM register
INCBIN 
    incbin  "file.dat"             ; include the whole file 
    incbin  "file.dat",1024        ; skip the first 1024 bytes 
    incbin  "file.dat",1024,512    ; skip the first 1024, and 
                                   ; actually include at most 512
EQU 
message         db      'hello, world' 
msglen          equ     $-message
defines 
TIMES The 
zerobuf: times 64 db 0
or similar things; but 
buffer: db      'hello, world' 
        times 64-$+buffer db ' '
which will store exactly enough spaces to make the total length of
        times 100 movsb
Note that there is no effective difference between
The operand to 
Note also that 
An effective address is any operand to an instruction which references memory. Effective addresses, in NASM, have a very simple syntax: they consist of an expression evaluating to the desired address, enclosed in square brackets. For example:
wordvar dw      123 
        mov     ax,[wordvar] 
        mov     ax,[wordvar+1] 
        mov     ax,[es:wordvar+bx]
Anything not conforming to this simple system is not a valid memory
reference in NASM, for example 
More complicated effective addresses, such as those involving more than one register, work in exactly the same way:
        mov     eax,[ebx*2+ecx+offset] 
        mov     ax,[bp+di+8]
NASM is capable of doing algebra on these effective addresses, so that things which don't necessarily look legal are perfectly all right:
    mov     eax,[ebx*5]             ; assembles as [ebx*4+ebx] 
    mov     eax,[label1*2-label2]   ; ie [label1+(label1-label2)]
Some forms of effective address have more than one assembled form; in
most such cases NASM will generate the smallest form it can. For example,
there are distinct assembled forms for the 32-bit effective addresses
NASM has a hinting mechanism which will cause
However, you can force NASM to generate an effective address in a
particular form by the use of the keywords 
The form described in the previous paragraph is also useful if you are trying to access data in a 32-bit segment from within 16-bit code. For more information on this see the section on mixed-size addressing (section 10.2). In particular, if you need to access data at a known offset that is too large to fit in a 16-bit value, you must specify that it is a dword offset; otherwise NASM will discard the high word of the offset.
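A minimal sketch of such an access from 16-bit code follows. It assumes that `fs` already selects a 32-bit data segment; `my_offset` is a hypothetical name and value chosen only for illustration.

```nasm
        bits    16

my_offset       equ     0x12345         ; hypothetical offset, too large for 16 bits

        ; the DWORD keyword inside the brackets requests a 32-bit offset,
        ; so the high word of my_offset is kept
        mov     ax,[dword fs:my_offset]
```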
Similarly, NASM will split 
In 64-bit mode, NASM will by default generate absolute addresses. The
NASM understands four different types of constant: numeric, character, string and floating-point.
A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can suffix
Numeric constants can have underscores (
Some examples (all producing exactly the same code):
        mov     ax,200          ; decimal 
        mov     ax,0200         ; still decimal 
        mov     ax,0200d        ; explicitly decimal 
        mov     ax,0d200        ; also decimal 
        mov     ax,0c8h         ; hex 
        mov     ax,$0c8         ; hex again: the 0 is required 
        mov     ax,0xc8         ; hex yet again 
        mov     ax,0hc8         ; still hex 
        mov     ax,310q         ; octal 
        mov     ax,310o         ; octal again 
        mov     ax,0o310        ; octal yet again 
        mov     ax,0q310        ; octal yet again 
        mov     ax,11001000b    ; binary 
        mov     ax,1100_1000b   ; same binary constant 
        mov     ax,1100_1000y   ; same binary constant once more 
        mov     ax,0b1100_1000  ; same binary constant yet again 
        mov     ax,0y1100_1000  ; same binary constant yet again
A character string consists of up to eight characters enclosed in either
single quotes (
The following escape sequences are recognized by backquoted strings:
      \'          single quote (') 
      \"          double quote (") 
      \`          backquote (`) 
      \\          backslash (\) 
      \?          question mark (?) 
      \a          BEL (ASCII 7) 
      \b          BS  (ASCII 8) 
      \t          TAB (ASCII 9) 
      \n          LF  (ASCII 10) 
      \v          VT  (ASCII 11) 
      \f          FF  (ASCII 12) 
      \r          CR  (ASCII 13) 
      \e          ESC (ASCII 27) 
      \377        Up to 3 octal digits - literal byte 
      \xFF        Up to 2 hexadecimal digits - literal byte 
      \u1234      4 hexadecimal digits - Unicode character 
      \U12345678  8 hexadecimal digits - Unicode character
All other escape sequences are reserved. Note that
Unicode characters specified with 
      db `\u263a`            ; UTF-8 smiley face 
      db `\xe2\x98\xba`      ; UTF-8 smiley face 
      db 0E2h, 098h, 0BAh    ; UTF-8 smiley face
A character constant consists of a string up to eight bytes long, used in an expression context. It is treated as if it was an integer.
A character constant with more than one byte will be arranged with little-endian order in mind: if you code
          mov eax,'abcd'
then the constant generated is not 
String constants are character strings used in the context of some
pseudo-instructions, namely the 
A string constant looks like a character constant, only longer. It is treated as a concatenation of maximum-size character constants for the conditions. So the following are equivalent:
      db    'hello'               ; string constant 
      db    'h','e','l','l','o'   ; equivalent character constants
And the following are also equivalent:
      dd    'ninechars'           ; doubleword string constant 
      dd    'nine','char','s'     ; becomes three doublewords 
      db    'ninechars',0,0,0     ; and really looks like this
Note that when used in a string-supporting context, quoted strings are
treated as string constants even if they are short enough to be a
character constant, because otherwise 
The special operators 
For example:
%define u(x) __utf16__(x) 
%define w(x) __utf32__(x) 
      dw u('C:\WINDOWS'), 0       ; Pathname in UTF-16 
      dd w(`A + B = \u206a`), 0   ; String in UTF-32
The UTF operators can be applied either to strings passed to the
Floating-point constants are acceptable only as arguments to
Floating-point constants are expressed in the traditional form: digits,
then a period, then optionally more digits, then optionally an
NASM also supports C99-style hexadecimal floating-point:
Underscores to break up groups of digits are permitted in floating-point constants as well.
Some examples:
      db    -0.2                    ; "Quarter precision" 
      dw    -0.5                    ; IEEE 754r/SSE5 half precision 
      dd    1.2                     ; an easy one 
      dd    1.222_222_222           ; underscores are permitted 
      dd    0x1p+2                  ; 1.0x2^2 = 4.0 
      dq    0x1p+32                 ; 1.0x2^32 = 4 294 967 296.0 
      dq    1.e10                   ; 10 000 000 000.0 
      dq    1.e+10                  ; synonymous with 1.e10 
      dq    1.e-10                  ; 0.000 000 000 1 
      dt    3.141592653589793238462 ; pi 
      do    1.e+4000                ; IEEE 754r quad precision
The 8-bit "quarter-precision" floating-point format is sign:exponent:mantissa = 1:4:3 with an exponent bias of 7. This appears to be the most frequently used 8-bit floating-point format, although it is not covered by any formal standard. This is sometimes called a "minifloat."
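As a worked illustration of the 1:4:3 layout, the values below are exactly representable, so their encodings can be derived by hand. The byte values in the comments come from that hand derivation using the stated bias of 7; they have not been checked against assembler output.

```nasm
        db      1.0             ; sign 0, exponent 0+7 = 0111b, mantissa 000b: 0x38
        db      1.5             ; sign 0, exponent 0+7 = 0111b, mantissa 100b: 0x3C
        db      -2.0            ; sign 1, exponent 1+7 = 1000b, mantissa 000b: 0xC0
```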
The special operators are used to produce floating-point numbers in
other contexts. They produce the binary representation of a specific
floating-point number as an integer, and can be used anywhere integer constants
are used in an expression. 
For example:
      mov    rax,__float64__(3.141592653589793238462)
... would assign the binary representation of pi as a 64-bit floating
point number into 
      mov    rax,0x400921fb54442d18
NASM cannot do compile-time arithmetic on floating-point constants. This is because NASM is designed to be portable - although it always generates code to run on x86 processors, the assembler itself can run on any system with an ANSI C compiler. Therefore, the assembler cannot guarantee the presence of a floating-point unit capable of handling the Intel number formats, and so for NASM to be able to do floating arithmetic it would have to include its own complete set of floating-point routines, which would significantly increase the size of the assembler for very little benefit.
The special tokens 
%define Inf __Infinity__ 
%define NaN __QNaN__ 
      dq    +1.5, -Inf, NaN         ; Double-precision constants
The 
x87-style packed BCD constants can be used in the same contexts as
80-bit floating-point numbers. They are suffixed with
As with other numeric constants, underscores can be used to separate digits.
For example:
      dt 12_345_678_901_245_678p 
      dt -12_345_678_901_245_678p 
      dt +0p33 
      dt 33p
Expressions in NASM are similar in syntax to those in C. Expressions are evaluated as 64-bit integers which are then adjusted to the appropriate size.
NASM supports two special tokens in expressions, allowing calculations
to involve the current assembly position: the 
The arithmetic operators provided by NASM are listed here, in increasing order of precedence.
| The 
^ 
& 
<< >> 
+ - The 
* / // % %% 
NASM, like ANSI C, provides no guarantees about the sensible operation of the signed modulo operator.
Since the 
The highest-priority operators in NASM's expression grammar are those
which only apply to one argument. These are 
A set of additional operators with leading and trailing double
underscores are used to implement the integer functions of the
SEG WRT 
When writing large 16-bit programs, which must be split into multiple
segments, it is often necessary to be able to refer to the segment part of
the address of a symbol. NASM supports the 
The 
        mov     ax,seg symbol 
        mov     es,ax 
        mov     bx,symbol
will load 
Things can be more complex than this: since 16-bit segments and groups
may overlap, you might occasionally want to refer to some symbol using a
different segment base from the preferred one. NASM lets you do this, by
the use of the 
        mov     ax,weird_seg        ; weird_seg is a segment base 
        mov     es,ax 
        mov     bx,symbol wrt weird_seg
to load 
NASM supports far (inter-segment) calls and jumps by means of the syntax
        call    (seg procedure):procedure 
        call    weird_seg:(procedure wrt weird_seg)
(The parentheses are included for clarity, to show the intended parsing of the above instructions. They are not necessary in practice.)
NASM supports the syntax 
To declare a far pointer to a data item in a data segment, you must code
        dw      symbol, seg symbol
NASM supports no convenient synonym for this, though you can always invent one using the macro processor.
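One possible shape for such a macro is sketched below. The names `farptr`, `handler` and `handler_ptr` are invented for this example and are not part of NASM; the sketch also assumes an output format that supports segment references.

```nasm
%macro  farptr 1
        dw      %1, seg %1      ; offset word first, then segment word
%endmacro

handler:                        ; hypothetical routine to point at
        retf

handler_ptr:
        farptr  handler         ; expands to: dw handler, seg handler
```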
STRICT 
When assembling with the optimizer set to level 2 or higher (see
section 2.1.22), NASM will use
size specifiers (
        push dword 33
is encoded in three bytes 
        push strict dword 33
is encoded in six bytes, with a full dword immediate operand
With the optimizer off, the same code (six bytes) is generated whether
the 
Although NASM has an optional multi-pass optimizer, there are some expressions which must be resolvable on the first pass. These are called Critical Expressions.
The first pass is used to determine the size of all the assembled code and data, so that the second pass, when generating all the code, knows all the symbol addresses the code refers to. So one thing NASM can't handle is code whose size depends on the value of a symbol declared after the code in question. For example,
        times (label-$) db 0 
label:  db      'Where am I?'
The argument to 
        times (label-$+1) db 0 
label:  db      'NOW where am I?'
in which any value for the 
NASM rejects these examples by means of a concept called a critical
expression, which is defined to be an expression whose value is
required to be computable in the first pass, and which must therefore
depend only on symbols defined before it. The argument to the
NASM gives special treatment to symbols beginning with a period. A label beginning with a single period is treated as a local label, which means that it is associated with the previous non-local label. So, for example:
label1  ; some code 
.loop 
        ; some more code 
        jne     .loop 
        ret 
label2  ; some code 
.loop 
        ; some more code 
        jne     .loop 
        ret
In the above code fragment, each 
This form of local label handling is borrowed from the old Amiga
assembler DevPac; however, NASM goes one step further, in allowing access
to local labels from other parts of the code. This is achieved by means of
defining a local label in terms of the previous non-local label:
the first definition of 
label3  ; some more code 
        ; and some more 
        jmp label1.loop
Sometimes it is useful - in a macro, for instance - to be able to define
a label which can be referenced from anywhere but which doesn't interfere
with the normal local-label mechanism. Such a label can't be non-local
because it would interfere with subsequent definitions of, and references
to, local labels; and it can't be local because the macro that defined it
wouldn't know the label's full name. NASM therefore introduces a third type
of label, which is probably only useful in macro definitions: if a label
begins with the special prefix 
label1:                         ; a non-local label 
.local:                         ; this is really label1.local 
..@foo:                         ; this is a special symbol 
label2:                         ; another non-local label 
.local:                         ; this is really label2.local 
        jmp     ..@foo          ; this will jump three lines up
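The macro case mentioned above might look like the following sketch. The macro name `mark_resume` and the label `..@resume` are hypothetical. Because the label is fixed, the macro can only be invoked once per source file.

```nasm
%macro  mark_resume 0
..@resume:                      ; special symbol: leaves local-label scope alone
%endmacro

start:
.loop:
        mark_resume             ; defines ..@resume at this point
        dec     cx
        jnz     .loop           ; still refers to start.loop

other:
        jmp     ..@resume       ; reachable by the same name from anywhere
```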
NASM has the capacity to define other special symbols beginning with a
double period: for example,