offsets listed in the second column are very likely not the true offsets that
the data will be placed at in the complete program. Each module may define
its own labels in the data segment (and the other segments, too). In the link
step (see section 1.4.5), all these data segment label definitions are combined
to form one data segment. The new final offsets are then computed by the
linker.
Here is a small section (lines 54 to 56 of the source file) of the text
segment in the listing file:
94 0000002C A1[00000000]          mov     eax, [input1]
95 00000031 0305[04000000]        add     eax, [input2]
96 00000037 89C3                  mov     ebx, eax
The third column shows the machine code generated by the assembly. Often
the complete code for an instruction cannot be computed yet. For example,
in line 94 the offset (or address) of input1 is not known until the code is
linked. The assembler can compute the op-code for the mov instruction
(which from the listing is A1), but it writes the offset in square brackets
because the exact value cannot be computed yet. In this case, a temporary
offset of 0 is used because input1 is at the beginning of the part of the bss
segment defined in this file. Remember this does not mean that it will be
at the beginning of the final bss segment of the program. When the code
is linked, the linker will insert the correct offset into the position. Other
instructions, like line 96, do not reference any labels. Here the assembler
can compute the complete machine code.
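The patching step performed at link time can be sketched in C. This is only
an illustration under stated assumptions, not a description of how any
particular linker is implemented: the helper name patch_abs32 and the example
base address 0x00403000 are hypothetical. The helper adds the final start
address of this file's part of a segment to the temporary offset that the
assembler left in the machine code. The bytes of the offset are handled least
significant byte first, which is the order explained in the next section.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of any real linker): add `base` to the
 * 32-bit temporary offset stored in code[pos] .. code[pos + 3].
 * The offset is stored least significant byte first.
 */
void patch_abs32(uint8_t *code, size_t pos, uint32_t base)
{
    /* Reassemble the temporary offset written by the assembler. */
    uint32_t value = (uint32_t)code[pos]
                   | ((uint32_t)code[pos + 1] << 8)
                   | ((uint32_t)code[pos + 2] << 16)
                   | ((uint32_t)code[pos + 3] << 24);

    /* Final offset = offset within this file + start of this file's part. */
    value += base;

    /* Write the final offset back into the same position. */
    code[pos]     = (uint8_t)(value & 0xFF);
    code[pos + 1] = (uint8_t)((value >> 8) & 0xFF);
    code[pos + 2] = (uint8_t)((value >> 16) & 0xFF);
    code[pos + 3] = (uint8_t)((value >> 24) & 0xFF);
}
```

For line 94 the code bytes are A1 00 00 00 00 and the offset starts at
position 1. For line 95 the bytes are 03 05 04 00 00 00 and the offset starts
at position 2. With the assumed base address, the offset in line 95 would
become 04 30 40 00.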
Big and Little Endian Representation
If one looks closely at line 95, something seems very strange about the
offset in the square brackets of the machine code. The input2 label is at
offset 4 (as defined in this file); however, the offset that appears in memory
is not 00000004, but 04000000. Why? Different processors store multibyte
integers in different orders in memory. There are two popular methods of
storing integers: big endian and little endian. (Endian is pronounced like
indian.) Big endian is the method that seems the most natural. The biggest
(i.e. most significant) byte is stored first, then the next biggest, etc. For
example, the dword 00000004 would be stored as the four bytes 00 00 00 04.
IBM mainframes, most RISC processors and Motorola processors all use this
big endian method. However, Intel-based processors use the little endian
method! Here the least significant byte is stored first. So, 00000004 is
stored in memory as 04 00 00 00. This format is hardwired into the CPU and
cannot be changed. Normally, the programmer does not need to worry about
which format is used. However, there are circumstances where it is important:
1. When binary data is transferred between different computers (either
from files or through a network).
2. When binary data is written out to memory as a multibyte integer
and then read back as individual bytes or vice versa.
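Both circumstances can be illustrated with a short C sketch. The function
names below (store_be32, store_le32, load_be32, host_is_little_endian) are
made up for this example and do not come from any library. The store and load
functions build or read the bytes one at a time with shifts, so they produce
the same byte order on any processor, which is what is wanted when data moves
between computers. The last function writes a multibyte integer to memory and
reads back its first byte, which is the second circumstance.

```c
#include <stdint.h>
#include <string.h>

/* Store a dword most significant byte first (big endian). */
void store_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Store a dword least significant byte first (little endian). */
void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Read a big endian dword from four bytes, whatever the host format is. */
uint32_t load_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Write an integer to memory, then look at its first byte.
 * Returns 1 on a little endian host and 0 on a big endian host. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

Calling store_be32 with the value 4 fills the buffer with 00 00 00 04, while
store_le32 fills it with 04 00 00 00, matching the two layouts described
above. On an Intel-based processor host_is_little_endian should return 1.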
Endianness does not apply to the order of array elements. The first
element of an array is always at the lowest address. This applies to strings
(which are just character arrays). Endianness still applies to the individual
elements of the arrays.
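The difference between element order and byte order can be seen in a small
C sketch. The function name store_words_le is hypothetical and chosen only
for this example. It lays out an array of 16-bit words the way a little
endian processor would: the elements stay in their original order, but the
two bytes inside each element are stored least significant byte first.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy n 16-bit words into out (which must hold 2 * n bytes).
 * Element i always lands at a lower address than element i + 1;
 * only the bytes inside each element follow the little endian rule.
 */
void store_words_le(uint8_t *out, const uint16_t *words, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = (uint8_t)(words[i] & 0xFF);  /* low byte first */
        out[2 * i + 1] = (uint8_t)(words[i] >> 8);    /* high byte second */
    }
}
```

For the two-element array 1234, ABCD (in hex), the resulting bytes are
34 12 CD AB. A string is an array of one-byte elements, so there is nothing
to reorder inside each element and its characters appear in memory in
reading order.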
1.5 Skeleton File
Figure 1.7 shows a skeleton file that can be used as a starting point for
writing assembly programs.
skel.asm
%include "asm_io.inc"
segment .data
;
; initialized data is put in the data segment here
;

segment .bss
;
; uninitialized data is put in the bss segment
;

segment .text
        global  _asm_main
_asm_main:
        enter   0,0               ; setup routine
        pusha

;
; code is put in the text segment. Do not modify the code before
; or after this comment.
;

        popa
        mov     eax, 0            ; return back to C
        leave
        ret
skel.asm
Figure 1.7: Skeleton Program
Chapter 2
Basic Assembly Language
2.1 Working with Integers
2.1.1 Integer representation
Integers come in two flavors: unsigned and signed. Unsigned integers
(which are non-negative) are represented in a very straightforward binary
manner. The number 200 as a one-byte unsigned integer would be
represented by 11001000 (or C8 in hex).
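The binary form of an unsigned byte can be checked with a few lines of C.
This is a minimal sketch, and the function name byte_to_binary is made up for
the example. It tests each bit from the most significant to the least
significant and writes the matching digit.

```c
#include <stdint.h>

/*
 * Write the eight binary digits of value into out, most significant
 * bit first. out must have room for 9 characters (8 digits plus the
 * terminating null character).
 */
void byte_to_binary(uint8_t value, char out[9])
{
    for (int bit = 7; bit >= 0; bit--) {
        out[7 - bit] = ((value >> bit) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Passing 200 (or, equivalently, 0xC8) gives the string 11001000, the
representation stated above.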
Signed integers (which may be positive or negative) are represented in
more complicated ways. For example, consider -56. +56 as a byte would be
represented by 00111000. On paper, one could represent -56 as -111000,
but how would this be represented in a byte in the computer's memory? How
would the minus sign be stored?
There are three general techniques that have been used to represent