CS 2011 - Introduction to Machine Organization and Assembly Language
- Bits, Bytes, and Integers
- Binary representations of information

- Electronic devices: represent bits with different voltage levels on a circuit
- Binary to decimal
- 1011 = 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 11
- Conversion from all other number bases to decimal works the same
- Hexadecimal
- Useful in modern systems
- Extends past 9 to A, B, C, D, E, and F
- 1 hex character = ½ byte = 1 nibble = 4 bits
- 0 - F would mean 0000 (0) to 1111 (15)
- 2 hex characters = 1 byte = 8 bits
- Write FA1D37B(16) in C as
- 0xFA1D37B
- 0xfa1d37b
- 0x prefix denotes a hex number — hexadecimal literal
- 0 prefix — octal
- No prefix — decimal
- 0b prefix: binary (a GCC/Clang extension, standardized in C23)
- Programmers - 64 Bit Calculator
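The positional rule above (each digit times a power of the base) can be sketched in C. The helper name `parse_digits` is made up for illustration; the standard library's `strtol` does the same job with error reporting.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of a digit string in the given base (2..16).
 * Returns -1 on an invalid digit. No overflow checking; illustration only. */
long parse_digits(const char *s, int base)
{
    long value = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        int c = tolower((unsigned char)s[i]);
        int digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else
            return -1;
        if (digit >= base)
            return -1;
        value = value * base + digit;   /* move earlier digits up one place */
    }
    return value;
}
```

For example, `parse_digits("1011", 2)` follows the same steps as the 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 expansion.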
- Byte-oriented memory organization

- Programs refer to virtual addresses
- Conceptually, very large array of bytes
- Actually, implemented with hierarchy of different memory types
- System provides address space private to particular “process”
- Program being executed
- Program can clobber its own data, but not that of others
- Compiler + run-time system control allocation
- Where different program objects should be stored
- All allocation within single virtual address space
- Machine words
- Machine has “word size”
- A unit of data that a machine processes (transfers between CPU and RAM) in one operation
- Nominal size of pointer data
- Determinant of the max size of the address space
- Hardware deals with the memory a word size at a time
- Older machines use 32 bits (4 bytes) words
- Limits addresses to 4GB
- Becoming too small for memory-intensive applications
- Many systems use 64 bits (8 bytes) words including X86-64
- Potential address space ≈ 1.8 × 10^19 bytes
- x86-64 machines support 48-bit addresses: 256 Terabytes (2015)
- Machines support multiple data formats
- Fractions or multiples of word size
- Always integral number of bytes
- Word-oriented memory organization

- Addresses specify byte locations
- Address of first byte in word
- Addresses of successive words differ by 2 (16-bit), 4 (32-bit), or 8 (64-bit)
- Data representations

- Byte ordering
- How should bytes within a multi-byte data format / word be ordered in memory?
- Every byte has a unique address in memory. Endianness is the order in which the bytes of a multi-byte value are stored
- Conventions
- Big Endian: Sun, PPC Mac, Internet
- Least significant byte has highest address
- Example: x has 4-byte representation 0x01234567
- Address given by &x is 0x100

- Little Endian: x86
- Least significant byte has lowest address
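A minimal sketch for checking which convention the current machine uses, reusing the 0x01234567 example value. The function names are illustrative, not from any library.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of x into out[0..3], lowest address first. */
void bytes_in_memory(uint32_t x, unsigned char out[4])
{
    memcpy(out, &x, sizeof x);
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x01234567u, b);
    return b[0] == 0x67;   /* least significant byte at the lowest address? */
}
```

On x86-64 the bytes come out as 67 45 23 01; on a big-endian machine they would be 01 23 45 67.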

- Network byte order (big endian) differs from the byte order of consumer chips (x86, little endian) → the network stack handles the conversion
- Reading byte-reversed listings
- Assembly language instructions are human-friendly names for specific CPU operations
- Machine code is the binary representation of those instructions
- Every assembly instruction maps to a corresponding machine code rendition
- Disassembly: machine code → assembly
- Text representation of binary machine code
- Generated by program that reads the machine code
- Example Fragment

- Deciphering Numbers
- Value: 0x12ab
- Pad to 32 bits: 0x000012ab
- Split into bytes: 00 00 12 ab
- Reverse: ab 12 00 00
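The pad, split, and reverse steps above can be reproduced with shifts and masks. This is a sketch with an illustrative function name; it produces the little-endian byte listing regardless of the host's own byte order.

```c
#include <stdint.h>

/* Write the four bytes of value in little-endian order (least significant
 * byte first), independent of the host machine's byte order. */
void to_little_endian(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFF);
}
```

Calling it with 0x12ab fills `out` with ab 12 00 00, matching the listing above.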
- Representing integers

- Representing pointers — memory addresses of other variables

- Representing strings
- Strings in C
- Represented by array of characters
- Each character encoded in ASCII format
- Standard 7-bit encoding of character set
- Character “0” has code 0x30
- Digit i has code 0x30+i
- String should be null-terminated
- Final character = 0
- Compatibility - Byte ordering not an issue
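A short sketch of the two string facts above, assuming an ASCII execution character set (true on essentially all current systems). Both function names are invented for illustration.

```c
#include <string.h>

/* Character code of decimal digit i (0..9) in ASCII: 0x30 + i. */
char digit_char(int i)
{
    return (char)(0x30 + i);
}

/* Bytes a string occupies in memory, counting the terminating 0 byte. */
size_t stored_size(const char *s)
{
    return strlen(s) + 1;
}
```

The string "18213" therefore takes six bytes: 31 38 32 31 33 00, in that order on any machine.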

- Bit-level manipulations
- Boolean algebra
- Operate on bit vectors
- Operations applied bitwise

- All properties of Boolean algebra apply
- Representation
- Width w bit vector represents subsets of {0, …, w−1}
- a_j = 1 if j ∈ A
- 01101001 { 0, 3, 5, 6 }
- 76543210
- 01010101 { 0, 2, 4, 6 }
- 76543210
- Operations
- Bit-level
- & Intersection 01000001 { 0, 6 }
- | Union 01111101 { 0, 2, 3, 4, 5, 6 }
- ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
- ~ Complement 10101010 { 1, 3, 5, 7 }
- Available in C
- Apply to any “integral” data type
- long, int, short, char, unsigned
- View arguments as bit vectors
- Arguments applied bit-wise
- Logical
- &&, ||, !
- View 0 as “False”
- Anything nonzero as “True”
- Always return 0 or 1
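The set view of bit vectors and the bitwise/logical distinction can be sketched as follows, using the two 8-bit examples from above (01101001 = 0x69 and 01010101 = 0x55). The function names are illustrative.

```c
#include <stdint.h>

/* Sets over {0,...,7} stored as 8-bit vectors: bit j is 1 when j is a member. */
uint8_t set_intersection(uint8_t a, uint8_t b) { return a & b; }
uint8_t set_union(uint8_t a, uint8_t b)        { return a | b; }
uint8_t set_symdiff(uint8_t a, uint8_t b)      { return a ^ b; }
uint8_t set_complement(uint8_t a)              { return (uint8_t)~a; }

/* Membership test: the logical !! collapses any nonzero result to 1. */
int set_contains(uint8_t a, int j) { return !!(a & (1u << j)); }
```

Note the contrast: `0x69 & 0x55` is 0x41 (a bit vector), while `0x69 && 0x55` is just 1 (both operands are nonzero).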
- Shift
- Left Shift: x << y
- Shift bit-vector x left y positions
- Throw away extra bits on left
- Fill with 0’s on right
- Right Shift: x >> y
- Shift bit-vector x right y positions
- Throw away extra bits on right
- Logical shift
- Fill with 0’s on left
- Arithmetic shift
- Replicate most significant bit on left
- Undefined Behavior
- Shift amount < 0 or ≥ word size
- Different machines behave differently
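A sketch of the two right-shift flavors on 8-bit vectors. In C, `>>` on an unsigned value is always logical; `>>` on a negative signed value is implementation-defined (usually arithmetic), so the arithmetic version here is built from unsigned operations only. Function names are illustrative.

```c
#include <stdint.h>

/* Logical right shift: zeros enter on the left (what C does for unsigned). */
uint8_t shift_right_logical(uint8_t x, unsigned k)
{
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: copies of the most significant bit enter on the
 * left. Requires 0 <= k <= 7. */
uint8_t shift_right_arithmetic(uint8_t x, unsigned k)
{
    uint8_t logical = (uint8_t)(x >> k);
    uint8_t fill = (x & 0x80) ? (uint8_t)(0xFFu << (8 - k)) : 0;
    return logical | fill;
}
```

With x = 10100010 and k = 2, the logical shift gives 00101000 and the arithmetic shift gives 11101000.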
- Integers
- Representation: unsigned and signed
- Encoding integers
- Unsigned

- Sign magnitude

- Two’s complement
- Sign bit: the most significant bit
- 0 for non-negative
- 1 for negative


- Numeric ranges

- Unsigned
- UMin = 0 — 000…0
- UMax = 2^w - 1 — 111…1
- Two’s complement
- TMin = -2^(w - 1) — 100…0
- TMax = 2^(w-1) -1 — 011…1
- -1 — 111…1
- Observations
- |TMin| = TMax + 1
- Asymmetric range
- UMax = 2 * TMax + 1
- C Programming
- #include <limits.h>
- Declares constants
- ULONG_MAX
- LONG_MAX
- LONG_MIN
- Values platform specific
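The observations about the ranges can be checked against the `<limits.h>` constants. This sketch uses the `int` constants and assumes a two's-complement platform with no padding bits, which covers all mainstream systems; the function names are made up.

```c
#include <limits.h>

/* Each function returns 1 when the stated relation holds on this platform. */

/* |TMin| = TMax + 1, written so that no intermediate value overflows. */
int tmin_is_one_past_tmax(void)
{
    return -(INT_MIN + 1) == INT_MAX;
}

/* UMax = 2 * TMax + 1 */
int umax_is_twice_tmax_plus_one(void)
{
    return UINT_MAX == 2u * (unsigned)INT_MAX + 1u;
}

/* -1 has the all-ones bit pattern, i.e. it converts to UMax. */
int minus_one_is_all_ones(void)
{
    return (unsigned)-1 == UINT_MAX;
}
```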
- Properties
- Equivalence
- Same encodings for nonnegative values
- Uniqueness
- Every bit pattern represents unique integer value
- Each representable integer has unique bit encoding
- Can Invert Mappings
- U2B(x) = B2U^-1(x)
- Bit pattern for unsigned integer
- T2B(x) = B2T^-1(x)
- Bit pattern for two’s comp integer
- Conversion, casting
- Keep same bit representations and reinterpret



- C Programming
- Constants
- Default is signed integers
- Unsigned if have “U” as suffix
- 4294967259U
- Casting
- Explicit casting

- Implicit casting

- Expression containing signed and unsigned int → the signed int is implicitly converted to unsigned (!)
- Common source of bugs
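A sketch of the classic bug: comparing a negative `int` with an `unsigned`. The function names are illustrative; the second function shows one way to compare by mathematical value instead.

```c
/* Returns 1 if a < b after C's usual arithmetic conversions. Because b is
 * unsigned, a is converted to unsigned before the comparison. */
int less_than_mixed(int a, unsigned b)
{
    return a < b;
}

/* Compares the mathematical values instead. */
int less_than_by_value(int a, unsigned b)
{
    return a < 0 || (unsigned)a < b;
}
```

`less_than_mixed(-1, 0u)` is 0 because -1 becomes UMax, whereas `less_than_by_value(-1, 0u)` is 1.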

- Expanding, truncating
- Expanding
- Unsigned: zeroes added
- Sign extension: Convert w-bit signed integer x to w+k bit integer with same value
- Make k copies of sign bit

- C automatically performs sign extension
- Truncating
- For unsigned numbers
- Equivalent to dividing by 2^k and keeping the remainder
- truncate(x, k) = x mod 2^k
- For signed numbers
- Same bit result, but the truncated number may have a different sign (!)
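Expansion and truncation can be sketched with fixed-width types. Results are returned as unsigned bit patterns so every conversion is well defined in C; the function names are illustrative.

```c
#include <stdint.h>

/* Widening a signed value copies the sign bit into the new high bits. */
uint32_t sign_extend_16_to_32(int16_t x)
{
    return (uint32_t)(int32_t)x;
}

/* Widening an unsigned value fills the new high bits with zeros. */
uint32_t zero_extend_16_to_32(uint16_t x)
{
    return (uint32_t)x;
}

/* Truncating an unsigned value to k bits (0 < k < 32) is x mod 2^k. */
uint32_t truncate_bits(uint32_t x, unsigned k)
{
    return x & ((UINT32_C(1) << k) - 1);
}
```

For instance, -12345 is CF C7 in 16 bits and FF FF CF C7 in 32 bits, while the unsigned pattern CF C7 widens to 00 00 CF C7.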

- Addition, negation, multiplication, shifting
- Negation: complement & increment
- Claim: following holds for two’s complement
- ~x + 1 == -x
- Complement
- Observation: ~x + x == 1111…111 == -1
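The complement-and-increment claim can be checked directly. This sketch works on unsigned 32-bit patterns so that every step is well defined in C, including the TMin pattern; the function name is illustrative.

```c
#include <stdint.h>

/* Two's-complement negation as complement-and-increment, done on the raw
 * bit pattern. */
uint32_t negate_bits(uint32_t x)
{
    return ~x + 1u;
}
```

One consequence worth noticing: the TMin pattern 0x80000000 negates to itself, which is the asymmetric range showing up again.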
- Unsigned addition
- Operands: w bits
- True sum: w+1 bits
- Discard carry: w bits

- Standard addition function
- Ignores carry output
- Implement modular arithmetic
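A sketch of unsigned addition with w = 8, chosen so the discarded carry is easy to see. The function names are illustrative.

```c
#include <stdint.h>

/* w = 8: the true sum needs 9 bits; keeping the low 8 gives (a + b) mod 256. */
uint8_t uadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* The carry was discarded exactly when the result is smaller than an operand. */
int uadd8_overflowed(uint8_t a, uint8_t b)
{
    return uadd8(a, b) < a;
}
```

200 + 100 has true sum 300; after the carry is dropped the result is 300 mod 256 = 44.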

- Mathematical properties
- Two’s complement addition
- Operands: w bits
- True sum: w+1 bits
- Discard carry: w bits

- TAdd and UAdd have Identical Bit-Level Behavior
- Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v;
- Will give s == t
- Multiplication — shifting & adding

- Works the same for signed and unsigned
- Same bit pattern
- Different interpretation
- Different overflows
- Performance
- 10 or more machine cycles
- Multiply-and-Add instruction
- Easily pipelined
- Split into separate operations, processed in parallel
- Compiler optimizations
- Small, constant multipliers
- E.g., array indexes
- Shift instructions followed by adds
- Specialized instructions
- Power-of-2 Multiply with Shift
- Operation
- u << k gives u * 2^k
- Both signed and unsigned
- Most machines shift and add faster than multiply
- gcc generates this code automatically
- x*12 → (x+x*2) << 2
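The shift-and-add rewrite can be written out by hand to confirm it matches ordinary multiplication. Unsigned types are used so that wraparound is well defined; the function names are illustrative.

```c
#include <stdint.h>

/* u * 2^k via a left shift (0 <= k < 32). */
uint32_t mul_pow2(uint32_t u, unsigned k)
{
    return u << k;
}

/* x * 12 as (x + 2x) << 2, the shift-and-add form quoted above. */
uint32_t mul12(uint32_t x)
{
    return (x + (x << 1)) << 2;
}
```

Both sides agree even when the product overflows, because each wraps modulo 2^32.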
- Division: subtracting and shifting
- Too many edge cases if using integers → limitations of integers → float
- Floating Point
- Background: Fractional binary numbers

- Representation
- Bits to right of “binary point” represent fractional powers of 2
- Represents rational number:

- Observations
- Divide by 2 by shifting right
- Multiply by 2 by shifting left
- Numbers of form 0.111111…₂ are just below 1.0
- 1/2 + 1/4 + 1/8 + … + 1/2^i + … → 1.0
- Use notation 1.0 − ε
- Representable numbers
- Limitations
- Can only exactly represent numbers of the form x/2^k
- Other rational numbers have repeating bit representations
- Many, many bits needed for very large or small numbers with fixed binary point
- IEEE floating point standard: Definition
- A way to approximate real and most rational numbers in computers
- Examples:
- 3.14159265358979323846 --- pi
- 2.99792458 × 10^8 m/s --- c, the velocity of light
- 6.62606885 × 10^-27 erg sec --- h, Planck’s constant
- In C (and most other programming languages):
- 3.14159265358979323846
- 2.99792458e8
- 6.62606885e-27
- IEEE Standard 754
- Established in 1985 as uniform standard for floating point arithmetic
- Now supported by all major processors
- Driven by numerical concerns
- Nice standards for rounding, overflow, underflow
- Difficult to make fast in hardware
- Numerical analysts predominated over hardware designers in defining standard
- Representation
- Numerical form: (−1)^s × M × 2^E
- Sign bit s determines whether number is negative or positive
- Significand M normally a fractional value in range [1.0, 2.0) (for normalized values)
- Exponent E weights value by power of two
- Encoding
- MSB s is sign bit s
- exp field encodes E (but is not equal to E)
- frac field encodes M (but is not equal to M)

- Precisions
- Single precision: 32 bits
- 1 sign bit, 8 exp bits, 23 frac bits
- Double precision: 64 bits
- 1 sign bit, 11 exp bits, 52 frac bits
- Extended precision: 80 bits (Intel only)
- 1 sign bit, 15 exp bits, 63 or 64 frac bits
- Normalized values
- Condition: exp != 000…0 and exp != 111…1
- Exponent coded as biased value: E = Exp - Bias
- Exp: unsigned value exp
- Bias = 2^(k - 1) - 1, where k is the number of exponent bits
- Single precision: 127 (Exp: 1…254, E: -126…127)
- Double precision: 1023 (Exp: 1…2046, E: -1022…1023)
- Significand coded with implied leading 1: M = 1.xxx…x₂
- xxx…x: bits of frac
- Minimum when 000…0 (M = 1.0)
- Maximum when 111…1 (M = 2.0 − ε)
- Get extra leading bit for “free”

- Denormalized values
- Condition: exp = 000…0
- Exponent value: E = 1 − Bias (instead of E = 0 − Bias)
- Significand encoded with implied leading 0: M = 0.xxx…x₂
- xxx…x: bits of frac
- Cases
- exp = 000…0, frac = 000…0
- Represents zero value
- Note distinct values: +0 and −0 (why?)
- exp = 000…0, frac ≠ 000…0
- Numbers very close to 0.0
- Lose precision as get smaller
- Equispaced

- Special values
- Condition: exp = 111…1
- Case: exp = 111…1, frac = 000…0
- Represents value ∞ (infinity)
- Operation that overflows
- Both positive and negative
- E.g., 1.0/0.0 = −1.0/−0.0 = +∞, 1.0/−0.0 = −∞
- Case: exp = 111…1, frac ≠ 000…0
- Not-a-Number (NaN)
- Represents case when no numeric value can be determined
- E.g., sqrt(−1), ∞ − ∞, ∞ × 0
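The three encoding cases (normalized, denormalized, special) can be told apart by pulling the fields out of a `float`. This is a sketch that assumes `float` is IEEE 754 single precision (1, 8, 23 bits), which the C standard does not strictly guarantee; the struct and function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision value (assumes IEEE 754 binary32). */
struct float_fields {
    unsigned sign;  /* 1 bit   */
    unsigned exp;   /* 8 bits  */
    uint32_t frac;  /* 23 bits */
};

struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields out;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the bits, no conversion */
    out.sign = bits >> 31;
    out.exp  = (bits >> 23) & 0xFF;
    out.frac = bits & 0x7FFFFF;
    return out;
}

/* Returns 'N' normalized, 'D' denormalized (including zero),
 * 'I' infinity, 'Q' NaN. */
char classify_float(float f)
{
    struct float_fields p = split_float(f);
    if (p.exp == 0)
        return 'D';
    if (p.exp == 0xFF)
        return p.frac == 0 ? 'I' : 'Q';
    return 'N';
}
```

For 1.0f the fields are sign 0, exp 127, frac 0: E = 127 − Bias = 0 and M = 1.0 thanks to the implied leading 1.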

- Properties
- FP zero same as integer zero
- All bits = 0
- Can (almost) use unsigned integer comparison
- Must first compare sign bits
- Must consider −0 = 0
- NaNs problematic
- Will be greater than any other values
- What should comparison yield?
- Otherwise OK
- Denorm vs. normalized
- Normalized vs. infinity
- Rounding, addition, multiplication
- Floating point operations
- Basic idea
- First compute exact result
- Make it fit into desired precision
- Possibly overflow if exponent too large
- Possibly round to fit into frac

- Rounding
- Rounding modes (illustrate with $ rounding)
- Towards zero, round down, round up, nearest even (default)
- Round-to-even
- Default Rounding Mode
- Hard to get any other kind without dropping into assembly
- All others are statistically biased
- Sum of set of positive numbers will consistently be over- or under-estimated
- Rounding binary numbers
- Binary Fractional Numbers
- “Even” when least significant bit is 0
- “Half way” when bits to right of rounding position = 100…₂
- Multiplication

- Exact Result: (−1)^s × M × 2^E
- Sign s: s1 ^ s2
- Significand M: M1 × M2
- Exponent E: E1 + E2
- Fixing
- If M ≥ 2, shift M right, increment E
- If E out of range, overflow
- Round M to fit frac precision
- Implementation
- Biggest chore is multiplying significands
- Addition


- Fixing
- If M ≥ 2, shift M right, increment E
- If M < 1, shift M left k positions, decrement E by k
- Overflow if E out of range
- Round M to fit frac precision
- Floating point in C
- C guarantees two levels
- float: single precision
- double: double precision
- Conversions/casting
- Casting between int, float, and double changes bit representations
- double/float → int
- Truncate fractional part
- Not defined when out-of-range, NaN, etc.
- int → double
- Exact conversion for numbers that fit into ≤ 53 bits
- int → float
- Round according to rounding mode
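A sketch of the conversion rules, assuming IEEE 754 `float` with a 24-bit significand and the default round-to-nearest-even mode. The function names are illustrative; `volatile` is there only to force the value to be stored at `float` precision on targets that keep extra precision in registers.

```c
/* int -> float rounds when the value needs more than 24 significant bits. */
float int_to_float(int i)
{
    volatile float f = (float)i;   /* force storage as a real float */
    return f;
}

/* double -> int drops the fractional part (rounds toward zero). */
int double_to_int(double d)
{
    return (int)d;
}
```

2^24 + 1 = 16777217 sits halfway between two representable floats and rounds to the even one, 16777216; 16777219 likewise rounds to 16777220.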
- Machine-Level Programming
- Architecture
- Design of the computer
- Instruction Set Architecture: The parts of a processor design that one needs to understand to write assembly/machine code
- Also known as the ISA
- I.e., specification of instruction formats, actions, registers, etc.
- Microarchitecture: Implementation of the architecture
- Examples: cache sizes and core frequency.
- Code Forms
- Machine Code: The byte-level programs that a processor executes
- Assembly Code: A text representation of machine code
- Example ISAs
- Intel: x86, IA32, Itanium, x86-64
- ARM: Used in almost all mobile phones, many tablets
- Assembly/Machine Code View

- Programmer-visible state
- PC: Program counter
- Address of next instruction
- Called “RIP” (x86-64 instruction pointer)
- Register file
- Heavily used program data
- Fast, easily accessible storage for the data the program is currently working with
- Condition codes
- Store status information about most recent arithmetic or logical operation
- Used for conditional branching
- Determine which instruction to run next based on the result of a previous operation
- Memory
- Larger-scale storage (than registers) for the program’s code and data.
- Byte addressable array
- Code and user data
- Code stores the instructions to be executed
- Data stores the information the program is working with
- Stack to support procedures (aka functions)
- CPU
- Archaic term for “Processor”
- “Central Processing Unit”
- Execution Model for Modern Processors: von Neumann cycle (fetch-decode-execute)

- Fetch: The program counter (PC) points to the memory address of the next instruction to be executed. The CPU fetches this instruction from memory.
- Decode: The CPU decodes the instruction and gets any data it needs (possibly from memory).
- Access Memory: If the instruction requires data from memory (e.g. a load or store operation), the CPU uses the address specified in the instruction to access the desired memory location.
- Write to Register: Once the data is retrieved from memory, the CPU writes that data into one of the registers in the register file. This allows the program to use that data as part of the current operation.
- Repeat.
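The cycle can be sketched as a loop over a toy machine. Everything about this machine (the three-byte instruction format, the opcodes, four registers) is invented for illustration and does not correspond to x86-64 or any real ISA.

```c
#include <stddef.h>

/* Toy machine, invented for illustration only (not a real ISA).
 * Each instruction is 3 bytes: opcode, register number, operand. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

/* Runs the program and returns the final value of register 0. */
long run_toy(const unsigned char *memory)
{
    long reg[4] = {0};
    size_t pc = 0;                        /* program counter */

    for (;;) {
        /* Fetch: read the instruction the PC points to. */
        unsigned char opcode  = memory[pc];
        unsigned char r       = memory[pc + 1];
        unsigned char operand = memory[pc + 2];
        pc += 3;                          /* PC now names the next instruction */

        /* Decode and execute, writing the result to a register. */
        switch (opcode) {
        case OP_LOADI:                    /* reg[r] = constant */
            reg[r & 3] = operand;
            break;
        case OP_ADD:                      /* reg[r] += reg[operand] */
            reg[r & 3] += reg[operand & 3];
            break;
        default:                          /* OP_HALT or unknown opcode */
            return reg[0];
        }
    }
}
```

A program of four instructions (load 5 into r0, load 7 into r1, add r1 to r0, halt) leaves 12 in register 0.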
- Turning C into Object Code

- Compiling Into Assembly

- Assembly Characteristic: Data Types
- “Integer” data of 1, 2, 4, or 8 bytes
- Data values
- Addresses (untyped pointers)
- Floating point data of 4, 8, or 10 bytes
- Code: Byte sequences encoding series of instructions
- No aggregate types such as arrays or structures
- Just contiguously allocated bytes in memory
- Assembly Characteristics: Operations
- Move/copy data between memory and register
- Load data from memory into register
- Store register data into memory
- Perform arithmetic or logical function on register or memory data
- Transfer control
- Unconditional jumps to/from procedures
- Conditional branches
- Object Code
- Assembler
- Translates .s into .o
- Binary encoding of each instruction
- Nearly-complete image of executable code
- Missing: linkages between code of different files
- Linker
- Resolves references between files
- Combines with static run-time libraries
- E.g., code for malloc, printf
- Some libraries are dynamically linked
- Linking occurs when program begins execution
- Machine Instruction Example

- Disassembling Object Code

- Disassembler
- objdump -d sum
- Useful tool for examining object code
- Analyzes bit pattern of series of instructions
- Produces approximate rendition of assembly code
- Can dump either a.out file (complete executable) or .o file (single module)
- What can be disassembled?
- Anything that can be interpreted as executable code
- Disassembler examines bytes and reconstructs assembly source
- Assembly basics: registers, operands, move


- Moving Data
- movq Source, Dest
- Allows CPU to
- load values from memory (can be code, data, stack, or heap memory) into registers
- store register values back into memory
- Operand types
- Immediate: Constant integer data
- Example: $0x400, $-533
- Like C constant, but prefixed with ’$’
- Encoded with 1, 2, or 4 bytes
- Register: One of 16 integer registers
- Example: %rax, %r13
- But %rsp reserved for special use
- Others have special uses for particular instructions
- Memory: 8 consecutive bytes of memory at the address given by the register
- Simplest example: (%rax)
- Various other “address modes”
- Direct addressing: movq 0x1000, %rax (moves the 64-bit value at address 0x1000 into rax)
- Indirect addressing: movq (%rax), %rbx (moves the 64-bit value at the address stored in rax into rbx)
- Base-plus-offset addressing: movq 8(%rax), %rbx (moves the 64-bit value at the address rax + 8 into rbx)
- Operand Combinations

- Simple Memory Addressing Modes
- Normal (R) — Mem[Reg[R]]
- Register R specifies memory address
- Aha! Pointer dereferencing in C
- movq (%rcx),%rax
- Displacement D(R) — Mem[Reg[R]+D]
- Register R specifies start of memory region
- Constant displacement D specifies offset
- movq 8(%rbp),%rdx
- Complete Memory Addressing Modes
- Most General Form
- D(Rb,Ri,S) — Mem[Reg[Rb]+S*Reg[Ri]+ D]
- D: Constant “displacement” 1, 2, 4, 8 … bytes
- Rb: Base register: Any of 16 integer registers
- Ri: Index register: Any, except for %rsp
- S: Scale: 1, 2, 4, or 8 (why these numbers?)
- Special Cases
- (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]
- D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D]
- (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]
- Examples
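The general form D(Rb,Ri,S) is plain arithmetic, which a small C function can mirror. The function name and the register values used below (0xf000 and 0x0100) are illustrative assumptions, not taken from a real trace.

```c
#include <stdint.h>

/* Address named by D(Rb,Ri,S): Reg[Rb] + S * Reg[Ri] + D.
 * Pass 0 for a register the operand omits, and S = 1 when no scale is given. */
uint64_t effective_address(int64_t d, uint64_t rb, uint64_t ri, unsigned s)
{
    return rb + (uint64_t)s * ri + (uint64_t)d;
}
```

With %rdx = 0xf000 and %rcx = 0x0100: 0x8(%rdx) is 0xf008, (%rdx,%rcx) is 0xf100, (%rdx,%rcx,4) is 0xf400, and 0x80(,%rdx,2) is 0x1e080.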

- Arithmetic & logical operations
- leaq Src, Dst (Load Effective Address)
- Src is address mode expression
- Set Dst to address denoted by expression
- Uses
- Computing addresses without a memory reference
- E.g., translation of p = &x[i];
- Computing arithmetic expressions of the form x + k*y
- k = 1, 2, 4, or 8
- Example
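Two small C functions whose bodies have the shape leaq is suited to. A compiler targeting x86-64 may translate each into a single leaq (for the first, something like `leaq (%rdi,%rsi,8), %rax`), though the exact output depends on compiler and flags. The function names are illustrative.

```c
#include <stddef.h>

/* p = &x[i]: pure address arithmetic, x[i] itself is never read. */
long *element_address(long *x, size_t i)
{
    return &x[i];
}

/* x + k*y with k = 4: the form a single leaq can compute. */
long scaled_sum(long x, long y)
{
    return x + 4 * y;
}
```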

- Some arithmetic instructions
- Format: Computation
- addq Src,Dest: Dest = Dest + Src
- subq Src,Dest: Dest = Dest − Src
- imulq Src,Dest: Dest = Dest * Src
- salq Src,Dest: Dest = Dest << Src (also called shlq)
- sarq Src,Dest: Dest = Dest >> Src (arithmetic)
- shrq Src,Dest: Dest = Dest >> Src (logical)
- xorq Src,Dest: Dest = Dest ^ Src
- andq Src,Dest: Dest = Dest & Src
- orq Src,Dest: Dest = Dest | Src
- Watch out for argument order!
- No distinction between signed and unsigned int
- One-operand instructions
- incq Dest: Dest = Dest + 1
- decq Dest: Dest = Dest − 1
- negq Dest: Dest = − Dest
- notq Dest: Dest = ~Dest
- Assembly instructions - suffixes
- “b” applies to byte quantities (i.e., 8-bit operands)
- “w” applies to word quantities (i.e., 16-bit operands)
- “l” applies to long quantities (i.e., 32-bit operands)
- “q” applies to quad-word quantities (i.e., 64-bit operands)
- Used to load and store data of the corresponding sizes --- i.e., char, short, int, long int (in both signed and unsigned versions)
- Machine-Level Programming II: Control
- Control: Condition codes
- Processor State (x86-64, Partial)

- Condition Codes (Implicit Setting)
- Single bit registers
- CF Carry Flag (for unsigned)
- SF Sign Flag (for signed)
- ZF Zero Flag
- OF Overflow Flag (for signed)
- An “implicit” side effect of (most) arithmetic or logical operations
- Example: addq Src,Dest ↔ t = a+b
- CF set if carry out from most significant bit (unsigned overflow)
- ZF set if t == 0
- SF set if t < 0 (as signed)
- OF set if two’s-complement (signed) overflow
(a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0)
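The flag rules for an add can be written out in C. This is a sketch of the definitions above, not of how hardware computes them; the struct and function names are invented, and the sum is formed on unsigned values so that wraparound is well defined.

```c
#include <stdint.h>

struct add_flags { int cf, zf, sf, of; };

/* Flags that a 64-bit add would produce for t = a + b. */
struct add_flags flags_after_add(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t ut = ua + ub;                 /* t as a bit pattern */
    int t_negative = (ut >> 63) != 0;
    struct add_flags f;
    f.cf = ut < ua;                        /* carry out of the top bit */
    f.zf = ut == 0;
    f.sf = t_negative;
    f.of = (a >= 0 && b >= 0 && t_negative) ||
           (a < 0 && b < 0 && !t_negative);
    return f;
}
```

Adding 1 to the largest signed value sets SF and OF but not CF; adding 1 to -1 sets CF and ZF but not OF.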
- Not set by leaq instruction
- Condition Codes (Explicit Settings)
- Explicit Setting by Compare Instruction
- cmpq Src2, Src1
- cmpq b,a like computing a-b without saving difference in any destination
- CF set if carry out from most significant bit (used for unsigned comparisons)
- SF set if (a-b) < 0 (as signed)
- ZF set if a == b
- OF set if two’s-complement (signed) overflow
(a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
- Explicit Setting by Test Instruction
- testq Src2, Src1
- testq b,a like computing a&b without setting destination
- Sets condition codes based on value of Src1 & Src2
- Useful to have one of the operands be a mask
- ZF set when a&b == 0
- SF set when a&b < 0
- Reading Condition Codes
- SetX Instructions
- Set low-order byte of destination to 0 or 1 based on combinations of condition codes
- Does not alter remaining 7 bytes


- Destination is one of the addressable byte registers
- Typically use movzbl to finish job
- 32-bit instructions also set upper 32 bits to 0

- Conditional branches
- jX instructions
- Jump to different part of code depending on condition codes

- Example (Old Style)

- Expressing with Goto Code
- C allows goto statement
- Jump to position designated by label

- General Conditional Expression Translation (Using Branches)
- C Code
- val = Test ? Then_Expr : Else_Expr;
- val = x > y ? x-y : y-x;
- Goto Version

- Create separate code regions for then & else expressions
- Execute appropriate one
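The goto translation of `val = x > y ? x-y : y-x;` can be sketched as below. The label and variable names are illustrative; the structure (negated test, jump to the else region, fall through to a common exit) mirrors what branch-based machine code does.

```c
/* val = x > y ? x-y : y-x; rewritten in goto form. */
long absdiff_goto(long x, long y)
{
    long val;
    int ntest = !(x > y);      /* negated test, as a conditional jump would use */
    if (ntest)
        goto Else;
    val = x - y;               /* then-expression */
    goto Done;
Else:
    val = y - x;               /* else-expression */
Done:
    return val;
}
```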
- Visualizing pipeline behavior of CPU


- Loops
- Switch Statements
- Machine-Level Programming III: Procedures (aka Functions)
- Mechanisms in Functions
- Passing control
- To beginning of function code
- Back to return point
- Passing data
- function arguments
- Return value
- Memory management
- Allocate during function execution
- Deallocate upon return
- Mechanisms all implemented with special machine instructions, and a set of conventions
- x86-64 implementation of a function uses only those mechanisms required
- Every running program has its own address space
- A key component of that address space is the stack
- Functions
- Stack Structure
- Calling Conventions
- Passing control
- Passing data
- Managing local data
- Illustration of Recursion
- Machine-Level Programming IV: Data (Arrays, Structures, Alignment)
- Arrays
- One-dimensional
- A collection of objects of the same type stored contiguously in memory under one name
- May be any type of object
- May be objects of the same class (C++)
- May even be collection of arrays of the same types!
- For ease of access to any member of array
- For passing to functions as a collection
- Array Allocation
- Basic Principle
T A[L];
- Array of data type T and length L
- Contiguously allocated region of L * sizeof(T) bytes in memory

- Array Access
- Basic Principle
T A[L];
- Array of data type T and length L
- Identifier A can be used as a pointer to array element 0: Type T*
- Example
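A sketch of the allocation and access rules with T = int and a hypothetical length of 5. The macro and function names are invented for illustration.

```c
#include <stddef.h>

#define LEN 5                      /* hypothetical length L */

/* Total bytes occupied by int A[LEN]: L * sizeof(T). */
size_t array_bytes(void)
{
    int A[LEN];
    return sizeof A;
}

/* A[i] and *(A + i) name the same element: A acts as a pointer to element 0. */
int same_element(const int *A, size_t i)
{
    return &A[i] == A + i;
}

/* Byte offset of element i from the start of the array: i * sizeof(T). */
size_t element_offset(size_t i)
{
    return i * sizeof(int);
}
```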

- Array Accessing Example

- Array Loop Example

- Multi-dimensional (nested)
- Declaration
T A[R][C];
- 2D array of data type T
- R rows, C columns
- Type T element requires K bytes
- Array Size
- R * C * K bytes
- Arrangement
- Row-Major Ordering


- Example

- Nested Array Row Access
- Row Vectors
- A[i] is array of C elements
- Each element of type T requires K bytes
- Starting address A + i * (C * K)
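The row-major address formulas can be checked against what the compiler actually does. This sketch assumes T = int with hypothetical dimensions R = 4 and C = 5; the macro and function names are invented.

```c
#include <stddef.h>

#define ROWS 4                     /* hypothetical R */
#define COLS 5                     /* hypothetical C */

/* Byte offset of row i from the start of int A[ROWS][COLS]: i * (C * K). */
size_t row_offset(size_t i)
{
    return i * (COLS * sizeof(int));
}

/* Byte offset of element A[i][j] under row-major ordering: (i * C + j) * K. */
size_t elem_offset(size_t i, size_t j)
{
    return (i * COLS + j) * sizeof(int);
}
```

Comparing these offsets with the addresses of `A[i]` and `&A[i][j]` for a real `int A[4][5]` confirms that rows are laid out one after another.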
- Concurrency model in assembly?
- Multi-level
- Structures/Unions
- Allocation
- Access
- Alignment
- Floating Point