CS 2303: System Programming Concepts

Virtual machines
- An emulation of a computer system
  - Running a computer OS (the guest) inside another computer’s OS (the host)
    - The host runs the VM software application
    - The VM software application emulates the guest computer’s hardware
    - The guest OS thinks it is running directly on the underlying hardware
Compiled / Interpreted / Just-In-Time compiled languages
- Compiled languages convert directly to machine code
- Interpreters run through a program line by line and execute each command
- JIT compilation improves the performance of interpreted programs
  - While the interpreted program is being run, the JIT compiler determines the most frequently used code and compiles it to machine code
Programming language paradigms
- Imperative
  - Program is often a sequence of function calls
    - Functions accept data (not consume it), can modify data, can return results
  - Variables hold data - Which can persist
- Functional
  - Feed data into a function, feed result into another function
    - Functions consume data, functions produce new data, might have side-effects
- Object-Oriented
  - Objects have data and behavior: invoke object to do something to itself
- Scripting
  - Easy to throw together a program
C language
- Designed for systems programming
  - OS, compilers, utilities, assemblers, text editors, databases, network drivers,…
- Was invented to write the OS UNIX
- Features
  - Runs fast
  - Mix of high-level and low-level capabilities
    - Sometimes called a mid-level language
  - Good for interacting with hardware
- Data types determine how much space the variable occupies in storage and how the bit pattern stored is interpreted
  - Fundamental types
    - char
    - int
    - float
    - double
    - long
    - short
      - short <= int <= long
  - The type void
    - function return
    - function argument
    - pointers of type void * represents the address of an object, but not its type, can be casted to any data type
  - Reference types
    - pointers
    - arrays
    - structures
    - unions
    - functions
  - Control-flow constructions
    - if-else, switch, while, for, do, break
Purposes of systems programming
- Systems programs
  - Programs to make computers work efficiently
  - Tools for building and running applications
  - Building other programs and applications
- Applications
  - Programs to make people / businesses productive
  - Programs to do useful work using the computer
  - Acquiring, managing, analyzing information
Priorities
- Systems programs
  - Speed of operation — programs need to run fast to be productive
  - Compactness — minimize resources to keep costs low
  - Flexibility in use — get stuff done
  - Manipulation of low-level resources — make the computer useful
- Application programs
  - Speed of development — create a solution to an existing problem
  - Ease of modification — keep up with evolving needs / expectations
  - Suitability for the problem domain — be intuitive to program users
Language evolution
- Machine language
- Assembly
- High-level languages
C++ vs Java or Python
- C/C++ run-time vs. Java or Python
  - C/C++ are compiled languages, Java JIT, Python interpreted
    - Better low-level control & performance
- Developer responsible for ‘housekeeping’ such as memory management
  - Less run-time checking
    - Minimal run-time checking for C
- Newer releases of C++
  - Strive to address many run-time issues
    - Sometimes at cost of performance
      - Some run-time errors so common it makes sense to detect
  - Compiler technology has advanced significantly
    - More able to determine possible run-time errors during compilation
File & directory permissions

Linux permissions

Filename conventions
- code files (source code)
  - source.c common for C language
  - header.h for C language header (include) files
  - source.cpp or source.cxx common for C++ language
  - header.hpp or hxx for C++ header (include) files
- object files (output from compiler)
  - source.o same name as source file with different extension
- library files
  - library.a in Unix/Linix
  - library.lib in Microsoft Windows
- dynamically-linked library
  - libary.so in Unix/Linux
  - libary.dll in Microsoft Windows
- executable files─ filename (no extension) in Unix/Linux
  - filename.exe in Microsoft Windows
Editing, compiling & linking

Linking and compiling

The preprocessor provides the ability for the inclusion of header files, macro expansions, conditional compilation, and line control
Compiler: gcc (g++)
Linker: ld
Best practice: in C or C++ programs, keep all the constants, macros, system wide global variables, and function prototypes in the header files and include that header file wherever it is required
#include <calcGrades.h> tells the preprocessor to search for the header file in the system include directories. These directories are typically set during the installation of the compiler and include the standard C library headers. This is used for system-wide libraries and headers that come with the compiler.
- Most are in /usr/include
#include “calcGrades.h” tells the preprocessor to search for the header file in the current directory and then in the directories specified by the -I option of the compiler. This is used for headers that are specific to the project and not a part of the system.
- Put in sibling dir to source files ../include
Best practice: Use double quotes for user-defined headers and angle brackets for system headers, this way it’s easier to see where the headers are coming from.
Compiling with GCC
- gcc knows filetypes by extension
  - Extensions helpful for context/tools even though not required by Linux
- Simplest:
- gcc file.c
- Produces executable called a.out
- Override default output file name:
  - gcc file.c —o file
  - e.g. gcc hello.c —o hello
- Both invoke the compiler and then the linker.
- Just compile to object:
- gcc -c file.c
  - Produces object file called file.o
- Link multiple files:
  - ld [opts] files
  - ld —o newhello file1.o file2.o
    - Also: ld hello.o foo.o —o newhello
    - Without —o, executable would be in a.out .
  - gcc file1.o file2.o —o file
- How to use?
  - For simple files:
    - just override the default output file name.
  - For multiple files:
    - First, compile each source file with —c
    - Then, link all object files with overridden executable file name.
- This can be automated with the make utility.
Primitive data types
- int, bool, float, void, char
- Match what the hardware supports
- A bit is the smallest unit storage, stores a 0 or 1
- Group of 8 bits is 1 byte
- 1 bit holds 1 or 0 → binary representation → 0 to 255 maps to ASCII → chars
- Size: short <= int <= long < long long
Symbolic constants
- Textual substitution by the preprocessor
- Different from typed constants, which are handled during the compilation phase
Storage classes
- Defines the scope and lifetime of variables and/or functions
- auto, register, static, extern
- auto
  - Default storage class for all local variables
- register - Define local variables that should be stored in a register instead of RAM

Storage classes

Variables MIGHT be stored in a register depending on hardware and implementation restrictions
static
- Keep a local variable in existence during the lifetime of the program
- Global static variables’ scope is restricted to the file in which it is declared
extern
- Give a reference of a global variable that is visible to all program files
- Used when there are two or more files sharing the same global variables or functions
- Define a global variable or function in one file and use extern in other files
Binary operators
- Operates on bit strings
- A = 60 (0011 1100 binary) and B = 13 (0000 1101 binary) → a bit holds a
- &
  - Binary AND copies a bit to the result if it exists in both operands
  - A & B = 12 (0000 1100)
- |
  - Binary OR copies a bit if it exists in either operand
  - A | B = 61 (0011 1101)
- ^
  - Binary XOR (exclusive OR) copies the bit if it is set in one operand but not both
  - A ^ B = 49 (0011 0001)
- ~
  - Binary One’s Complement flip bits
  - ~A = ~(60) = (1100 0011)
- <<
  - Binary Left Shift moves the left operand value by the number of bits specified by the right operand
  - A << 2 = 240 (1111 0000)
- >>
  - Binary Right Shift
C program input and output
- 3 I/O streams are provided by the OS to any program
  - Stream is a sequence of bytes
  - stdin - Standard Input Stream from terminal by default
  - stdout - Standard Output Stream defaults to terminal
  - stderr - Standard Error Stream defaults to terminal
Header files
- Maintain consistency
  - All src files have same information
- Reduce effort and duplication
- Increase flexibility
- Increase code clarity
- What to put
  - Symbolic constants
  - Function prototypes
  - Data structure definitions
  - External variable definitions
  - Guards
  - NOT executable code
    - Macros border on exceptions
- Using header files
  - In the source file (.c, .cpp, .cxx)
    - Use #include directive
  - Compiler (preprocessor) inserts the contents of the specified file
    - At that point of the source file
    - Textual insertion of the entire #included file
Guards in header files
- Guard against multiple inclusions of same header file
  - Multiple inclusions would result in multiple definitions of same information
  - We use ‘guards’ to prevent more than one inclusion to a given source file
- Works by defining a symbol first time the header is included
  - Definition acts as a guard (flag) to know if the file has already been included
- To derive the symbol:
  - Convert to file name ALL CAPS.
  - Replace dot with underscore.
make utility
- Build executable and library files from multiple sources
- Multiple files go into each program
  - Modularity, Organization, Reuse other programs, Multiple Developers, etc.
- When compiled, each .c, .cxx file produces one .o file.
  - Each C/C++ source compiles to one object file
  - Typically, object file has same base name as source file (can be renamed)
- Multiple .o files can be linked to form one executable of library file
- Source files can include other files
  - e.g., Header files (.h, .hxx, .hpp) with declarations and prototypes
  - Possible to include other source files, but frowned upon
- Does things based upon dependencies
  - Usually compiling, linking, copying or moving…
- make tries to do the minimum needed
  - Ensures everything is up-to-date with minimal work
- Each project needs a makefile
  - Named makefile by default - sometimes Makefile
- Source files depend on header files
  - make utility uses timestamp of files to see what’s newer and determine if something needs rebuilding
- Each section of the makefile (aka rule) specifies
  - Target (the file to be generated)
  - Dependencies (files the target depends on)
  - Commands (what to do if dependency has changed)
- Each section starts at the left-most column
  - Each line in the rest of the section is indented with a tab
- First section in makefile is the default “rule”
  - Used if you run make with no parameters
    - The make utility looks to default section to figure out what rules to perform
- makefile formatting - General format for each section / rule

Makefile

Commands indented with a tab character
The programmer’s responsibility to specify dependencies

Makefile example

The programmer’s responsibility to specify dependencies
Doxygen
- Must have config file Doxyfile in project directory
- A software program (tool) that will generate documentation from annotated C/C++ sources (as well as several other programming languages.) Doxygen program will scan source files looking for keywords and analyze program structure to:
- Generate on-line documentation or off-line reference documentation. Doxygen can produce output documentation in a number of different protocols including, but not limited to, html, LaTex, RTF, PstScript, PDF, and Unix/Linux man pages.
- Analyze and extract the code structure for undocumented source files.
- Analyze keywords and structure to generate normal program documentation such as user manuals.
- Accessing Doxygen HTML
  - Must be somewhere under your public_html directory
  - All files must be publicly readable
  - All directories above must be publicly executable
Casting
- Instruct the compiler to do a conversion
  - Convert data from one type to another type for an assignment
- Instruct the compiler to interpret data differently
  - How to interpret the memory location
  - How to interpret the bits in memory
- Some type of conversion (implicit) is done automatically - especially promotion
  - No loss of information
  - float → double
  - char → int
  - int → long
  - int → double
- Explicit type conversion can lose data due to truncation
  - int i = (int) 1.5;
- Weaknesses of C conversion
  - Multiple meanings in different contexts.
    - Convert the data
    - Look at data differently
  - No way to specify conversion algorithm
    - Possible to write a function that does the conversion
  - Non-portable across platforms
    - Data storage sizes and representations can be different
  - Note: C++ adds more advanced casting
typedef
- Allows creation of “aliases” for types
  - Greater readability
  - Greater portability
    - Since primitive types can be implementation-dependent
    - i.e., can have conditional compilation to create int_16 always 2 bytes
- Dependence
  - Sometimes header file declarations depend on architecture
  - Some declarations must agree with underlying system / architecture
- Abstraction
  - Often we want the data storage to be consistent between platforms
    - Some declarations want to be independent of underlying system / architecture
- Syntax
  - Typedef existing_type new_type;
- New type just like any other type to compiler
  - Use it as such
- Examples
  - typedef unsigned short uint_16;
  - ─ typedef unsigned long uint_32;
  - ─ Typedef unsigned long size_t;
- Will often be done using conditional compilation in header file so alias based on appropriate primitive type
- Can give a name to user defined data types (unions, structs)
- typedef vs #define
  - typedef is limited to giving symbolic names to types only where as #define can be used to define alias for values as well, q., you can define 1 as ONE etc.
  - typedef interpretation is performed by the compiler whereas #define statements are processed by the pre-processor.
Enums - User-defined data type
- Symbolic names for integral constants
  - Limited set of values for enumerated type
- Significantly improves readability
  - Good code style / hygiene
- Provides type safety
  - Constrains valid ranges of integral values
- Declaration
  - enum enum_name {val1, val2, valN};
- Instantiation
  - enum enum_name myEnumVar;
- Best practice: In general an enum is going to be used as a type definition and should always be in the header file
Array - sequence of data in contiguous memory locations
- An array object is a linear collection (sequence) of storage locations for elements of the same type
- C array is NOT an object
  - Only data, no behavior
- Defining an array
  - Address of a[i] = &array_start + index_offset * size_data_type
- Declaring arrays - For global scope - Arrays size must be constant known at compile time - So compilers know how much memory must be allocated at runtime - Must use constant literal or const variable for the size - For local scope (function, block) - Arrays can be of variable size - Size determined at runtime when they come into scope

Array in memory

Dynamic allocation of global arrays is possible using malloc()
An array name is a pointer to its first element
Accessing array element
- The address is calculated
  - Address of a[i] = (address of start of array) + ((size of an element) * index)
    - Cost
      - 4 memory accesses
      - 1 multiplication
      - 1 addition
- The value is read and/or written
Two-dimensional arrays
- Similar to Python
Passing arrays as function arguments
- Pointer as a formal parameter
  - void myFn(int *param)
- Sized array as a formal parameter
  - void myFn(int param[10])
- Unsized array as formal parameter
  - void myFn(int param[])
Return array from function
- C does not allow returning an entire array, instead return a pointer to an array
- The array’s name can be used in place of its pointer
- C does not allow returning the address of a local variable, so the returned array would have to be static
Pointers, memory iteration, sorting
- Pointer is a variable whose value is the address of another variable - The address of a location in memory that stores a known data type

Pointer

A pointer is also a group of bytes that can hold an address
A pointer p points to a char c
Thus, the variable is a pointer to an object (as in C, not OOP) of some type
Pointer declaration
- Must declare the type of object variable will point to
- Must declare name of variable that will be a pointer
- Must declare the variable is a pointer
- Declared with ”*” operator
  - type * pointer
Memory address is obtained with the & operator
- & operator returns the memory address of some object

Memory address

De-referenced with ”*” operator

Pointer

Dereferencing is following the pointer indirection
- Follow the pointer to the memory address it references
Best practice: assign a NULL value to a pointer variable in case there is no exact address
- Programs are not permitted to access memory at address 0 (NULL value)
- By convention, a NULL pointers are assumed to point to nothing
- Check for null for pointer using if
Usage
- Beware uninitialized pointers - Don’t know where they’re pointing
- Pointers used incorrectly corrupt run-time environment
- High-level languages don’t provide pointers
Increment pointer to move to the next memory address
- Number of bytes moved is based on type
Decrement pointer to move to previous memory address
- Number of bytes moved is based on type
Why do you want to do it?
- Efficiency — fast to sequence through arrays in memory
- Expressiveness — communicates code intent to reader
Do you want to increment/decrement the pointer or object pointed at?
Use parentheses to remove ambiguity
- Increment object referenced: (*x)++; ++(*x);
- Increment pointer before or after dereference: *(x++); *(++x);
Legal and illegal pointers
- Safe To Point At
  - To static data — it won’t go away
  - To memory (variables) on the heap
    - But not after they are deallocated
  - To local variables and parameters
    - But only in same scope
- Unsafe To Point At
  - To memory (variables) on the heap after the memory has been freed
  - To local variables and parameters after exit from the function or block
    - After their lifetime ends
Casting C-style pointers
- All pointers are variables that store addresses to a type in memory
- Compiler needs to know the type pointed at so it can:
  - Check for type mismatch in operations using pointer dereferences
  - Find data fields and functions in structured (non-primitive) objects
  - Stay aligned as pointers move through arrays
  - Calculate offsets for array elements
  - Do pointer arithmetic appropriately
  - Interpret bits pointed at correctly
- Programmer responsible for casting pointer to right type - If you want to interpret bits in memory in different way - NOTE: this will not change the way bits exist in memory

Pointer casting

Iterating arrays - iterating with index vs with pointers

Iterating through array

Program (compiler) does not need to address of array index each access
- Pointer of matching data type can be incremented / decremented to iterate array
Programmer must ensure pointer doesn’t run past array bounds
The name of an array is automatically a pointer, does not need to declare another pointer variable
- To the zeroth element of the array - start of the array in memory

Array indexing

The latter is slightly more efficient
- Compiler doesn’t need to calculate array offset for each access
Memory segment different uses in system

Process memory layout

Code segment
- Machine instructions generated from your source code
- Loaded at execution time
- Typically write-protected
  - Read access but not write access
- Shared by multiple processes & threads
  - Shared libraries
  - Shared functions
  - Complete programs
- Fixed size
  - Compiler/linker knows how big
Data segment
- Global & static scope variables declared in program
- Loaded at start of execution
- Can’t be write protected
  - Programs need to write to variables
- Every process has its own copy
  - Don’t want it changed by other programs
- Fixed size
  - Compiler/linker knows how big
Heap segment
- Dynamic memory allocations served from the heap
- Assigned by OS at start of execution
- Every process has its own
- Chunks allocated and released at run time
  - Allocated using malloc()
  - Deallocated using free()
  - Best practice: Set values of freed pointers to NULL immediately after free to avoid undefined behavior when accessing those pointers after freeing
- Dynamic size - Can grow & shrink
  - Compiler/link cannot know how big
- C/C++ objects
  - Allocated from a heap at run time
    - When code creates object using C++ new
Stack segment
- Automatic (local) variables grow and shrink like stack data struct
- Sometimes multiple stack segments runtime environment
- Assigned by OS at start of execution.
- Each process has its own
- Works just like a stack data structure
- Holds automatic variables & housekeeping:
  - Function return address & values
  - Parameters passed to functions
  - Local (non-static) block variables
    - Functions, Code blocks
- Dynamic size
- Each entry into a function or block entry uses memory from stack
  - Pushed onto stack — grows to provide variable memory
  - Variable memory is not initialized unless you explicitly do it
- Each function exit (return) or block exit releases stack memory
  - Stack pops and stack decreases in size. In reverse order to usage.
  - Stack pushed when entering function / block, Popped upon exiting
- Memory values not cleared when released from stack
- Stack memory values not cleared upon next function block entry - Uninitialized local / automatic variables will have leftover value - garbage

Used and freed memory

Accessing variables
- When a local variable is accessed
  - The address is calculated
  - The value is read and/or written
- For single variables this is fairly simple
  - Address = Start of stack frame + constant_offset
- If you are dereferencing a pointer, you already have the address
Memory alignment

Memory alignment

Hardware often requires data and/or instructions to be aligned
- Located at an addresses having certain limitations
Might be absolutely required or just preferred for efficiency
Commands & Data moved through computer on ‘bus’
- These have number of bits in width
- Think of lanes of a highway
Aligning data with bus characteristics greatly improve efficiency of data moves
Overall performance of computer significantly increased
Low-level memory allocation
- As C system programmers we can allocate memory for our needs
  - As no automatic internal structure exists for raw (low-level) chunks of memory from the heap segment
- Functions to allocate/deallocate memory - malloc(), calloc(), realloc(), free()

Memory allocation functions

void *malloc(size_t size)
- Cast pointer returned to appropriate type
- Returns pointer to block of memory big enough to hold size bytes.
  - size_t is a platform-dependent type guaranteed large enough to address anything
  - Usually typedef’d to an unsigned integer but not always
    - Big enough to hold the largest possible address or memory size.
- Allocated memory is aligned to handle any type of primitive data
- Returns NULL if error occurs
- Memory is not cleared so it might have garbage values
void *calloc(size_t nmeb, size_t size)
- Returns ptr to memory big enough to hold nmemb elements of size bytes each.
- Enough space so nmemb elements can be properly aligned
  - The order of parameters are not interchangeable
  - First is number of elements, second is size (in bytes) of a single element
- Entire chunk of memory is initialized to zero values before return
- Returns NULL if error occurs
void *realloc(void *ptr, size_t size)
- Can only be called for previously allocated memory
  - Memory allocated by malloc(), calloc(), or previous realloc()
- Changes memory block to new size
  
  Will move to new address to different location in heap if necessary
Copies any existing data to the new location
Any additional memory added to the new chunk is not cleared
NULL ptr if memory cannot be reallocated
- Existing memory is NOT freed or changed if reallocation fails
void free(void *ptr)
- Releases memory back to heap that was previously allocated
  - Memory allocated using malloc(), calloc() or realloc()
- You can only free memory pointed to by pointer ptr if it:
  - Points to memory that was dynamically allocated from the heap
  - Points to the start of the block that was dynamically allocated
- No value or error results are returned
Parameters passing
- Parameters are passed from calling code to function by pushing copies onto the stack segment of memory
- In C, all parameters are passed by value
  - The calling code passes actual parameters
  - The receiving code gets formal parameters
- How pass by value works:
  - Each actual parameter is evaluated, and a copy is made
  - Each formal parameter is bound to the copy just made
  - The function does its thing using the copy and other automatic variables
  - Any return value is made available to calling code
    - Calling code responsible for storing a copy of the return value
    - If return value is not saved by calling code, the value goes out of scope
  - Formal parameters & return value go out of scope and memory reused
sizeof()
- Determine the storage size of things
- Size is computed at compile time
- Enhaces portability
  - Different systems have different type sizes and different alignment
  - Programmer doesn’t need to guess what system code will compile / run on
- Avoids magic numbers
  - Can determine size at compile time for any system
  - Programmer doesn’t need to hard code values with conditional compilation
- Context-aware for arrays
  - If the size of array is known at compile time
    - Returns the size of the entire array
  - Otherwise, returns the size of the pointer to the first element
C-Strings
- C-Strings are just arrays of char

C string representation in memory

The ‘\0’ at the end is the null terminator - flags the end of a string
Each element is sizeof(char) - 1 byte
Each element is the ASCII value of bits representing the character
One character is enclosed in single-quotes
- To specify ASCII value of character
- char val1 = 'A';
  - Occupies 1 byte // no null termination
- int val2 = ‘\n’
  - Occupies 4 bytes with ASCII value of newline (10) in LSB
Null-terminated string is enclosed in double-quotes
- To specify ASCII values with binary zero appended
- It is const static data type
- char hello_string[] = Hello;
  - Occupies 6 bytes — 5 chars + null terminator
Non-printable characters
- ‘\n’ = New Line (actually ASCII Line Feed)
- ‘\r’ = Carriage Return (usually not needed — unless Windows)
- ‘\t’ = Tab

Strings in C

stdin, stdout, stderr

stdin, stdout, stderr

Structures
- A C-style struct is a way to group related data of different types together
  - In a user-defined data type - kind of like a C++ object without behavior
- Declare in one place in your code
  - Use typedef to create an alias for clarity of use
  - Give the definition a name for reuse - most common.
- Accessing struct fields
  - Dot notation
- Structures as function arguments
  - Pass struct as function arguments/use struct as formal parameters/return values same way as variable/pointer
- Pointers to structures
  - Define pointers to structures the same way as defining pointers to any other variable
  - Access a structure’s members using a pointer to that structure
    - struct_ptr->member
- Bit fields
  - Allow the automatic packing of data in a structure as compactly as possible
  - Useful when memory is expensive
  - Examples
    - Packing several objects into a machine word
      - 1 bit flags can be compacted
    - Reading external file formats
  - Put :<bit width> after the variable
  - E.g: use unsigned int age : 3; to save variable of range [0, 2^3)
Unions
- Allows storage of different data types (members) in the same memory location
- Only one member can contain a value at any time
- The memory occupied by a union will be large enough to hold the largest member of the union
- Acess union members using dot notation
Hashing
- Hash tables/hash maps

Hashing

Key gets hashed, i.e. converted to an index
Hash function is fast operation to create hash value
- Very hard to get back to original key
- Often used in crypto
Maps the key value to fixed size and range

Hash function qualities

Collisions will happen and must be handled
- List list of student record for each hash bucket
- Move colliding student record to first free bucket
- Rehash with some deterministic change
Hash function
- Must turn the key into an index
  - Essence: Turn larger number of bits to smaller number of bits
- Must be efficient
- Indexes must be well scattered
  - Low chance of collision
- Algorithm choice is very dependent on type and values of key.\
- Good for random numeric data:
  - Modulo
  - Multiply by a constant, add another constant.
- Good for character strings or long data:
  - Sum of bytes
  - Shift and add, bit twiddling
  - Weighted sum of bytes
  - Polynomial
- How to deal with upper / lower case on strings
Data structures
- Object-oriented languages (C++ Language)
  - Each stack, queue, list, tree, etc. is a separate object.
  - Fields: Stack data area, size, pointer.
  - Methods: push(), pop(), etc.
  - Methods operate upon this stack.
- Non-Object-oriented languages (C Language)
  - Data structure in its own (header) file
  - Series of functions: create(), push(), pop(), etc. in C source file
  - Need identifier if more than one instance of data structure
  - Must pass it in as a parameter.
  - Store the state of the data structure in a unique struct instance
System commands - fork and exec and child processes

system() library function

Shell: a program that reads commands and runs other programs
Invoke the OS to execute specified system command
- As if you had typed command at the terminal command prompt
If the command is NULL
- Then a nonzero value if a shell is available, or zero if no shell is available
If a child process couldn’t be created, or its status not retrieved
- The return value is -1 and errno is set to indicate the error
If a shell couldn’t be executed in the child process,
- Then the return value is as though the child shell terminated by calling _exit(2)
If all system calls succeed
- Then the return value is the termination status of the child shell used to execute command
Forking - duplicate current task
- OS’s handle spawning new tasks in different ways
- Unix pioneered the approach of forking
  - The current process is duplicated to make a full copy
  - Everything is duplicated
    - All data is duplicated — including pointers
    - Open files and devices are duplicated
- One process calls fork(), two processes return! - Parent process & Child process
The C++ language
- Visibility - Namespace - a declarative region (scope) that defines the boundary of - names (types, functions, variables, etc.) inside the namespace - Namespaces organize code into logical areas - Prevents name collisions that can occur on large systems or multiple libraries - As C++ developers we control the scope of names - Not just files or blocks of structured code - We can define a logical region in which the name exists - C++ standard library has the namespace of std - std::string mystring = “C++ is great!”;

C++ namespace

What is the difference between using namespace std and pulling the string.h header file?
- A header file is a file that is intended to be included by source files. They typically contain declarations of certain classes and functions.
- A namespace enables code to categorize identifiers. That is, classes, functions, etc

C++ methods and class scope

Best practice: right-hand side
Classes

C++ classes

Constructor (ctor) invoked when class instantiated
- Before instantiating code has reference to created object
- May involve memory allocation and init of objects
Destructor (dtor) invoked when class destroyed
- When object goes out of scope and resources returned to system
- May involve cleanup and deallocation of memory for objects
Setters / Getters
- Support data abstraction
- Avoid direct access to data members
Copy constructor
- Initializes an object from another instance of same type by copying the member values from an object of the same type

const keyword

Inheritance
- C++ supports multiple inheritance
- Must qualify which superclass when dealing with diamond inheritance problems
Polymorphism

C polymorphism

C++ collections
- Templates
  - Parameterized classes and functions
  - Allow the class, algorithm or function to be applied generally
    - Implemented independently from the data type
- Standard Template Library
  - Generalized library of parameterized components
  - Contains
    - Containers
      - Sequential Containers: can be accessed in a sequential manner
        
        Vector, List, Deque, Arrays, Forward_list
      - Container Adaptors: modify the interface for sequential containers
        
        Queue, Priority_queue, Stack
      - Associative Containers: sorted data structures for quick searching
        
        Set, Multiset, Map, Multimap
      - Unordered Associative Containers: unsorted with quick access
        
        Unordered_set, Unordered_multiset, Unordered_map, Unordered_multimap
      - Iterators — access to all elements in a collection
      - Algorithms — a process to determine something
      - Functions — a relationship between parameters and results
- Sequential containers
  - Storage which allows sequential access to members
  - array: Static contiguous array
    - Improvement over the C-Style array
  - vector: Dynamic contiguous array
    - Sequential storage that can grow and shrink as needed
  - deque: Double-ended queue
    - Better insertion & deletion performance than vectors
  - forward_list: Singly-linked list
    - Very high performance with low overhead
  - list: Doubly-linked list
    - Slower traversal than vectors
- Iterators - An object that points to an element inside the container - Allows users to ‘iterate’ over the contents of the container - Iterator moves from beginning (front) to end (back) - Reverse Iterator move from end (back) to beginning (front) - Able to access element at current position by dereferencing iterator - Same syntax as dereferencing a pointer - Connect algorithm with containers - Allow algorithms access to elements - Without need to write code logic specific to the type of data in the container - Iterator: ++ moves forward collection, — moves backward - Reverse Iterator: ++ moves backward in collection, — forward - Some logic difference at the beginning or end of the collection - 5 types - Random Access Iterators: - Access elements at random position in collection — just like pointers - Bidirectional Iterators: - Can only move sequentially in collection — both forward and backward - Forward Iterators: - Can only move sequentially forward in collection — not backward - Output Iterators: - Can write current position but not read - Input Iterators: - Can read current position but not write

Iterator

C++ References
- C++ support references to variables
  - Primitive data types or user-defined data types
  - C does not support references
- C++ references are like Python or Java object references
  - Python and Java do not support pointers
- Both are variables storing addresses of other variables
  - Pointers store address of which is dereferenced using *
  - References are alias to address which do not need to be dereferenced
    - The compiler does so implicitly
- Pointer
  - Stores memory address
  - Programmer must dereference
  - Pointer can be reassigned
    - Point to different location
  - Has its own memory address
  - Can be assigned null value
  - Can be dereferenced
  - Supports arithmetic operations
    - Increment / Decrement
- Reference
  - Alias for memory address
  - Compiler implicitly dereferences
  - Reference cannot be reassigned
    - Bound when comes into scope
  - Alias to existing memory address
  - Cannot be assigned null value
  - Cannot be dereferenced
  - Doesn’t support arithmetic operations