Skip to content

CS 2103: Accelerated Object Oriented Design Concepts

  • 10/24/22
    • Event-driven programming
      • Control flow is variable, it depends on user input
      • It is desirable to have the screen update at a regular rate (60Hz)
    • Software testing
      • Tests suite: a group of tests
      • Module (Code) → Test suite → Report
      • Black-box testing: implementation unknown, only examine I/O behavior
      • White-box testing: implementation known
  • 10/25/22
    • Java
      • Compiled, “mid-level”
      • Native language of a CPU is its assembly language (e,g: different for Intel, ARM, Motorola,…)
      • .java → .class (bytecode, native language of the Java VM) → Java VM → CPU
      • Automatic garbage collection
      • Secure
        • Type checking
        • Array-bounds checking
      • Slower
        • Runs on a VM
        • Run-time security features
    • Elements of Good Programming Style
      • Refactoring
        • Inheritance
        • Ownership: Instantiate an object of the helper class inside other classes and use that object’s methods
  • 10/27/22
    • Overloading: Multiple “versions” of the same method taking in different types and/or number of parameters
    • Abstract class: a class that cannot be instantiated; only its subclasses can be instantiated
    • Java is single-inheritance
    • Ownership is by design, better than abstraction if targeted classes have no relationship when refactoring
  • 10/28/22
    • Coding style
      • Java class name: mixed-case (ImageAnalyzer)
      • Instance variables, local variables, method parameters, and instance methods: camel-case
      • Instance variables: _ before name
      • Constants: declare static final, all caps with underscores
      • White-spaces and braces: consistent
    • Access modifiers
      • private > (default) > protected > public
      • private: only methods within the same class can access, even subclasses cannot
      • default: package-private variables/methods/classes can be accessed by every class within the same package
      • protected: can bee accessed from classes within the same package and by subclasses
      • public: can be accessed from any clas
    • Interfaces
      • A collection of methods signatures, and descriptions of what they do, but no function bodies
      • Interfaces are more flexible in terms of design than classes, could group more loosely related methods
      • Name often ends with -able
      • Methods with no bodies are called abstract
      • Implementing an interface: create a body for every method in the interface
      • Java is multiple-implementation
      • Once an interface has been implemented, you can:
        • Declare a variable of the interface type
        • Declare a parameter of the interface type
        • Return a variable of the interface type
        • Declare an array of variables of the interface type
        • Run-time vs declared data types?
      • Interfaces cannot be instantiated, because they do not have function bodies to execute
    • Subinterfaces
      • Similar to inheritance, the subinterface inherits all the methods from the parent interface
      • Can have multiple parent interfaces
    • Abstract classes
      • Abstract classes cannot be instantiated
      • The inherited subclass from the abstract class must implement all the abstract methods in the abstract class
      • Can implement the function bodies in the abstract class itself in concrete methods
      • Abstract classes are allowed to contained
        • Variables
        • Concrete methods
        • Abstract methods
  • 10/31/22
    • Interfaces as contracts
      • Interfaces facilitate the division of labor between members of a team
  • 11/1/22
    • Common practice: Have an abstract class implement some of the methods in an interface
    • Type-safety and casting
      • Objects can have multiple types
      • Declaring a variable of a certain type != instantiating the object itself
        • DoubleDate ddata != DoubleDate ddta= new DoubleDate();
      • The name before a variable is the declared type
      • An object can have many related-typed variables pointing to it
      • Two kinds of type-checking
        • Compile-time (Java compiler)
        • Run-time (Java Virtual Machine)
      • The declared type dictates which methods can be called on the object
      • Casting is a “promise” to the compiler that variables are assigned to the correct more “specific” type
      • Upcast (from more specific to less specific)
      • Downcast (from less specific to more specific)
      • The run-time type of a variable depends on the constructor that was called to instantiate it
      • The run-time type dictates which implementation of a method is executed
      • At run-time, the JVM determines which method implementation based on the run-time type → process called dynamic dispatch
    • Collections
      • Arrays: convenient, efficient, fixed size for adding, changing, and accessing
        • More elements? Impossible
        • Quick find? Linear time
      • ArrayList: similar to Java arrays, but “grow” automatically
        • Implement an ArrayList
          • Under the hood, ArrayList consists of a regular Java array of objects that “grows” as necessary
          • isFull(), growArray(), add(), remove(), insert(),…
          • In practice, doubling the size of the new array is balanced in terms of speed and memory
  • 11/4/22
    • Type parameter: restrict the type in a Java collection
      • Specify the desired type as a type parameter in both the declaration and instantiation
      • Can also be an interface
    • Type-safe collections
      • An ArrayList<T> will only accept objects of type T or any subclass of T
    • Hash maps/hash tablespro
      • HashMap<K, V>
        • K: key type
        • V: value type
        • hashMap.get(key) → value associated with key
        • hashMap.put(key, value) → adds new entry
    • Graphs
      • Edges have weights associated with them to represent distance, cost, etc.
      • Representation in memory
        • Adjacency matrix: for the whole graph
          • N x N matrix, N # of nodes in graph
          • The (u, v)th slot is the weight of that connection
          • Fast way of assessing connections
        • Adjacency list: for every node
          • Every node maintains a list of other connected nodes
      • Two fundamental problems
        • Node discovery
          • Determine the set of reachable nodes from a starting node
          • 2 principal algorithms: BFS & DFS
        • Finding shortest path
    • Stacks
      • On modern computers, a program’s code and data are stored together in memory (Von Neumann architecture)
      • To keep track of which instruction in memory is being executed, the CPU maintains an Instruction Pointer (IP)
      • LIFO
      • “Return address” stack keeps track of where the IP returns to after finishing executing some function
    • Queues
      • FIFO
      • Used for temporary data storage
      • One use case: inter-process communication (IPC)
        • Cross-program messaging
        • Program B needs to receive the message sent from A in the same order it was sent
  • 11/8/22
    • BFS
      • Maintain some data structures
        • A collection of visited nodes from s
        • A queue of nodes have yet to visit
    • DFS
      • Just replace the queue with a stack
    • Finding shortest paths
      • BFS can be extended to give the shortest path between s and t
      • Whenever we add node n’ (neighbor) to nodes ToVisit, we keep track of which node it was visited from (using Map<Node, Node> gotHereFrom)
      • As soon as BFS discovers node t, the BFS terminates
      • Once t is reached, we backtrack using gotHereFrom
    • Abstract data types (ADTs)
      • ArrayList
        • Purpose: store a variable amount of data in a linear order
        • Operations: add, get, remove,…
        • Time costs
          • Adding is “fast” except when the internal array is doubled
          • The time needed to search for an element grows as the array becomes longer
      • Dictionary/Hashtable
        • Purpose: associate keys with values
        • Operations: put, get, remove,…
        • Time costs
          • All operations take about the same amount of time no matter how many elements
    • Algorithmic analysis
      • Instead of measuring time cost in terms of seconds, milliseconds, etc., we will count the number of elementary operations
      • CPUs implement the following operations and more
        • Arithmetic
        • Comparison
        • Branching
        • Assignment
        • Array access
      • When analyzing time cost, there are three cases
        • Best case
          • The best-case time cost is usually the least interesting since it’s very optimistic
        • Worst case
        • Average case
          • Usually very difficult to compute in practice
    • Big-”O” notation
      • Express the asymptotic time cost of an algorithm
      • Focuses on the dominant term in the function as n grows very large
      • Types
        • O(1) — constant
        • O(log n) — logarithmic
        • O(n) — linear
        • O(n log n) — loglinear
        • O(n^2) — quadratic
        • O(2^n) — exponential
        • O(n!) — factorial
  • 11/11/22
    • Problems with ArrayList
      • Forces all data to be stored in one contiguous array
    • Memory management in Java
      • Some primitive types: int, float, char, etc.
      • Some reference (object) types
      • ArrayList’s are of reference type
      • Primitive values store their values directly
      • Reference/object variables store the location of the object they point to
      • The object that name points to is stored in a separate block of memory
    • LinkedList
      • Break the list into “chain” of Node objects
      • Each node
        • Stores one piece of data the user wants to store
        • Contains a pointer to the next node in the chain, the last one points to a null object
      • Nodes do not need to be stored contiguously, but incur a overhead memory cost
      • Doubly-linked lists have previous and next pointers, singly-linked lists only have next pointers
      • Add, remove operations at the ends of the list are O(1)
      • Variants
        • Circular lists
        • Doubly-linked lists
        • Lists with sentinel/dummy nodes (Nodes with no data that always exist even if list is empty)
      • With sentinel/dummy nodes, handling corner cases can be simplified
  • 11/14/22
    • Encapsulation
      • Variables are hidden from outside access
      • Helper methods are hidden from outside access
      • Only public methods can be called from outside
      • Good practice: use weakest possible type when possible for maximum flexibility
      • To hide a helper class from outside access, we can embed Node as a private static inner-class of LinkedList
  • 11/15/22
    • Linear data structures
      • Useful when we want to preserve the order in which the data was added
      • An element is adjacent to at most 2 elements
    • Binary search (O(logn) for a sorted list)
      • Requires the data to be already sorted
      • Adding new data into a list is slowL: O(n)
    • Trees
      • A full binary tree with height h has 2^h leaf nodes and n = 2^(h+1)-1 nodes in total
      • A full binary tree with n nodes total has height h=log2(n+1)-1
      • More generally, a binary tree T with n nodes has
        • Minimum height: log2(n+1) - 1 (T is full)
        • Maximum height: n - 1 (T is a chain)
      • The maximum height of a tree with n nodes will impact the time-cost of operations of ADTs based on trees
    • Tree-based data structures
      • Array-based binary trees
        • Each node in the tree is assigned a unique index at which its data should be stored
        • Applies only to complete trees
          • Every level of the tree is completely filled except possibly the last and the last level is (partially) filled from left to right
        • The index of the left-child of a node with index idx? 2*idx + 1
        • The index of the right-child of a node with index idx? 2*idx + 2
        • The index of the parent of a node with index idx? (idx - 1)/2
      • Heaps
        • A max-heap is a complete binary tree that satisfies the heap condition
          • The root of every sub-tree is no smaller than any node in the sub-tree
        • Can be implemented using arrays or LinkedList
        • To add a new object o to the heap
          • Add o to the “bottom right” of the heap
            • This may violate the heap condition
          • Repeatedly “bubble up” o up the tree whenever o > parent(o) (Swap the node and its parent)
          • Time cost: O(logn)
        • Remove the largest element from a heap
          • Remove the root
          • Remove the last node n in the heap and make it the new root
            • This may violate the heap condition
          • Trickling down: Recursively swap n with one of its children until the heap condition is restored
          • Time cost: O(logn)
  • 11/17/22
    • Type-bounds
      • To put a restriction on the type parameter T, we can use a generic type-bound
      • HeapImpl<T extends Comparable>
      • This means that the T must be of a type that implements the Comparable interface
      • Comparable is an upper bound on type T
    • Javadoc
      • Supports automatic documentation via writing code comments in the Javadoc style
      • /** */, @param, @return
  • 11/22/22
    • Amazon Mechanical Turk: crowdsourcing human data
  • 11/28/22
    • Hash tables
      • A hash table requires
        • A hash function
        • A method of handling collisions
      • In order to ensure good performance, m must be bigger than n, the number of data the user will want to store (usually by a factor of 20 or 50)
      • Hashing function
        • Map data of arbitrary size to fixed-size values
        • A good hash function satisfies two basic properties
          • Fast: Worst case O(1) to compute
          • Uniform: Minimize duplication of output values (collisions).
          • Deterministic: Given the same key, it must always return the same value
      • Handling collisions
        • Open addressing (linear probing)
          • Many strategies for “finding another slot”
          • Simplest: linear probing
            • If hash function maps into an index i that is occupied, put it into i + 1
            • Handling deletions
              • Suppose we delete an element
              • We need to add a “bridge” element so that subsequent accesses to that bucket will “keep looking”
        • Chaining
          • Keep the head of a LinkedList at each index (kinda similar to segmented list)
          • Fast when the linked list is still short
      • Usually, keys are not integers
        • Java includes a hashCode() method in the Object class → “integer representation”/address of the object
        • Overriding of default hashCode() has two requirements
          • Deterministic
          • Consistent across equal instances
            • If o1.equals(o2) then hashCode(o1) == hashCode(o2), the opposite might not be true
            • If your class overrides equals(), it must also override hashCode()
    • equals() vs ==
      • Comparing primitive variables: ==
      • With object-type variables, there are 2 ways of testing for equality
        • ==: tests if the variables point to the exact same object in memory
        • equals(): tests if the objects are “semantically equivalent”
  • 12/1/22
    • Exceptions
      • Java provides a language construct called exceptions
      • A mechanism to interrupt the normal control flow to immediately return to the caller and pass to it information on what went wrong
      • try {<block of code that can throw an exception>} catch (<Exception type>) (<block of code to handle>)
    • Generating math expressions with context-free grammars (CFGs)
      • Parsing structured data is very similar to generating structured data
      • Branching (stochastic) process to generate random mathematical expression
        • The probability of the base case must be bigger than recursive case so it terminates without stack overflow
        • E → x | E + E | E - E | E * E | (E)
        • Each condition is called a production rule
        • We can write all the production rules for one non-terminal
      • A grammar that can have more than one parse tree generating the same string is called ambiguous
      • For mathematical expressions, the grammar needs to be unambiguous
  • 12/5/22
    • Anonymous classes
      • Defining a class that will only be used exactly once (like lambda functions)
    • Lambda expressions
      • Allow method to be passed as an “object” to another method
      • Only valid when the interface has one method with the right signature
    • Iterators
      • The type of data structure might change, don’t want to expose the inner workings of the data structures
      • A design pattern that provides a fast, safe, uniform way of traversing through the data structure
      • Iterator<type bound> iterator = <data structure>.iterator();
      • while (iterator.hasNext()) {
        • System.out.println(iterator.next());
      • }
    • Arrays in Java
      • Java is row-major
      • For an array stored in column-major layout, the elements of the columns are contiguous in memory. In row-major layout, the elements of the rows are contiguous
      • Cache locality
        • The order in which memory locations are accessed impacts performance due to how the CPU, main memory, and cache interact
        • Cache stores expensive and fast access of a small fraction of the main memory
        • Whenever the CPU reads a location in memory, it actually reads an entire “chunk” (cache line) into the cache
        • Once the chunk is in the cache, reading from the same chunk is fast (cache hit)
      • Parallelism and multi-threading
        • With multiple CPUs on the same computer, each thread can be executed independently
        • The first thread has to wait for the second thread to finish; this is called a join
        • Can “improve” performance (multithreading has overhead)
        • Data race: Two people withdraw money from the same account at the same time, who will get the money?