class: center, middle, title-slide # CSCI-UA 102 ## Data Structures
## Java Programs (Under the Hood) .author[ Instructor: Joanna Klukowska
] .license[ Copyright 2020 Joanna Klukowska. Unless noted otherwise all content is released under a
[Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).
Background image by Stewart Weiss
] --- layout:true template: default name: section class: inverse, middle, center --- layout:true template: default name: breakout class: breakout, middle --- layout:true template:default name:slide class: slide .bottom-left[© Joanna Klukowska. CC-BY-SA.] --- template: section # From Source Code (Text) to a Running Program --- ## Program's Source Code - The program's __source code__ is plain text. - We can use any text editor or an IDE (Integrated Development Environment) to write it.
.small[Atom editor]
.small[Eclipse IDE]
.right[
.small[.left[VSCode IDE]]
.small[.left[vi editor]]
] --- ## Byte Code and Java Virtual Machine .right-column2[
]

.smaller[
- A __Java _Virtual_ Machine__ (JVM) is not a real machine. It is a program that can be installed on a computer. Java programs are interpreted by a JVM.
]

--

.smaller[
- A compiler translates our source code into Java bytecode. The __bytecode__ consists of instructions that can be executed by a JVM. Compilation is done using
  ```
  javac SOURCE_CODE_FILE
  ```
]

--

.smaller[
- When we run a Java program, a JVM interprets the bytecode instructions by translating them one by one into the actual machine instructions (specific to the computer on which the program is executed). The CPU then executes those machine instructions. This is done using
  ```
  java CLASS_NAME
  ```
  where `CLASS_NAME` is the name of the compiled class that contains `main` (without the `.class` extension).
]

--

__The consequence of this is that the Java code that we write is independent of the actual computer on which we run it. As long as there is a JVM installed, the Java bytecode can be interpreted and executed on the actual hardware.__

---
template: section

# Program's Memory:
Stack and Heap

---
template: slide

## Why Do Programs Need Memory

- Every time a program needs to store a piece of data, it uses memory to do so.
- When we create a variable to store information in the program, for example:
  ```
  int x;
  ```
  that variable has four things associated with it:
  - __name__; that's `x` in the above case
  - __value__; it is not defined above, but could easily be set with `x = 5`
  - __location__ or memory address; the exact memory address is not really relevant (and, depending on the programming language, may not be easily determined); whenever the program needs to retrieve the value of the variable, it needs to find the memory address and obtain the value from the bytes at that address
  - __type__; that's `int` in the above case; the type determines how many bytes in memory should be allocated; the most commonly used type sizes these days are shown below

.center50[.small[

| type | number of bytes |
|:---|:---:|
|`byte`| 1 byte |
|`char`, `short`| 2 bytes |
|`int`, `float`| 4 bytes |
|`long`, `double`| 8 bytes |
|`boolean` | may vary |
|memory address | 8 bytes |

]]

---
name: primitive-variables

## Primitive Types Storage in Memory

- For the primitive type variables (`byte`, `short`, `int`, `long`, `float`, `double`, `char`, `boolean`) the value stored at the memory address allocated to that variable is the actual value of the variable (in binary format).

--

.center[
] --- template: primitive-variables .center[
] .center[ when a variable is declared, it does not have any value (its value is undefined) ] --- template: primitive-variables .center[
] .center[ after it is initialized, its value is set ] --- name: reference-variables ## Reference Types Storage in Memory - All reference type variables store as their value the memory address of an object or an array that they refer to. -- .center[
] .center[ when we create a reference, its value is undefined ] --- template: reference-variables .center[
] .center[ executing `new Circle(15)` creates an object in memory;
that object has NO name ] --- template: reference-variables .center[
] .center[ _assigning that object to `c`_ means that the value of `c` variable is set to
the memory address of the newly created `Circle` object ] --- template: reference-variables .center[
] .center[ since we do not really care about the numerical value of that memory address,
we often use an arrow to indicate that
_`c` points to the object_, or
_`c` refers to the object_ ] --- ## Memory - When a program is executing on a computer, it is given a pool of memory to work with. -- - The program does not need to _worry_ about any other program accessing that memory. (Stay tuned for the operating systems class to learn why/how.) -- - The program organizes its _things_ in two different memory areas: - stack - heap --- ## Stack __Stack__ is where all the local variables and temporary information for functions/methods are stored. - It is organized in a collection of memory blocks, called __stack frames__. Each block belongs to a function/method. The block of a function that is currently executing is on top. Right below it is a block of a function that called the currently executing function, etc. The block for `main` is always at the bottom of the stack. -- name:stack __Example__ --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { public static void main (String [] args ) { foo () ; } public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
before the first method starts the stack is empty ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { foo () ; } public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
program starts: `main` is called ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`main` calls function `foo` ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ * bar( 15 ); bar( 8 ); } * public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`foo` calls function `bar` passing value of 15 to it ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ * bar( 15 ); bar( 8 ); } * public static void bar ( int x ) { * bat(x); } * public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bar` calls function `bat` passing its parameter to `bat` ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ * bar( 15 ); bar( 8 ); } * public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bat` finishes and its stack frame is removed from the stack ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bar` finishes and its stack frame is removed from the stack ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); * bar( 8 ); } * public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`foo` calls function `bar` again with 8 as the parameter ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); * bar( 8 ); } * public static void bar ( int x ) { * bat(x); } * public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bar` calls function `bat` passing its parameter to `bat` ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); * bar( 8 ); } * public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bat` finishes and its stack frame is removed from the stack ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { * foo () ; } * public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`bar` finishes and its stack frame is removed from the stack ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { * public static void main (String [] args ) { foo () ; } public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`foo` finishes and its stack frame is removed from the stack ]]] --- template:stack .smaller[ .left-column2[ ```Java public class StackExample { public static void main (String [] args ) { foo () ; } public static void foo (){ bar( 15 ); bar( 8 ); } public static void bar ( int x ) { bat(x); } public static void bat ( int x ) { ... } } ``` ] .right-column2[ .center[
`main` finishes and its stack frame is removed from the stack; the program ends
]]]

---
name: stack_frame

## Stack Frame Content

### Very Simplified View

Each stack frame contains information about local variables that the function creates:

--

.left-column2[
- the function arguments
- all locally created variables
- the return value (if applicable)

.smaller[
{{content}}
]]

--

Consider this function:

```Java
double charStats ( String str, char c ) {
    double ratio;
    int count = 0;
    for (int i = 0; i < str.length(); i++ ) {
        if (str.charAt(i) == c ) {
            count++;
        }
    }
    // cast to double so that the division is not truncated to an integer
    ratio = (double) count / str.length();
    return ratio;
}
```

How many local variables are there?

--

.right-column2[
.small[
The stack frame for the `charStats` function should contain memory for five different local variables:
- two parameters: `str` and `c`
- the `ratio` variable
- the `count` variable
- the loop counter variable `i`
]
]

---
name: heap

## Heap

- Whenever the program uses the keyword `new`, the memory for the newly created object or array is allocated on the heap.
- The heap is not as organized as the stack. The chunks of memory that are allocated to different arrays and objects can be _all over the place_. (Well, there is some logic to it, but we will not get into it because it is not relevant for our discussions.)

--

.center[
] --- template:heap .center[
.smaller[ Again, we do not really care what exactly is the memory address stored in `c`. ]] --- template:heap .center[
.smaller[
And the memory addresses at which `c` and the actual `Circle` objects are located do not matter for us either.
]]

---
name: array-heap

## Arrays and the Heap

Arrays in Java are always stored on the heap in consecutive memory locations.

Example: If our program tries to allocate an array of 10 integers, we will need 40 consecutive bytes of memory on the heap (because each `int` needs 4 bytes of memory on current computers).

```Java
int [] array = new int[10];
```

--

.center[
] --- template:array-heap .center[
]

.center70[
- We'll assume that the memory address of the array is `100` (we'll use decimal numbers instead of hexadecimal numbers for addresses in this example to make things a bit easier).
- This means that the address of the first element is also `100`.
]

---
template:array-heap

.center[
]

.center70[
- The address of the element at index 1 is exactly 4 bytes after the address of the element at index 0 (because arrays are always allocated in consecutive memory locations).
- Therefore the address of the element at index 1 is `104` (this is `100` + 1 * 4 bytes).
]

---
template:array-heap

.center[
] .center70[ - With a bit of arithmetic we can figure out the address of the element at index 5:
.center[ initial_address + index * size_of_int
] or .center[ `100` + 5 * 4 = `120`
] ] --- template:array-heap .center[
] .center[ - What do you think is the address of the last element? ] --- template:array-heap .center[
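]

---

## Arrays and the Heap: Address Arithmetic in Code

The arithmetic from the previous slides can be written as a small program. This is only a sketch: Java does not expose real memory addresses, so the starting address `100` is the made-up value from the slides, and the class name `ArrayAddressSketch` and the helper `addressOf` are hypothetical names introduced for illustration.

```Java
public class ArrayAddressSketch {
    // Assumed values from the slides: the array starts at address 100.
    // Real addresses are not visible to a Java program.
    static final int BASE_ADDRESS = 100;
    // Integer.BYTES is the number of bytes used by an int (4).
    static final int SIZE_OF_INT = Integer.BYTES;

    // address = initial_address + index * size_of_int
    static int addressOf(int index) {
        return BASE_ADDRESS + index * SIZE_OF_INT;
    }

    public static void main(String[] args) {
        int[] array = new int[10];
        // Print the pretend address of every element of the array.
        for (int i = 0; i < array.length; i++) {
            System.out.println("index " + i + " -> address " + addressOf(i));
        }
    }
}
```

For the element at index 5 this gives `120`, matching the calculation on the earlier slide, and for the last element (index 9) it gives `100` + 9 * 4 = `136`.

.center[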
]

---
template:section

# Examples and Things to Think About

---

## Example 1: What Happens in Memory

Assume that there is a class called `Circle`. It has a public data field called `radius`. It has a one-parameter constructor that takes the radius of a circle as its argument and creates a `Circle` object with that radius.

```Java
Circle c1 = new Circle (10);
Circle c2 = new Circle (20);
Circle c3 = c1;
System.out.println(c1.radius + " " + c2.radius + " " + c3.radius);

c1.radius = 30;
System.out.println(c1.radius + " " + c2.radius + " " + c3.radius);

c1 = c2;
System.out.println(c1.radius + " " + c2.radius + " " + c3.radius);

c2.radius = 5;
c2 = c3;
System.out.println(c1.radius + " " + c2.radius + " " + c3.radius);

c1 = c2;
c1.radius = 15;
System.out.println(c1.radius + " " + c2.radius + " " + c3.radius);
```

What is the output of the above code?

---

## Example 2: What Happens in Memory

```Java
Circle [] circles = new Circle[10];
for (int i = 0; i < 10; i++ ) {
    circles[i] = new Circle(i * 5);
}
```

- How is the above array laid out in memory?
- Assuming the code fragment is part of a `main` function, what is stored on the stack and what is stored on the heap?
- Consider the following code fragment:
  ```Java
  circles[2] = circles[8];
  circles[8].radius = 1;
  ```
  - how does the memory image change?
  - what happens to the `Circle` object that used to be stored in `circles[2]`?
  - how many different `Circle` objects are accessible through the array?

---
name:stack_frame_swap

## Think About: `swap` function

Given the `swap` function

.left-column2[
.smaller[
```Java
void swap ( double d1, double d2 ) {
    double tmp;
    tmp = d1;
    d1 = d2;
    d2 = tmp;
}
```

consider the following code fragment:

```Java
double num1 = 3.1415;
double num2 = 2.1718;
swap ( num1, num2 );
```
]]

--

.right-column2[
the values of `d1` and `d2` are initialized to the arguments that are passed to the function:
] --- template:stack_frame_swap .right-column2[ the value stored in `d1` is copied to `tmp`:
] --- template:stack_frame_swap .right-column2[ the value stored in `d2` is copied to `d1`:
] --- template:stack_frame_swap name:stack_frame_swap_final .right-column2[ the value stored in `tmp` is copied to `d2`:
] --- template:stack_frame_swap_final __and the swap is completed!__ -- __or is it?__ .smaller[ If we execute ```Java System.out.println(num1 + " " + num2 ); ``` after the call to `swap` function what will the output be? ]
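
---

## Think About: `swap` function (a runnable sketch)

One way to check the answer is to run the fragment. The program below is a minimal sketch that wraps the slide's code in a class; the class name `SwapSketch` is hypothetical, and `swap` is marked `static` only so that it can be called from `main`.

```Java
public class SwapSketch {
    // Same body as the swap function on the slide.
    // d1, d2 and tmp all live in the stack frame of swap.
    static void swap(double d1, double d2) {
        double tmp;
        tmp = d1;
        d1 = d2;
        d2 = tmp;
    }

    public static void main(String[] args) {
        // num1 and num2 live in the stack frame of main.
        double num1 = 3.1415;
        double num2 = 2.1718;
        swap(num1, num2);
        // Prints the values stored in main's own variables.
        System.out.println(num1 + " " + num2);
    }
}
```

Java passes primitive arguments by value, so `swap` exchanges only the copies stored in its own stack frame. Those copies disappear when the frame is removed from the stack, and the variables `num1` and `num2` in the frame of `main` keep their original values.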