class: center, middle, title-slide # CSCI-UA 102 ## Data Structures
## Introduction to the Class .author[ Instructor: Joanna Klukowska
] .license[ Copyright 2020 Joanna Klukowska. Unless noted otherwise all content is released under a
[Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).
Background image by Stewart Weiss
] --- layout:true template: default name: section class: inverse, middle, center --- layout:true template: default name: poll class: inverse, middle ## Poll
--- layout:true template: default name: breakout class: breakout --- layout:true template:default name:slide class: slide .bottom-left[© Joanna Klukowska. CC-BY-SA.] --- template: section # The Course --- template: slide ## The Instructional Stuff __Instructor:__ - Joanna Klukowska -- __Section Leaders:__ - Satya Chilale - Daniel Jin - Vincent Xu - Chenfeiyu Wen -- __2 Course Assistants__ -- __7 Graders__ -- __8 Tutors__ --- ## About This Course - NYU Brightspace: https://brightspace.nyu.edu - course info and syllabus - slides, labs, assignments, readings, resources - Zoom links for any remote meetings - your grades and course progress - Recitations (required): - you should attend the recitation for which you are registered - if you need to miss your scheduled recitation, contact your section leader, make sure to complete a weekly lab and plan to attend office hours and tutoring hours to resolve any questions you may have - Course tools: - [Brightspace]( https://brightspace.nyu.edu) - NYU's learning management system (LMS) - [Ed](https://us.edstem.org/) - our discussion forum and tool for ALL course related communications - [Gradescope](https://www.gradescope.com/courses/413385) - autograder and submission site for assignments - PollEverywhere - ... --- ## Textbook(s) - there is no one required book, but you do __have to__ use one or two resources to supplement content that we discuss in class - there are recommendations on the syllabus - you can access them in digital format through NYU libraries with no extra fees - keep in mind that no single resource will cover everything that you need, you should plan on using multiple books/resources --- template:poll ### As you start this course (and this semester), what one word pops into your mind? - go to http://pollev.com/joannakl - enter a word (or more) to answer the question above
NOTE: if you enter a phrase consisting of multiple words, the words will be used individually --- template:slide ## Data Structures and Algorithms ### What are they? -- A __data structure__ is a collection of data items in a memory of a running program that are organized in some fashion that allows items to be stored and retrieved by some fixed methods. An __algorithm__ is a logical sequence of discrete steps that describes a complete solution to a given problem commutable in a finite amount of time and space. -- .left-column2[ .smaller[ Examples of data structures from CSCI.UA 101: - __array__: data items are stored one after another in memory and they can be accessed by their index number, - __stack__: collection of items with access (add/remove operations) only at the _top_ of the stack (think of a stack of plates). ]] .right-column2[ .smaller[ Data structures covered in CSCI.UA 102: - __list__ - __stack__ - __queue__ - __tree__ - __binary tree__ - __hash table__ - __some graphs__ - ... ]] --- template: section # Why do data strutures matter? --- template:default background-image: url("img/back3.jpg") class: middle - [**Linus Torvals **](https://en.wikiquote.org/wiki/Linus_Torvalds), 2006 .Large["I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his(/her) code or his(/her) data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships. "] --- ## Demo 1: ### Create a string with lots of 'a's. .left-column2[ ```Java String str = new String(); for(int i = 0; i < NUMBER; i++){ str = str + 'a'; } ``` ] .right-column2[ ```Java StringBuilder sb = new StringBuilder(); for(int i = 0; i < NUMBER; i++){ sb.append("a"); } ``` ] -- - When `NUMBER` is set to 400,000: - the code on the left takes __6840 milliseconds__ (i.e., almost 7 seconds) to execute - the code on the right takes __~5 milliseconds__ (i.e., you do not even notice that any time passed) -- - Why? - we'll need to look at _how_ the `String` and `StringBuilder` classes are implemented - we need to understand what data structures and algorithms each class uses --- ## Demo 2: ### Iterating over a list of numbers .left-column2-large[ Program 1: ```Java LinkedList
numbers = new LinkedList<>(); LinkedList
twiceNumbers = new LinkedList<>(); populateList(numbers, size); for (int i = 0; i < size; i++) { twiceNumbers.addLast(numbers.get(i) * 2 ); } ``` Program 2: ```Java LinkedList
numbers = new LinkedList<>(); LinkedList
twiceNumbers = new LinkedList<>(); populateList(numbers, size); for (int num : numbers ) { twiceNumbers.addLast( num * 2 ); } ``` ] -- .right-column2-small[ In this program we have a list of numbers. We want to iterate over the list, double each number and add it to a new list. What is the difference between these two loops? ---- {{content}} ] -- When `size` is set to 100,000 program 1 takes __3,391 milliseconds__ to execute and program 2 finishes in __~4 milliseconds__. --- ## Demo 3, part 1: ### Creating and searching in collections of `Circle` objects .left-column2-large[ Program 1: ```Java Collection
tree = new TreeSet<>(); //create lots of Circle objects and add them //to the tree one at a time createCircles( tree, size); ``` Program 2: ```Java Collection
list = new LinkedList<>(); //create lots of Circle objects and add them //to the list one at a time createCircles( list, size); ``` .smaller[ Common to both: ```Java public static void createCircles(Collection
col, int size) { for (int i = 0; i < size; i++) { col.add(new Circle((int)(Math.random()*Integer.MAX_VALUE)) ); } } ``` ] ] -- .right-column2-small[ In this program we first create a tree and a list of `Circle` objects. How does the _shape_ of the collection going to affect the running time? ---- {{content}} ] -- When `size` is set to 1,000,000 program 1 takes __468 milliseconds__ to execute and program 2 finishes in __161 milliseconds__. --- ## Demo 3, part 2: ### Creating and searching in collections of `Circle` objects .left-column2-large[ Program 1: ```Java Collection
tree = new TreeSet<>(); //create lots of Circle objects createCircles( tree, size); //search for a specific circle Circle c = new Circle(102); tree.contains(c); ``` Program 2: ```Java Collection
list = new LinkedList<>(); //create lots of Circle objects createCircles( list, size); //search for a specific circle Circle c = new Circle(102); list.contains(c); ``` ] -- .right-column2-small[ Once the two collections of `Circle` objects are created, we want to find out if a circle with radius 102 got created. ---- {{content}} ] -- When `size` is set to 1,000,000 program 1 takes __0.01 milliseconds__ to search and program 2 takes __5 milliseconds__ to search. --- ## Demo 4: ### Calculating Fibonacci Numbers .left-column2[ ```Java public static long fibonacci1 ( long number ) { //base case if (number == 0 ) return 0; else if (number == 1 ) return 1; //recursive case else return fibonacci1(number - 1) + fibonacci1(number - 2); } ``` ] .right-column2[ ```Java public static long fibonacci2 ( long number ) { //base case if (number == 0 ) return 0; else if (number == 1 ) return 1; //iteratively compute the result else { long tmp1 = 0; long tmp2 = 1; long result = 0; int counter = 2; while ( counter <= number) { result = tmp1 + tmp2; tmp1 = tmp2; tmp2 = result; counter++; } return result; } } ``` ] -- .smaller[ - When we try to compute the 40
th
Fibonacci number: - the code on the left takes __430 milliseconds__ - the code on the right takes __~0.003 milliseconds__ - BUT, the code on the left appears so much simpler ] --- ## Technical Interviews And if Linus Torvalds and my demos did not convince you that understanding data structures is important and exciting, there are always technical job interviews: -- - most technical interview questions are based on data structures and algorithms - examples: - design a min-stack that performs typical stack operations and is able to retrieve the smallest element at any time quickly - describe an algorithm that can parse through a mathematical expression and validate if all brackets and parenthesis are in a correct order and matched - find all anagrams of a given string that are also words in the given dictionary - determine if a given binary tree is a search tree -- - why? -- .left-column2[ - because for a software developer / programmer / computer scientist data structures and algorithms are essential tools (just like a wrench is for a plumber; would you hire a plumber without a proper toolbox? ) ] .right-column2[
.small[
Plumber in a green uniform working on a sink. (official US Government [EPA] produced image, public domain.) Source: https://openclipart.org/detail/195900/friendly-plumber
]
] --- template: breakout ### Group Discussion: Why do we care so much about programs being fast? - Turn to the people around you and introduce yourself. - In groups of 2-4 people discuss why you think it is important for the programs to run fast. - You should think about it from two different perspectives: - why do **you** want a program on your computer to work fast? - why is it beneficial to the **society** if the programs run fast? (this one is a bit more challenging) - After ~5 minutes, some groups will get a chance to report on what they came up with. --- template: section # How to succeed? --- ## Tips for the Course - __Understand the concepts and reasons for doing things__. Do not use a particular solution/approach just becase a tutor or TA (or even the instructor) said it was a good idea. Do not memorize things if you do not know what they are doing. Question things! Be curious! -- - __Use pen and paper__ (or the modern equivalents). Draw things out and experiment with the ideas. It often helps to draw out variables and objects of a piece of code that you are analyzing and update the picture as you go through the code. This type of hand-eye interaction stimulates different parts of your brain and helps you learn! -- - __Try solving problems yourself__. I mean really work them out BEFORE asking for a solution, looking at a solution posted by others or trying to search the web. -- - __Write code solutions yourself__. Don't just read the code or watch the video showing the code. Don't copy and paste, type it up yourself and then compare it to the code we wrote in class or the code that is in a book. -- - __Treat mistakes and bugs as your friends__. The more problems you encounter, the easier it is to recognize them in the future. Your brain actually rewires itself when you make a mistake and fix it - this is how learning happens. -- - __Make friends__. Make sure that you meet at least a few other people in the class and get to know them. These friendships will benefit you and other people in the class, and, in some case, may last a lifetime. --- template: section # Course Syllabus