Today we began our discussion on File Input and Output using Python. Here’s an overview of what we covered:
- All of the programs that you have written so far reset themselves each time they run. This is because the data associated with the program (i.e. your variables) is stored in your computer’s memory (RAM).
- RAM is a volatile place – it serves as a computer’s short term memory. RAM that is being used by a program is cleared when the program stops running. If programs are to retain data between executions then we must find a more permanent way to store and save information for future use.
- A computer almost always has at least one type of long-term storage device at its disposal (usually some kind of hard drive) — we can use the long term storage capabilities of a computer to store data in the form of a file.
- Once we save a file it will remain on the long term storage device after the program is finished running, and can be accessed and retrieved later on. This is a pretty common technique and is used by almost all programs that need to keep track of some kind of information between executions.
- There are two types of files we can work with – “text files” and “binary files.”
- Text files are files that contain data that is encoded as text (e.g. ASCII or Unicode characters). Data stored in a text file is visible and can be read by any program that is designed to view / edit textual data (e.g. a text editor or word processing program). We will be working exclusively with text files this semester.
- Binary files contain data that is not encoded as text – they are intended to be read by other programs and not by humans directly. Binary files appear as “gibberish” when viewed via a word processing program. An example of a binary file would be a movie or image file – if you were to open it up in a word processing program you would see nothing but unintelligible characters.
- Every file must have its own filename, which is nothing more than a label that allows us to uniquely identify groupings of data that exist on a long term storage device. On most operating systems a filename comes with a “file extension” which allows you to quickly tell what kind of data is stored inside that file. The extension comes after the last period in a filename (e.g. “mydocument.doc” – “doc” is the file extension). File extensions tell the operating system what kind of data is stored in the file and allow the OS to select an appropriate program to open a particular file.
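As a small illustration of the “after the last period” rule, here is a sketch that uses the os.path.splitext() function from Python’s standard library. The filenames are made up for the example, and this function was not part of what we covered in class.

```python
import os.path

# hypothetical filenames, used only for illustration
for filename in ["mydocument.doc", "notes.txt", "archive.tar.gz"]:
    # splitext() splits the name at its last period
    name, extension = os.path.splitext(filename)
    print(filename, "->", name, "+", extension)
```

Note that only the part after the last period counts as the extension, so “archive.tar.gz” is split into “archive.tar” and “.gz”.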
- To begin working with files in Python you need to first create a “File Object.” A File Object is a special variable that serves as a connection between your program and the operating system. File Objects tell the OS the name of a file that you wish to work with along with what you want to do to that file (write, read, append, etc). Here’s how to open up a file object in Python that tells the operating system that you want to write to a file called “myfile.txt”:
file_object = open("myfile.txt", "w")
- The open() function takes two arguments – the name of a file and a “mode” operation. Both of these arguments are strings. We will be using three file modes:
- “w” – Write Mode. Open up a file so you can write information to that file. If the file does not exist it will be created. If the file does exist it will be overwritten.
- “a” – Append Mode. Open up a file so you can write information to that file. If the file does not exist it will be created. If the file does exist we will simply “append” any new data to the end of the file.
- “r” – Read Mode. Opens up a file so that you can read its contents. The file must already exist.
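To make the difference between the three modes concrete, here is a minimal sketch. It works inside a temporary folder (created with the standard tempfile module) so that no real files are touched; the filename “modes_demo.txt” is just a placeholder.

```python
import os
import tempfile

# work in a temporary folder so no real files are touched
folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes_demo.txt")  # hypothetical filename

# "w" creates the file (or wipes it if it already exists)
file_object = open(path, "w")
file_object.write("first")
file_object.close()

# "a" keeps what is already there and adds to the end
file_object = open(path, "a")
file_object.write("second")
file_object.close()

# "r" lets us read the contents back
file_object = open(path, "r")
after_append = file_object.read()
file_object.close()
print(after_append)

# opening with "w" again throws away the old contents
file_object = open(path, "w")
file_object.write("third")
file_object.close()

file_object = open(path, "r")
after_write = file_object.read()
file_object.close()
print(after_write)
```

The first print shows both pieces of text because append mode kept the original data. The second print shows only the last piece because write mode started the file over.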
- You can write data to a file that has been opened for writing by using the write() function. Here’s an example:
# open up a file object for writing
file_object = open('myfile.txt', 'w')

# store the string 'craig' in the file
file_object.write('craig')

# store the string 'hello' in the file
file_object.write('hello')

# close the file when you're done
file_object.close()
- If you were to look at the file that gets created by the source code above you would see that both strings (‘craig’ and ‘hello’) are concatenated on the same line. The write() function simply takes its data and stores it in the file – it does not provide any formatting or extra characters to help delimit your data.
- Try to avoid writing files that concatenate all of your data into one long line of unintelligible text. This is bad practice since it can be very difficult – or impossible – to extract your data later on. One way to separate data in a text file is to split it across multiple lines using the “\n” escape character (the line break character). This allows you to store a single value on each line of your text file, which makes things a lot easier later on when you need to read your file and perform some kind of operation on the data contained within. Here’s an example program that we wrote in class that stores a username and password in a file – each item will be stored on its own line.
# ask the user for their username
# and password
username = input("What's your username: ")
password = input("What's your password: ")

# store the username and password
file_object = open("security.txt", "w")

# send the information into the file
file_object.write(username + "\n")
file_object.write(password + "\n")

# print a confirmation
print("Your account has been set up")

# close the file
file_object.close()
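To see the difference that the line break character makes, the sketch below writes the same two strings to two files, once without and once with "\n", and then reads each file back. The filenames and the temporary folder are placeholders chosen for this example.

```python
import os
import tempfile

# temporary folder so the demo does not touch real files
folder = tempfile.mkdtemp()
path_a = os.path.join(folder, "no_breaks.txt")    # hypothetical filename
path_b = os.path.join(folder, "with_breaks.txt")  # hypothetical filename

# write two strings with nothing in between
file_object = open(path_a, "w")
file_object.write("craig")
file_object.write("hello")
file_object.close()

# write the same strings, each followed by a line break
file_object = open(path_b, "w")
file_object.write("craig" + "\n")
file_object.write("hello" + "\n")
file_object.close()

# read both files back in to compare them
file_object = open(path_a, "r")
without_breaks = file_object.read()
file_object.close()

file_object = open(path_b, "r")
with_breaks = file_object.read()
file_object.close()

print(repr(without_breaks))
print(repr(with_breaks))
```

The repr() function is used here so that the line break characters show up as \n in the printed output instead of as actual line breaks.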
- You can read data contained inside a file once you have created a file object and have opened a file for reading using the read() function. Here’s an example:
myvar = open("test.txt", "r")
alldata = myvar.read()
print(alldata)
myvar.close()
- The read() function extracts all data from a file as a string and returns a string. This string must be further processed before it can actually be used, usually through using the string “split” method.
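Here is a minimal sketch of processing the string returned by read() with the split method. The file contents (three fruit names) are invented for the example, and the file is created in a temporary folder first so that the program can run on its own.

```python
import os
import tempfile

# create a small sample file to read (made-up contents)
folder = tempfile.mkdtemp()
path = os.path.join(folder, "test.txt")
file_object = open(path, "w")
file_object.write("apple\nbanana\ncherry\n")
file_object.close()

# read the whole file in as one string
myvar = open(path, "r")
alldata = myvar.read()
myvar.close()

# split the string apart at each line break
pieces = alldata.split("\n")
print(pieces)
```

Because the file ends with a line break, the list produced by split has one empty string at the end. That extra item usually needs to be ignored or removed before working with the data.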
- Formatting data that you store in an external file is up to you, and you can choose any layout convention you’d like. However, many programmers design their files such that data is laid out across multiple lines (using a “\n” character) in order to separate important pieces of data.
- The readline() function can be used to read a single line of data (up to the next “\n” character). The function returns a string that contains the requested information (including the “\n” character). Your file object keeps track of its position inside a file using a “read position” indicator – it will remember the last line you read using the readline() function and will let you read lines sequentially if you continually call the readline() function. Here’s an example:
line1 = file_object.readline()
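The sketch below shows the read position moving forward with each call to readline(). It first creates a two-line sample file in a temporary folder (the filename is a placeholder) and then reads it one line at a time.

```python
import os
import tempfile

# create a two-line sample file
folder = tempfile.mkdtemp()
path = os.path.join(folder, "lines.txt")  # hypothetical filename
file_object = open(path, "w")
file_object.write("I'm on line 1!\n")
file_object.write("I'm on line 2!\n")
file_object.close()

# each call to readline() picks up where the last one stopped
file_object = open(path, "r")
line1 = file_object.readline()
line2 = file_object.readline()
line3 = file_object.readline()  # nothing left to read
file_object.close()

print(repr(line1))
print(repr(line2))
print(repr(line3))
```

The third call returns an empty string because the read position is already at the end of the file.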
- You can also use the readlines() method to read all lines from a file object directly into a Python list. Here’s an example:
filelist = file_object.readlines()
print(filelist)

>> ["I'm on line 1!\n", "I'm on line 2!\n"]
- You can use the “rstrip” function that is built into the String data type to remove unwanted characters from the right-hand end of a string. “rstrip” takes one argument – a string of characters to remove – and returns a new string with every trailing occurrence of those characters stripped off. Characters elsewhere in the string are left alone. You can use “rstrip()” to remove the “\n” character from the end of each line by doing the following:
for x in range(len(filelist)):
    filelist[x] = filelist[x].rstrip("\n")
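One detail worth seeing in action is that rstrip() only works on the right-hand end of a string. The short sketch below uses made-up strings to show that characters in the middle of a string are left alone.

```python
# a string with a line break in the middle and two at the end
line = "one\ntwo\n\n"

# only the trailing line breaks are removed
cleaned = line.rstrip("\n")
print(repr(cleaned))

# the argument is treated as a set of characters to strip
shout = "hello!?!"
calm = shout.rstrip("!?")
print(calm)

# the original string is not changed; rstrip() returns a new one
print(repr(line))
```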
- Here’s an example that puts all of this together — in this program we open up the “security.txt” file that we created earlier and extract the username and password stored inside. Then we ask the user for a username & password combo and test to see if the two are the same.
# open up a connection to our file
file_object = open("security.txt", "r")

# grab the username & password as a list
filelist = file_object.readlines()

# remove the line breaks
for x in range(len(filelist)):
    filelist[x] = filelist[x].rstrip("\n")

# store the username and password in new variables to make
# our program more "readable"
username_from_file = filelist[0]
password_from_file = filelist[1]

# close the file
file_object.close()

# now that we have the username and password
# ask the current user for their info
# if it matches what's in the file we can let them in
# otherwise they can't log in
username = input("username: ")
password = input("password: ")

# check to see if they match
if username == username_from_file and password == password_from_file:
    print("you're in!")
else:
    print("sorry, wrong info")
- When you are reading data from a file you will need to convert strings to ints or floats if you want to perform calculations on them. This is identical to what we have to do when using the input() function to ask the user to enter a number. Note that the int() and float() functions ignore leading and trailing whitespace, including line break characters, so you do not need to strip them off first. Here’s an example that opens up a text file called “testscores.txt” – this file contains the following information:
craig
100
82
74
And here is how we can work with the file by treating the score values as integers:
# open up the file
file_object = open("testscores.txt", "r")

# read in each line manually (without putting data into a
# list first)
studentname = file_object.readline()
studentname = studentname.rstrip("\n")

score1 = int(file_object.readline())
score2 = int(file_object.readline())
score3 = int(file_object.readline())

# close the file now that we have everything we need
file_object.close()

# print out data
print(studentname)
average = (score1 + score2 + score3) / 3
print("average: ", average)
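As a quick check of how int() and float() treat line breaks, here is a small sketch with made-up values. It also shows that an empty string (which is what readline() returns at the end of a file) cannot be converted, so it is worth testing for that case before converting.

```python
# a trailing line break is ignored during conversion
score = int("100\n")
price = float("  19.99\n")
print(score + 1)
print(price)

# an empty string is not a valid number
try:
    int("")
    empty_converts = True
except ValueError:
    empty_converts = False
print(empty_converts)
```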
- You can also use repetition structures to write large amounts of data into a file. Here’s an example that writes out a series of price values that are entered by the user:
# open up our file
file_object = open("prices.txt", "a")

keepgoing = True
while keepgoing == True:

    # ask the user for a price
    price = float(input("give me a price: "))

    if price > 0:
        # store the price
        file_object.write(str(price) + "\n")
    else:
        # end the loop
        keepgoing = False

# close the file
file_object.close()
- Most times you won’t know how many lines of data exist in a text file. One strategy for dealing with these kinds of files is to use a while loop in combination with the readline() function. The readline() function will return an empty string ('') when it has reached the end of the file. Here’s an example that prints out the average price of the items entered by the previous program:
# accum variables
total = 0
num_prices = 0

# open up our file
file_object = open("prices.txt", "r")

# continually read from the file as long as there
# is data to be read
keepgoing = True
while keepgoing == True:

    # read the next line from the file
    nextline = file_object.readline()

    # test to see if we have reached the end of the file
    if nextline == '':
        keepgoing = False
    else:
        nextline = nextline.rstrip("\n")
        nextline = float(nextline)

        # accum variables
        total = total + nextline
        num_prices += 1
        print(nextline)

# print out summary
print("average: ", total / num_prices)

# close the file
file_object.close()
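One case the program above does not handle is a file with no prices in it: num_prices stays at 0 and the division fails. The sketch below is one possible way to guard against that, using the same while loop. The function name average_price and the sample filenames are invented for this example, and wrapping the loop in a function is just one way to organize it.

```python
import os
import tempfile

def average_price(path):
    # hypothetical helper: returns the average of the prices
    # stored one per line in the file, or None if there are none
    total = 0
    num_prices = 0
    file_object = open(path, "r")
    keepgoing = True
    while keepgoing == True:
        nextline = file_object.readline()
        if nextline == "":
            # reached the end of the file
            keepgoing = False
        else:
            total = total + float(nextline)
            num_prices += 1
    file_object.close()

    # avoid dividing by zero when the file was empty
    if num_prices == 0:
        return None
    return total / num_prices

# try it out on two sample files in a temporary folder
folder = tempfile.mkdtemp()
full_path = os.path.join(folder, "prices.txt")
empty_path = os.path.join(folder, "empty.txt")

file_object = open(full_path, "w")
file_object.write("1.5\n2.5\n")
file_object.close()

file_object = open(empty_path, "w")
file_object.close()

print(average_price(full_path))
print(average_price(empty_path))
```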
- Here are some additional practice problems that we went over in class. Problem #1 – Write a “matchmaking” program that asks the user to enter in their favorite color and their favorite food. Store the result in a text file.
# open up the file
file_object = open("match.txt", "w")

# ask the user their fav color
color = input("Fav color: ")

# ask the user their fav food
food = input("Fav food: ")

# write the info to the file
file_object.write(color + "\n")
file_object.write(food + "\n")

# close the file
file_object.close()
- Problem #2 – Interface with your matchmaking text file and ask a second user for a favorite color and favorite food. Compare the results – if they match on 0/2 questions, they are not a match! 1/2, they might be a match. 2/2, they are definitely a match!
# get the info from the file that we will
# be using for comparison purposes
file_object = open("match.txt", "r")

# get the fav color and fav food
color = file_object.readline()
food = file_object.readline()

# get rid of the line breaks
color = color.rstrip("\n")
food = food.rstrip("\n")

# close the file
file_object.close()

# ask the current user what their prefs are
usercolor = input("What's your fav color? ")
userfood = input("What's your fav food? ")

# compare
points = 0
if usercolor == color:
    points += 1
if userfood == food:
    points += 1

# convert the score (out of 2) into a percentage
print(points / 2 * 100, "percent match!")
- Problem #3 – Open up a text file named “drawing.txt” for read access. Using the turtle graphics library, extract the coordinates contained in the file and draw the resulting picture. Use the turtle.goto() function to move the turtle from point to point.
import turtle
turtle.setup(500,500,0,0)

# open up the file
file_object = open("drawing.txt", "r")

# go through each line
keepgoing = True
while keepgoing == True:

    # read the x coordinate
    x = file_object.readline()

    # are we at the end?
    if x == '':
        keepgoing = False
    else:
        # read the y coordinate
        y = file_object.readline()

        # convert to floats
        x = float(x)
        y = float(y)

        # draw this to the screen
        turtle.goto(x,y)

file_object.close()
-120.0
-95.0
-76.0
22.0
-13.0
185.0
66.0
24.0
114.0
-103.0
24.0
-103.0
23.0
-187.0
-38.0
-186.0
-39.0
-101.0
-120.0
-96.0