\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
A preview of one subject's data (first five rows):

|   | signal_present | signal_type | trial_num | button_position | subject_number |
|---|---|---|---|---|---|
| 0 | 1 | 60 | 0 | left | bl1611 |
| 1 | 1 | 10 | 1 | left | bl1611 |
| 2 | 1 | 10 | 2 | left | bl1611 |
| 3 | 1 | 15 | 3 | left | bl1611 |
| 4 | 0 | 0  | 4 | left | bl1611 |
\n", "
\n", "\n", "
\n", " Question 1
\n", " If we compute one \$d'\$ for each participant for each stimulus conditon and want to compare performance in two different stimulus conditions, what type of t-test should we most likely use?\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your plan for the analysis as a markdown cell.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 2
\n", " To begin we want to read in the data from each participant into a pandas data frame using `pd.read_csv()` and the used `pd.concat()` to combine them. After you read in the data, inspect the columns and check the data for any inconsistencies that might affect our analysis (e.g., subjects who didn't perform any trials, etc...).\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a reminder the following show a simple for loop (called a list comprehension) that lets you get the filenames for the lab data set. You can use this to find the file names that you need to read in with `pd.read_csv()`. If you need a reminder refer back to your mental rotation lab which should be on your Jupyterhub node still!" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "ename": "FileNotFoundError", "evalue": "[Errno 2] No such file or directory: 'sdt1-data'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# this is an example list comprehension which reads in the all the files.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;31m# the f.startswith() part just gets rid of any junk files in that folder\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mfilenames\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'sdt1-data/'\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0mf\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mf\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlistdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'sdt1-data'\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstartswith\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'.'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'sdt1-data'" ] } ], "source": 
[ "# this is an example list comprehension which reads in the all the files.\n", "# the f.startswith() part just gets rid of any junk files in that folder \n", "filenames=['sdt1-data/'+f for f in os.listdir('sdt1-data') if not f.startswith('.')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Stop and share
\n", " Did you find any interesting or important points about the data that you want to share with the class? Let's get everyone on the same page so we get similar results so if you find something important (good or bad) please take a moment and share!\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 3
\n", " Now that we have the analysis dataframe, let's perform our planned SDT analysis on a single subject. In the cell below, create a new dataframe by subselecting from the main frame with the data from a single subject.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 4
\n", " Referring back to the first notebook from the lab, compute the hits and false alarms for this subject for each stimulus present condition. One note about the false alarms: remember that false alarms are trials where the stimulus was absent but the subject said it was present. There is only one type of these false larms trials in the experiment. So when you compute the statistics, you really will create five numbers for this one participant (The hits for each of the stimulus present conditions, and an overall false alarm rate). What are these numbers for this subject? Remember also that if you take the `mean()` of a column of 0/1 values it will tell you the proportion of those trials which are 1.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 5
\n", " Ok now compute the d' and c values for this subject. Refer back to the previous notebook for reference if you need the equations and python code. What can you tell from looking at your computed values? If for some reason you can't compute these values in your experiment (e.g., if the subject has no hits or no false alarms) then try question 4 again with a different subject.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 6
\n", " Ok, so now you have performed the desired analysis for one subject. You have computed eight numbers for that subject described above. Now we want to repeat this for each subject. There are two ways to do this. One is using the pandas groupby() function. This is somewhat of the advanced mode version but if you do this you can probably perform the analysis in only 5-6 lines of code. An alternative approach is to generalize the code you just wrote to use a for loop that iterates over each subject in the data frame. Which ever way you choose you might find it helpful to refer to the Forty For Loops notebook I provided which shows some example templates for using for loops to analyze individual subject data in pandas dataframes. The end result is that you want a dataframe containing the d' and c' for each subject for each stimulus condition.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 7
\n", " Now that you have your summary statistics computed, our next step is to ask if the conditions are different on either the d' or c measure. For now concentrate on the easiest and hardest conditions to see if there is a effect in the most dissimilar conditions. Then compare the hardest and second hardest conditions. Conduct the appropriate t-test to tell if these conditions are different. The chapter reading from this week has all the code you need. However you might find it helpful to refer to the pingouin documentation for the t-test function.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 8 (BONUS)
\n", " This question is a bonus but asks you to verify the assumptions of the chosen t-test hold. Usually this involves looking to see that the distribution of scores you are analyzing are roughly \"normally\" distributed. The textbook showed a few example way to check including the histogram, the qqplot, and the Shapiro-Wilks test. Perform this test on your data. If the data seem very non-normal you might consider switching your t-test to the non-parametric alternative discussed in the chapter (the wilcoxon or Mahn-Whitney test).\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Enter your code here.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Question 9
\n", " Ok great job! You have your data, you have your t-test, you have everything you need to answer the following questions: 1. Is performance higher in the low number of dots or high number of dots condition? How do you know? How would you report the statistical evidence in favor of your conclusion in a paper? Is the criterion different between the conditions? How do you know? Overall was performance high or low in this task?\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "