{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Ultimate (Concise) Guide to t-test type things in Python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"This chapter authored by [Todd M. Gureckis](http://gureckislab.org/~gureckis) and is released under the [license for this book](../../LICENSE).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So you have a dataset with two groups in it. Might be and \"experimental\" group and a \"control\" group, or \"treatment 1\" versus \"treatment 2\" and you want to know if they are different. First you have to define different. Obviously if this is real data unlike that both groups have the exact same values in them. Often you are interested in if the mean of the two groups is different. \n",
"\n",
"The statistical test most often used to do this is a \"Students t-test\". This notebook does not bother too much with explaining the theory behind t-tests, particularly in the traditional form. Instead this is a guide to like \"how do I do this in python?\". Other chapters of this course/book go into more of the theory. It starts off with the most common situations for standard \"null hypothesis significance testing\" t-tests. It ends with an example of an approach to replace the student t-test with the BEST test (Bayesian Estimation Supercedes the T-test) which is a fully Bayesian approach to comparing two groups published by Kruschke (2013). It mostly shows the code and libraries you need, how you typically report in a paper, and also checks to make sure you ran the right test."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import some libraries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we need to load the relevant libraries. These libraries don't _all_ need to be loaded depending on what you are doing but this will generally work. See the end of this notebook for information about the versions of the packages used here."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# basic datascience/data manipulation libraries\n",
"import numpy as np\n",
"import pandas as pd\n",
"import numpy.random as npr\n",
"import scipy.stats as stats\n",
"\n",
"# graphs\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# formulat interface to statsmodels (standard linear models)\n",
"import statsmodels.formula.api as smf\n",
"\n",
"# easy-to-use traditional psychological stats (t-test, anova)\n",
"import pingouin as pg\n",
"\n",
"# hate these things\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define some data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For this example we'll consider several datasets."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" group | \n",
" score | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0 | \n",
" 101 | \n",
"
\n",
" \n",
" 1 | \n",
" 0 | \n",
" 100 | \n",
"
\n",
" \n",
" 2 | \n",
" 0 | \n",
" 102 | \n",
"
\n",
" \n",
" 3 | \n",
" 0 | \n",
" 104 | \n",
"
\n",
" \n",
" 4 | \n",
" 0 | \n",
" 102 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" group score\n",
"0 0 101\n",
"1 0 100\n",
"2 0 102\n",
"3 0 104\n",
"4 0 102"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"drug = np.array([101,100,102,104,102,97,105,105,98,101,100,123,105,103,100,95,102,106,\n",
" 109,102,82,102,100,102,102,101,102,102,103,103,97,97,103,101,97,104,\n",
" 96,103,124,101,101,100,101,101,104,100,101])\n",
"placebo = np.array([99,101,100,101,102,100,97,101,104,101,102,102,100,105,88,101,100,\n",
" 104,100,100,100,101,102,103,97,101,101,100,101,99,101,100,100,\n",
" 101,100,99,101,100,102,99,100,99])\n",
"\n",
"\n",
"# packing the data into a tidy dataframe can be nice\n",
"exp_df = pd.DataFrame(dict(group=[0]*len(drug)+[1]*len(placebo), score=np.r_[drug,placebo]))\n",
"\n",
"exp_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Two-group independent samples t-test (Student test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Is the mean of the drug group different than the mean of the placebo group?"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean of drug group: 101.91489361702128\n",
"Mean of placebo group: 100.35714285714286\n",
"The difference in means is: 1.5577507598784166\n"
]
}
],
"source": [
"print(\"Mean of drug group:\", drug.mean())\n",
"print(\"Mean of placebo group:\", placebo.mean())\n",
"print(\"The difference in means is: \", drug.mean()-placebo.mean())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The look close but not identical. However, \"look\" isn't enough.\n",
"\n",
"Lets begin with a two-sample, independent samples t-test. We will assume that both groups have equal variance here."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" T | \n",
" dof | \n",
" tail | \n",
" p-val | \n",
" CI95% | \n",
" cohen-d | \n",
" BF10 | \n",
" power | \n",
"
\n",
" \n",
" \n",
" \n",
" T-test | \n",
" 1.559 | \n",
" 87 | \n",
" two-sided | \n",
" 0.122699 | \n",
" [-0.43, 3.54] | \n",
" 0.331 | \n",
" 0.642 | \n",
" 0.338 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" T dof tail p-val CI95% cohen-d BF10 power\n",
"T-test 1.559 87 two-sided 0.122699 [-0.43, 3.54] 0.331 0.642 0.338"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.ttest(x=drug, y=placebo,paired=False, tail='two-sided', correction=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We specified group `x` as `drug` and `y` as `placebo` (arbitrarily, you could flip that). We used 'two-sided' which is the traditionally more conservative test which you use unless you have a strong a-priori belief one group is going to have a higher mean value. We did not apply a correction known as the Welch-Satterthwaite correction for unequal variances. We will try that later."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results show that the t-value for the mean difference is 1.599. The test has 87 degrees of freedom. The p-value is 0.122699 which is greater than the traditional \"alpha\" cut off at p=0.05. Therefore this test is not significant. The 95% confidence interval for the differences between the means is -0.43 on the low end to 3.54 with (1.5577 the center). The effect size (Cohen's d) is 0.331. The Bayes Factor in favor of the alternative hypothesis (that the means are difference) is lower than one (0.642). The post-hoc power of the test is 0.338. \n",
"\n",
"All of this is consistent with there being basically no differences between these two groups."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"We conducted an independent sample t-test comparing the mean score in the drug and placebo group which failed to detect a significant difference ($t(87)=1.56, p=0.12, d=0.331$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Two Group Independent Samples unequal variance (Welch test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The assumption in the previous example was that the standard deviation of the data in the two groups was identical. This assumption may be inappropriate. To do a test where this is relaxed is not longer technically a t-test but a Welch test. All we do is test correction to `True`:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" T | \n",
" dof | \n",
" tail | \n",
" p-val | \n",
" CI95% | \n",
" cohen-d | \n",
" BF10 | \n",
" power | \n",
"
\n",
" \n",
" \n",
" \n",
" T-test | \n",
" 1.622 | \n",
" 63.04 | \n",
" two-sided | \n",
" 0.109754 | \n",
" [-0.36, 3.48] | \n",
" 0.331 | \n",
" 0.701 | \n",
" 0.338 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" T dof tail p-val CI95% cohen-d BF10 \\\n",
"T-test 1.622 63.04 two-sided 0.109754 [-0.36, 3.48] 0.331 0.701 \n",
"\n",
" power \n",
"T-test 0.338 "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.ttest(x=drug, y=placebo,paired=False, tail='two-sided', correction=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"We conducted an independent sample Welch test comparing the mean score in the drug and placebo group which failed to detect a significant difference ($t(63.04)=1.622, p=0.11, d=0.331$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There can be some strong arguments that this should be the preferred indendent sample t-test because the assumption of equal variance is unlikely to be met."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Paired samples t-test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the previous example the two groups were necessarily measuring different individuals (placebo versus a drug study). However sometimes we measure the same person twice (pre-test, post-test) for instance. In that case we use the \"paired\" samples t-test because there is a natural link between individual numbers in both groups (usually a person or some other unit of measurement).\n",
"\n",
"To perform this test we are going to create slightly different data set but you could copy-paste numbers into the `test_1` and `test_2` arrays as in the example above."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" group | \n",
" score | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0 | \n",
" 101.0 | \n",
"
\n",
" \n",
" 1 | \n",
" 0 | \n",
" 100.0 | \n",
"
\n",
" \n",
" 2 | \n",
" 0 | \n",
" 102.0 | \n",
"
\n",
" \n",
" 3 | \n",
" 0 | \n",
" 104.0 | \n",
"
\n",
" \n",
" 4 | \n",
" 0 | \n",
" 102.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" group score\n",
"0 0 101.0\n",
"1 0 100.0\n",
"2 0 102.0\n",
"3 0 104.0\n",
"4 0 102.0"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test_1 = np.array([101,100,102,104,102,97,105,105,98,101,100,123,105,103,100,95,102,106,\n",
" 109,102,82,102,100,102,102,101,102,102,103,103,97,97,103,101,97,104,\n",
" 96,103,124,101,101,100,101,101,104,100,101])\n",
"test_2 = test_1 + npr.normal(loc=2, scale=10, size=len(test_1))\n",
"\n",
"\n",
"# packing the data into a tidy dataframe can be nice\n",
"exp_df = pd.DataFrame(dict(group=[0]*len(test_1)+[1]*len(test_2), score=np.r_[test_1, test_2]))\n",
"\n",
"exp_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To perform this test we set `paired` to `True`."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" T | \n",
" dof | \n",
" tail | \n",
" p-val | \n",
" CI95% | \n",
" cohen-d | \n",
" BF10 | \n",
" power | \n",
"
\n",
" \n",
" \n",
" \n",
" T-test | \n",
" -3.3 | \n",
" 46 | \n",
" two-sided | \n",
" 0.001875 | \n",
" [-8.15, -1.97] | \n",
" 0.547 | \n",
" 16.595 | \n",
" 0.956 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" T dof tail p-val CI95% cohen-d BF10 power\n",
"T-test -3.3 46 two-sided 0.001875 [-8.15, -1.97] 0.547 16.595 0.956"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.ttest(x=test_1, y=test_2,paired=True, tail='two-sided')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"We conducted a paired t-test comparing the scores in the first and second test which was significant ($t(46)=3.3, p<.002, d=0.547$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## One sample t-test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What if you only have one sample and you just want to test if it is different than some specific \"null\" hypothesis. For instance many of the scores seem to be about 100 in the `drug` group from the dataset above so we could test the hypothesis \"is the mean of that group significantly different than 100?\". A one-sample t-test is appropriate for this:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" T | \n",
" dof | \n",
" tail | \n",
" p-val | \n",
" CI95% | \n",
" cohen-d | \n",
" BF10 | \n",
" power | \n",
"
\n",
" \n",
" \n",
" \n",
" T-test | \n",
" 2.18 | \n",
" 46 | \n",
" two-sided | \n",
" 0.034387 | \n",
" [0.15, 3.68] | \n",
" 0.318 | \n",
" 1.36 | \n",
" 0.569 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" T dof tail p-val CI95% cohen-d BF10 power\n",
"T-test 2.18 46 two-sided 0.034387 [0.15, 3.68] 0.318 1.36 0.569"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.ttest(x=drug, y=100, paired=True, tail='two-sided')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"We conducted a one-sample t-test comparing the scores in the drug group against the null hypothesis of a mean of 100 which did not reach significance ($t(46)=2.18, p=.034, d=0.318$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Non-normal data (Wilcoxon or Mann-Whitney test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All the tests above assume the data is normally distributed. But we can't just assume that. How do you know for sure? Well it is a bit of a longer topic but there are tests for `non-normality` you can uses which are summarized [here](http://gureckislab.org/courses/fall20/labincp/chapters/10/00-ttest.html#checking-the-normality-of-a-sample). \n",
"\n",
"In cases where there is strong suspicion your data are not normally distributed in one or the other group you can use a non-parametric test known as the Wilcoxon (paired samples/within subject) or Mann-Whitney (independent samples) test. It also exists in `pingouin`:"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" U-val | \n",
" tail | \n",
" p-val | \n",
" RBC | \n",
" CLES | \n",
"
\n",
" \n",
" \n",
" \n",
" MWU | \n",
" 1267.5 | \n",
" two-sided | \n",
" 0.019609 | \n",
" -0.284 | \n",
" 0.578 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" U-val tail p-val RBC CLES\n",
"MWU 1267.5 two-sided 0.019609 -0.284 0.578"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.mwu(x=drug, y=placebo, tail='two-sided')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"Since our data were non-normal, we conducted a two-side Mann-Whitney test comparing the scores in the drug group against the placebo group which was significant ($U=1267.5, p<.02$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The one paired samples version of this is the Wilcoxon test:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" W-val | \n",
" tail | \n",
" p-val | \n",
" RBC | \n",
" CLES | \n",
"
\n",
" \n",
" \n",
" \n",
" Wilcoxon | \n",
" 305.0 | \n",
" two-sided | \n",
" 0.006229 | \n",
" -0.459 | \n",
" 0.638 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" W-val tail p-val RBC CLES\n",
"Wilcoxon 305.0 two-sided 0.006229 -0.459 0.638"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pg.wilcoxon(x=test_1, y=test_2, tail='two-sided')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Example way to report this in a paper\n",
":class: tip\n",
"Since our data were non-normal, we conducted a two-sided Wilcoxon test comparing the scores at test time 1 with the scores at test time 2 which was significant ($W=305.0, p<.007$).\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Be Bayesian! The BEST test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The examples above all show how to tell if two groups are different using traditional \"null hypothesis significance testing.\" However there are several ways in which this is non ideal. Instead, the best way to evaluate differences between groups it do dump the hypothesis testing framework all together and use Bayesian estimation of the differences between groups. Kruschke (2013) describes a simplified/standardize work flow called the BEST test (Bayesian ESTimation) although I've also heard it call the Bayesian Estimation Superceeds the T-test!.\n",
"\n",
"Luckily there is a simplified python package that implements this test for you! https://best.readthedocs.io/en/latest/"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"import best"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Auto-assigning NUTS sampler...\n",
"Initializing NUTS using jitter+adapt_diag...\n"
]
},
{
"ename": "TypeError",
"evalue": "function() got an unexpected keyword argument 'nuts_kwargs'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mbest_out\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mbest\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0manalyze_two\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdrug\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mplacebo\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/best/model.py\u001b[0m in \u001b[0;36manalyze_two\u001b[0;34m(group1_data, group2_data, n_samples, **kwargs)\u001b[0m\n\u001b[1;32m 438\u001b[0m \"\"\"\n\u001b[1;32m 439\u001b[0m \u001b[0mmodel\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mBestModelTwo\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgroup1_data\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgroup2_data\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 440\u001b[0;31m \u001b[0mtrace\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msample\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn_samples\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 441\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mBestResultsTwo\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmodel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtrace\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 442\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/best/model.py\u001b[0m in \u001b[0;36msample\u001b[0;34m(self, n_samples, **kwargs)\u001b[0m\n\u001b[1;32m 71\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmax_rounds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 72\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmodel\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 73\u001b[0;31m \u001b[0mtrace\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpm\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msample\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn_samples\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 74\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 75\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mtrace\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreport\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mok\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/sampling.py\u001b[0m in \u001b[0;36msample\u001b[0;34m(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)\u001b[0m\n\u001b[1;32m 501\u001b[0m \u001b[0mprogressbar\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mprogressbar\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 502\u001b[0m \u001b[0mjitter_max_retries\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mjitter_max_retries\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 503\u001b[0;31m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 504\u001b[0m )\n\u001b[1;32m 505\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mstart\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/sampling.py\u001b[0m in \u001b[0;36minit_nuts\u001b[0;34m(init, chains, n_init, model, random_seed, progressbar, jitter_max_retries, **kwargs)\u001b[0m\n\u001b[1;32m 2184\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"Unknown initializer: {init}.\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2185\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2186\u001b[0;31m \u001b[0mstep\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpm\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mNUTS\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpotential\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpotential\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmodel\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmodel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2187\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2188\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mstart\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstep\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/step_methods/hmc/nuts.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, vars, max_treedepth, early_max_treedepth, **kwargs)\u001b[0m\n\u001b[1;32m 166\u001b[0m \u001b[0;31m`\u001b[0m\u001b[0mpm\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msample\u001b[0m\u001b[0;31m`\u001b[0m \u001b[0mto\u001b[0m \u001b[0mthe\u001b[0m \u001b[0mdesired\u001b[0m \u001b[0mnumber\u001b[0m \u001b[0mof\u001b[0m \u001b[0mtuning\u001b[0m \u001b[0msteps\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 167\u001b[0m \"\"\"\n\u001b[0;32m--> 168\u001b[0;31m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__init__\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvars\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 169\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 170\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmax_treedepth\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmax_treedepth\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/step_methods/hmc/base_hmc.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, vars, scaling, step_scale, is_cov, model, blocked, potential, dtype, Emax, target_accept, gamma, k, t0, adapt_step_size, step_rand, **theano_kwargs)\u001b[0m\n\u001b[1;32m 86\u001b[0m \u001b[0mvars\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minputvars\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvars\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 87\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 88\u001b[0;31m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__init__\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvars\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mblocked\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mblocked\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmodel\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmodel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mtheano_kwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 89\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 90\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madapt_step_size\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0madapt_step_size\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/step_methods/arraystep.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, vars, model, blocked, dtype, logp_dlogp_func, **theano_kwargs)\u001b[0m\n\u001b[1;32m 252\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 253\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mlogp_dlogp_func\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 254\u001b[0;31m \u001b[0mfunc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlogp_dlogp_function\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvars\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mtheano_kwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 255\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 256\u001b[0m \u001b[0mfunc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlogp_dlogp_func\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/model.py\u001b[0m in \u001b[0;36mlogp_dlogp_function\u001b[0;34m(self, grad_vars, tempered, **kwargs)\u001b[0m\n\u001b[1;32m 1001\u001b[0m \u001b[0mvarnames\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mvar\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mvar\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mgrad_vars\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1002\u001b[0m \u001b[0mextra_vars\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mvar\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mvar\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfree_RVs\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mvar\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mvarnames\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1003\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mValueGradFunction\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcosts\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgrad_vars\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mextra_vars\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1004\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1005\u001b[0m \u001b[0;34m@\u001b[0m\u001b[0mproperty\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/science/lib/python3.7/site-packages/pymc3/model.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, costs, grad_vars, extra_vars, dtype, casting, compute_grads, **kwargs)\u001b[0m\n\u001b[1;32m 698\u001b[0m \u001b[0minputs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_vars_joined\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 699\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 700\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_theano_function\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtheano\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfunction\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgivens\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mgivens\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 701\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 702\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mset_weights\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalues\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: function() got an unexpected keyword argument 'nuts_kwargs'"
]
}
],
"source": [
"best_out = best.analyze_two(drug, placebo)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"best.plot_all(best_out)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Watermark"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code here successfully ran with these versions of the packages:"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The watermark extension is already loaded. To reload it, use:\n",
" %reload_ext watermark\n",
"Last updated: Wed Mar 03 2021\n",
"\n",
"Python implementation: CPython\n",
"Python version : 3.7.3\n",
"IPython version : 6.2.1\n",
"\n",
"statsmodels: 0.12.1\n",
"seaborn : 0.11.0\n",
"scipy : 1.5.4\n",
"pingouin : 0.2.9\n",
"json : 2.0.9\n",
"best : 2.0.0.post0\n",
"pandas : 1.1.4\n",
"numpy : 1.19.4\n",
"autopep8 : 1.4.4\n",
"matplotlib : 3.3.3\n",
"\n",
"Watermark: 2.2.0\n",
"\n"
]
}
],
"source": [
"%load_ext watermark\n",
"%watermark -n -u -v -iv -w"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}