One-sample hypothesis tests

Null hypothesis H0: already known, established, default, status quo, old, pre-existing, current practice, well-known, working assumption, nothing new, boring. The (generic) parameter φ equals some number a.
Alternative hypothesis HA: new, exciting, hoped/wished, changed, different, research, challenger. Either the parameter φ<a, or φ>a, or φ≠a.
Test whether the sample (i.e. its statistic and its size, n) provides enough evidence to overthrow ("warrant rejection of") the null hypothesis. Is the sample statistic extreme enough?
Either "reject" or "fail to reject" the null hypothesis; never "accept" it. Rejecting it ≡ "supporting" the alternative.
The alternative hypothesis is neither rejected nor accepted.
Nothing is ever "proven".

8-3. T-Test for mean μ. Uses μ, s, x̄, and n. Test statistic is t.
Data is normal or n≥30.

8-3 (Obsolete). Z-Test for mean μ if σ known (rare) [or with large n and use s for σ]. Uses μ, σ, x̄, and n. Test statistic is z.
Data is normal or n≥30.

8-2. 1-PropZTest for proportion p. Uses #yeses or p̂, p, and n. Test statistic is z.
Binary nominal data. Normal distribution is approximating a Binomial distribution.

8-4. Χ2-test for standard deviation σ. Uses σ, s, and n. Test statistic is Χ2.
Population must be normal.

The test statistic is a measure of discrepancy between a sample statistic and the H0 claimed value of the population parameter.
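As a rough illustration of that "measure of discrepancy", the sketch below computes the four test statistics listed above from summary values, using the standard textbook formulas. The function names and the sample numbers are made up for this example.

```python
import math

def z_mean(xbar, mu0, sigma, n):
    """z for a mean when the population SD sigma is known."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

def t_mean(xbar, mu0, s, n):
    """t for a mean using the sample SD s; returns (t, df)."""
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1

def z_prop(yeses, n, p0):
    """z for a proportion; the standard error uses the H0 value p0."""
    p_hat = yeses / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def chi2_sd(s, sigma0, n):
    """Chi-square for a standard deviation; returns (chi2, df)."""
    return (n - 1) * s**2 / sigma0**2, n - 1

# Illustrative inputs only (not data from any real study).
print("z (mean):      ", z_mean(103, 100, 10, 30))
print("t (mean), df:  ", t_mean(103, 100, 10, 30))
print("z (proportion):", z_prop(60, 100, 0.5))
print("chi2 (SD), df: ", chi2_sd(12, 10, 30))
```

In every case the numerator (or ratio) compares the sample statistic with the H0 value, and the denominator scales it by how much the statistic is expected to vary for a sample of size n.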

Given null hypothesis H0: parameter = a
Choose one:
HA: parameter (stat) < a "HA < H0" Left-tailed
HA: parameter (stat) > a "HA > H0" Right-tailed
HA: parameter (stat) ≠ a "HA ≠ H0" Two-tailed

μ:
σ: OR (if σ unknown) s: OR both if doing Χ2-test
x̄:
n:
     OR     8-3.      OR     8-4.


OR    8-2. Proportion test: Uses #yeses or p̂, p, n
#Yeses, x: OR p̂:     and p:   
np: nq: Both should be ≥5    Standard error=√(pq)/√n:

Power: specific value of p:   α:    =    p̂:
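A minimal sketch of filling in the proportion-test fields: it computes p̂, np, nq, the "both ≥5" check, the standard error √(pq)/√n, and z. The function name, the dictionary keys, and the input numbers are assumptions chosen for illustration.

```python
import math

def prop_worksheet(yeses, n, p0):
    """Fill in the proportion-test fields (function name is illustrative)."""
    q0 = 1 - p0
    p_hat = yeses / n
    np_, nq_ = n * p0, n * q0
    se = math.sqrt(p0 * q0) / math.sqrt(n)  # standard error under H0
    return {
        "p_hat": p_hat,
        "np": np_,
        "nq": nq_,
        "normal_ok": np_ >= 5 and nq_ >= 5,  # normal approximation check
        "std_err": se,
        "z": (p_hat - p0) / se,
    }

# Illustrative inputs: 60 yeses out of 100, H0: p = 0.5.
print(prop_worksheet(60, 100, 0.5))
# A small sample where the normal approximation check fails (np = 2).
print(prop_worksheet(3, 20, 0.1)["normal_ok"])
```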


Result:
     z:                                  Standard error=σ/√n:
   Critical value: One-tailed: α=0.05:±1.645    α=0.01:±2.326    Two-tailed: α=0.05:±1.96    α=0.01:±2.576
   If Left-tailed and (-)z≤-CritValue then Reject H0 at that α level.
   If Right-tailed and z≥CritValue then Reject H0 at that α level.
   If Two-tailed and |z|≥CritValue then Reject H0 at that α level.
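The z critical values can be recomputed from the standard normal distribution rather than read from a table. This sketch uses Python's standard-library `statistics.NormalDist`; the helper names `z_critical` and `reject_h0` are invented for this example.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=False):
    """Positive critical z: the tail area is alpha, or alpha/2 per tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def reject_h0(z, alpha, tail):
    """Apply the critical-value decision rule for a z statistic."""
    if tail == "left":
        return z <= -z_critical(alpha)
    if tail == "right":
        return z >= z_critical(alpha)
    return abs(z) >= z_critical(alpha, two_tailed=True)

for alpha in (0.05, 0.01):
    print(alpha,
          "one-tailed:", round(z_critical(alpha), 3),
          "two-tailed:", round(z_critical(alpha, two_tailed=True), 3))

# Illustrative statistic: z = 2.0 in a right-tailed test.
print(reject_h0(2.0, 0.05, "right"), reject_h0(2.0, 0.01, "right"))
```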


OR     t: df:       StdErr=s/√n:          
   Critical value: α=0.05: α=0.01:
   If Left-tailed and (-)t≤-CritValue then Reject H0 at that α level.
   If Right-tailed and t≥CritValue then Reject H0 at that α level.
   If Two-tailed and |t|≥CritValue then Reject H0 at that α level.


OR     Χ2: df:   
   Critical value: α=0.05: α=0.01:
   If Left-tailed and Χ2≤CritValue then Reject H0 at that α level.
   If Right-tailed and Χ2≥CritValue then Reject H0 at that α level.
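A sketch of the Χ2 test for σ using only the standard library. The chi-square CDF is evaluated with the usual power series for the regularized lower incomplete gamma function, which is a stand-in for a calculator's built-in CDF; the function names and inputs are assumptions for illustration, and only the left- and right-tailed cases are handled.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), via the series for
    the regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2, x / 2
    term = total = 1.0
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= half_x / (a + k)  # next series term
        total += term
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a + 1)
    return math.exp(log_prefactor) * total

def chi2_sd_test(s, sigma0, n, tail="right"):
    """Return (chi2, df, p_value) for H0: sigma = sigma0."""
    if tail not in ("left", "right"):
        raise ValueError("tail must be 'left' or 'right'")
    df = n - 1
    chi2 = df * s**2 / sigma0**2
    cdf = chi2_cdf(chi2, df)
    return chi2, df, (cdf if tail == "left" else 1 - cdf)

# Illustrative inputs: s = 12 from n = 30, H0: sigma = 10, HA: sigma > 10.
chi2, df, p = chi2_sd_test(12, 10, 30, tail="right")
print(f"chi2={chi2:.2f} df={df} p={p:.4f}")
```

Comparing the p-value with α gives the same decision as comparing Χ2 with the critical value for that df.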

*** p-value (CDF(z) or TCDF(t,df) or Chisqr_CDF(Χ2,df)):
Chance that the test statistic would be at least as extreme as the one observed if H0 were true.
"If the p is low, the null must go."
Typically the level of significance, α (the area of the critical/rejection region), is chosen to be .05 or .01. If p is less than α, reject H0; if p is not less than α, don't reject H0 ("fail to reject").
Probability (area) in a tail (or two) of the test statistic's PDF curve.
If p is high (bigger than α), can't reject H0.
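Selecting the Two-tailed case doubles the p-value relative to the One-tailed case in the direction of the sample statistic (this holds for the symmetric z and t distributions).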
Mean and Proportion One-tailed tests are "symmetric". The SD test is not, because the Χ2 distribution is skewed.
Tip: if the p-value is like .9, check that you selected the appropriate "tail" above before failing to reject.
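The tail choices can be sketched for a z statistic with the standard-library normal CDF. The function name `z_p_value` and the value z = 1.8 are assumptions picked for illustration; the same tail logic applies to t with its own CDF.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value of a z statistic for a 'left', 'right' or 'two' tailed test."""
    cdf = NormalDist().cdf(z)
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'left', 'right' or 'two'")

z = 1.8  # illustrative test statistic
for tail in ("right", "two", "left"):
    p = z_p_value(z, tail)
    verdict = "reject H0" if p < 0.05 else "fail to reject H0"
    print(f"{tail:>5}-tailed: p={p:.4f} -> {verdict} at alpha=0.05")
```

With a positive z, the two-tailed p-value is double the right-tailed one, and the left-tailed p-value is close to 1, which is the symptom of a wrongly selected tail described in the tip.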

Exs.
T-test: μ=100, s=10, n=30. Try x̄= 102, 103, 104, 105. Ha>H0
T-test: μ=100, s=10, n=30. Try x̄= 102, 103, 104, 105. Ha≠H0

Effect of s:
T-test: μ=100, s=5, n=30. Try x̄= 101, 102, 103. Ha>H0
T-test: μ=100, s=5, n=30. Try x̄= 101, 102, 103. Ha≠H0

Effect of n:
T-test: μ=100, s=10, n=100. Try x̄= 101, 102, 103. Ha>H0
T-test: μ=100, s=10, n=100. Try x̄= 101, 102, 103. Ha≠H0
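The exercises above can be run with a small standard-library sketch. The Student-t CDF is obtained here by Simpson's rule on the t density, a numerical stand-in for a calculator's TCDF; all function names are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Student-t CDF by Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_test(xbar, mu0, s, n, tail):
    """Return (t, df, p_value); tail is 'left', 'right' or 'two'."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    cdf = t_cdf(t, df)
    if tail == "left":
        p = cdf
    elif tail == "right":
        p = 1 - cdf
    else:
        p = 2 * min(cdf, 1 - cdf)
    return t, df, p

# (s, n, sample means to try), all with H0: mu = 100.
cases = [
    (10, 30, (102, 103, 104, 105)),
    (5, 30, (101, 102, 103)),
    (10, 100, (101, 102, 103)),
]
for s, n, xbars in cases:
    for xbar in xbars:
        for tail in ("right", "two"):
            t, df, p = t_test(xbar, 100, s, n, tail)
            print(f"s={s} n={n} xbar={xbar} {tail}: t={t:.3f} df={df} p={p:.4f}")
```

Halving s or increasing n shrinks the standard error s/√n, so the same gap between x̄ and μ gives a larger t and a smaller p-value.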


NB. p-hacking: there are great pressures (professional, monetary, publication bias, ideological) to obtain a positive result.
This tempts researchers into cheating and lying by:
stopping data collection as soon as p≤.05
discarding data that prevents p≤.05
repeating the experiment until p≤.05 is obtained
testing for different effects until one with p≤.05 is found
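To see why repeating an experiment until p≤.05 is cheating, the simulation sketch below draws samples for which H0 is actually true and counts how often at least one of several attempts comes out "significant". It assumes a two-tailed Z-test with known σ, a population that is N(100, 10), and a fixed random seed; all names and settings are illustrative.

```python
import math
import random
from statistics import NormalDist, fmean

def z_test_p(sample, mu0, sigma):
    """Two-tailed p-value of a Z-test with known sigma."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def hacked_success_rate(max_tries, n=30, researchers=2000, seed=1):
    """Fraction of researchers who get p <= .05 within max_tries attempts,
    even though H0 (mu = 100) is true in every experiment."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(researchers):
        for _attempt in range(max_tries):
            sample = [rng.gauss(100, 10) for _ in range(n)]
            if z_test_p(sample, 100, 10) <= 0.05:
                wins += 1
                break  # stop as soon as a "positive" result appears
    return wins / researchers

for k in (1, 5, 14):
    # Second number: theoretical rate 1 - 0.95**k for independent attempts.
    print(k, hacked_success_rate(k), round(1 - 0.95**k, 3))
```

With a single honest attempt the false-positive rate stays near α, but it climbs quickly as the number of allowed attempts grows.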
NB. Also possible to have:
H0: φ≤a and HA: φ>a
H0: φ≥a and HA: φ<a
NB. With a very large sample, a very small difference between x̄ and the claimed μ can be "significant".
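A short sketch of that effect: the gap between x̄ and μ is held at 0.5 while n grows. For simplicity it uses the normal approximation with s in place of σ (only rough for small n); the function name and the numbers are assumptions for illustration.

```python
import math
from statistics import NormalDist

def two_tailed_p(xbar, mu0, s, n):
    """Return (z, two-tailed p) using the normal approximation with s for sigma."""
    z = (xbar - mu0) / (s / math.sqrt(n))
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny difference (100.5 vs 100, s = 10) at increasing sample sizes.
for n in (30, 100, 1600, 10000):
    z, p = two_tailed_p(100.5, 100, 10, n)
    print(f"n={n:>5}: z={z:.2f} p={p:.4g}")
```

The difference of half a unit never changes, yet it moves from unremarkable to "significant" purely because the standard error shrinks with √n.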