do I need an actual data set
2 posters
I am trying to use statistics to solve a problem and believe that my next step is bootstrap hypothesis testing, which I had not heard of until this morning. Statistics101 looks like it might be the right tool for this, but I want to clarify something first.
I have a frequency distribution showing the birth months of 3196 males from one country who did a thing. Call this F1.
I also have census data from their country showing the number of folk born each month, and I have the ratio of male to female births. This lets me calculate a frequency distribution of male births by month over the year. Call this F2.
I want to compare F1 with F2 to test a theory that doing this thing is influenced by birthdate. Basically, the theory says that folk who do this thing tend to be born in a particular part of the year.
That's where (I think) bootstrapping comes in. As I understand it, I want to take 1000 or so random samples of 3196 from the census data and compute their statistics, to determine whether or not my original sample (the 3196 folk who did a thing) could just be a random sample from the general population.
If that is correct, do I have to recreate the actual data set for the 2 million folk born each year, or can I just enter the frequency distribution (167488 folk born in January, 153146 born in February, 165951 born in March, etc.) and work from that?
many thanks,
Kaarlo Tuomi
Kaarlo Tuomi- Posts : 1
Join date : 2021-11-23
Re: do I need an actual data set
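Drawing from the frequency distribution should be sufficient. Since birth month is the only attribute you are using, sampling months in proportion to the monthly counts should give the same result as sampling individuals from the full data set. But I'm not an expert statistician, so don't take my opinion as gospel.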
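For what it's worth, here is a rough sketch of that idea in Python rather than Statistics101, using only the standard library. It is an illustration under assumptions, not a definitive method. The function names, the chi-square-style distance used as the test statistic, and the seed are my own choices. Only the January to March counts come from the post above. The other nine population counts and the whole F1 sample are made-up placeholders that you would replace with your real figures. Months are drawn with replacement, which should be a close approximation when the sample (3196) is tiny compared with the population (about 2 million).

```python
import random
from collections import Counter


def chi_square_stat(observed, expected_props):
    """Distance between observed counts and the counts expected under expected_props."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))


def monte_carlo_p_value(sample_counts, population_counts, n_resamples=1000, seed=0):
    """Estimate how often a random sample from the population looks at least
    as unusual as the observed sample (hypothetical helper, not from the thread)."""
    rng = random.Random(seed)
    total = sum(population_counts)
    props = [c / total for c in population_counts]
    n = sum(sample_counts)
    months = list(range(len(population_counts)))

    observed_stat = chi_square_stat(sample_counts, props)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Draw n birth months in proportion to the population counts;
        # no need to rebuild the 2 million individual records.
        tally = Counter(rng.choices(months, weights=population_counts, k=n))
        simulated = [tally.get(m, 0) for m in months]
        if chi_square_stat(simulated, props) >= observed_stat:
            at_least_as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# F2: Jan-Mar are from the post; the remaining nine values are PLACEHOLDERS.
f2 = [167488, 153146, 165951] + [160000] * 9
# F1: entirely HYPOTHETICAL counts that sum to 3196, for illustration only.
f1 = [310, 290, 300, 280, 270, 260, 250, 255, 245, 250, 240, 246]

print("estimated p-value:", monte_carlo_p_value(f1, f2))
```

A small p-value would suggest the 3196 are unlikely to be a plain random sample from the population with respect to birth month. It would not by itself show which part of the year is over-represented, so a statistic aimed at the specific months the theory predicts might be a better fit.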