I received three datasets (the data here are fictional to preserve confidentiality, but the point is the same): In 1998, a group of companies sold 1,000 units of Product A, 500 of B, 200 of C, etc., for a total of 2,000 units across all 8 products in the set. In 1999, the numbers were 900 for A, 400 for B, 300 for C, etc., for a total of 1,800 units across the same 8 products.

Also, in 1999, a sample of 300 employees from the different companies was asked whether they had sold each product and how much they had sold. For Product A, 80% said they had sold at least one unit, and among those 240 employees, the mean number of units sold was 20. For B, 90% said they had sold at least one unit, and the mean number sold among those 270 employees was 10. For C, 100% had sold a mean of 5, etc., for a total of 100 (mean) for all 8 products.

I do not know how many employees answered the original questions in 1998 and 1999. I do not have the specific data for each employee, only what I have provided above. So basically, this is what I have:

             1998    1999    1999 sample of 300
                             Mean sold (those who sold)    % who sold
Product A    1000     900    20                            80%
Product B     500     400    10                            90%
Product C     200     300     5                            100%
...
Total        2000    1800    100

The main question is whether the sample of 300 employees in 1999 is an unbiased representation of the larger population, as depicted in 1998 and 1999. I was thinking of the best way to analyze these data. The first thing that came to mind was Spearman’s Rho. I was also wondering whether I should treat each number (e.g., 1000, 500, 200 etc. in 1998) as a PERCENTAGE of the TOTAL number for that year [2,000] so I would get, e.g., 50%, 25%, 10% etc. and then compare the percentages for each year to see if the patterns were the same. But if I do that, do I use Chi-Square/Goodness of Fit? The problem is that it’s not as if each employee could choose only one product. Plus, I am looking at rates sold, not simply whether employees said “yes I sold one.” Thus, I don’t think the Chi-Square is appropriate (but I could be wrong).
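To make the arithmetic behind the two ideas above concrete, here is a rough Python sketch. It uses only the three products quoted in the question (the other five are not given), and it assumes the units implied by the sample can be approximated as 300 × (share who sold) × (mean among those who sold). The function names are made up for illustration, and the rank correlation is computed by hand with the no-ties formula. It shows how the numbers would be set up, not which test is appropriate.

```python
# Figures quoted in the question (fictional); only products A, B, C are given.
units_1998 = {"A": 1000, "B": 500, "C": 200}
units_1999 = {"A": 900, "B": 400, "C": 300}
total_1998, total_1999 = 2000, 1800  # stated totals over all 8 products

# 1999 sample of 300 employees:
# (share who sold at least one unit, mean units among those who sold)
sample_n = 300
sample = {"A": (0.80, 20), "B": (0.90, 10), "C": (1.00, 5)}


def shares(units, total):
    """Each product's units as a fraction of the year's total."""
    return {p: u / total for p, u in units.items()}


def implied_sample_units(sample, n):
    """Assumption: units implied by the sample = n * share who sold * mean among sellers."""
    return {p: n * frac * mean for p, (frac, mean) in sample.items()}


def ranks(values):
    """Rank 1 = smallest value; assumes there are no ties."""
    order = sorted(values, key=values.get)
    return {p: i + 1 for i, p in enumerate(order)}


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((rx[p] - ry[p]) ** 2 for p in x)
    return 1 - 6 * d2 / (n * (n * n - 1))


shares_1998 = shares(units_1998, total_1998)
shares_1999 = shares(units_1999, total_1999)
sample_units = implied_sample_units(sample, sample_n)
rho = spearman_rho(units_1999, sample_units)

for p in units_1998:
    print(p, round(shares_1998[p], 3), round(shares_1999[p], 3), round(sample_units[p]))
print("Spearman's rho (1999 totals vs. sample-implied units):", rho)
```

With only three products the rank correlation carries almost no information; with all 8 products the same setup would apply, but it would still only compare the ordering of products, not the sizes of the differences.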

It really is a simple question but I want to make sure I do the right test, based on the limited data I have. I put 1 credit as what I was willing to pay, which I think is fair considering that it should take only a minute to determine the best approach.

Hello,

I would perform a t-test for each product. Take the mean of Product A in 1998 (1000), subtract the mean of A in 1999 (900), and divide by …
