## Wednesday, May 6, 2020

### Accusation Against the Industry of Discriminating

**Question:** Discuss the report on the accusation that the industry is discriminating.

**Answer:**

#### Introduction

A research study was carried out to determine whether a specific industry in Singapore was discriminating against its female workers with regard to their salaries. The sample data came from an investigation of 50 randomly selected employees, recording each one's current monthly salary (S\$), highest level of education attained, age, and gender. The monthly salary (in Singapore dollars, SGD) is the response, or dependent, variable (Y); gender (0 = female, 1 = male), age, and level of education are the explanatory, or independent, variables (Xs). This report presents an analysis of the findings from the study of 50 employees and draws conclusions and recommendations in a business context.

#### Analysis and Conclusions

**Descriptive statistics.** The table below gives the descriptive statistics of the dependent variable in the case study:

| Monthly Salary (in S\$) | Value |
|---|---|
| Mean | 3546.52 |
| Standard Error | 330.2105 |
| Median | 2795 |
| Mode | 1900 |
| Standard Deviation | 2334.9406 |
| Sample Variance | 5451947.7241 |
| Kurtosis | -0.0058 |
| Skewness | 1.0128 |
| Coefficient of Variation | 0.6584 |
| Range | 7992 |
| Minimum | 1040 |
| Maximum | 9032 |
| Sum | 177326 |
| Count | 50 |

Consider the monthly salary of an employee. The average salary of the 50 sampled employees is S\$3,546.52, and the median shows that about half of them earn S\$2,795 or less per month. The mean lies well above the median and the skewness is 1.0128, so the distribution is right-skewed (positively skewed) rather than symmetric, and is therefore unlikely to be normal. The standard deviation of S\$2,334.9 shows that the absolute variability of the salaries around their mean is considerably high. The coefficient of variation of 0.658 expresses this in relative terms: the standard deviation is about 66% of the mean.
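The derived entries in the salary table can be cross-checked from the reported mean, standard deviation, sum, and count alone. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative, and small differences from the table come from rounding in the reported standard deviation.

```python
import math

# Values reported in the descriptive statistics table for monthly salary (S$).
n = 50
total = 177326
mean = 3546.52
std_dev = 2334.9406

# Mean recovered from the sum and count.
mean_from_sum = total / n

# Standard error of the mean: s / sqrt(n).
standard_error = std_dev / math.sqrt(n)

# Coefficient of variation: s / mean (relative variability).
coeff_variation = std_dev / mean

# Sample variance: s squared.
sample_variance = std_dev ** 2

print(f"mean from sum      : {mean_from_sum:.2f}")
print(f"standard error     : {standard_error:.4f}")
print(f"coeff. of variation: {coeff_variation:.4f}")
print(f"sample variance    : {sample_variance:.1f}")
```

Each printed value should agree with the corresponding table entry to within rounding.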
**Descriptive statistics of the independent variables: age, gender, and level of education.**

| Age (10 years) | Value |
|---|---|
| Mean | 3.924 |
| Standard Error | 0.1728 |
| Median | 3.8 |
| Mode | 3.8 |
| Standard Deviation | 1.2218 |
| Sample Variance | 1.4929 |
| Kurtosis | -1.0698 |
| Skewness | 0.3802 |
| Range | 4 |
| Minimum | 2.2 |
| Maximum | 6.2 |
| Sum | 196.2 |
| Count | 50 |

| Level of Education | Count |
|---|---|
| Below Secondary | 10 |
| Secondary | 5 |
| Post-Secondary | 7 |
| Diploma professional | 13 |
| University | 15 |

| Gender | Count |
|---|---|
| Females | 20 |
| Males | 30 |

The variable monthly salary is quantitative and numerical (discrete) in nature and is measured on a ratio scale, whereas the variable level of education is qualitative and categorical, measured on an ordinal scale.

The contingency tables below give the count and the percentage of female and male employees with a salary above or below S\$3,000.

| Contingency Table | Females | Males | Total |
|---|---|---|---|
| Salary above \$3000 | 4 | 18 | 22 |
| Salary below \$3000 | 16 | 12 | 28 |
| Total | 20 | 30 | 50 |

| Contingency Table | Females | Males | Total % share |
|---|---|---|---|
| Salary above \$3000 | 20% | 60% | 44% |
| Salary below \$3000 | 80% | 40% | 56% |
| Total % share | 40% | 60% | 100% |

From these tables, 20% of female workers earn a monthly salary of more than S\$3,000, compared with 60% of male workers. This suggests that salary level is associated with gender, although the percentages alone do not establish statistical significance; a formal test, such as a chi-square test of independence, is needed for that.

The table below gives the average salaries of male and female workers along with their standard deviations:

| Monthly Salary (in S\$) | Male | Female |
|---|---|---|
| Mean | 4279.3 | 2447.35 |
| Standard Deviation | 2421.467 | 1729.487 |

Based on these figures, a 90% interval (using a z* multiplier of 1.645) was derived for the monthly salary of a male worker as S\$(295.987, 8262.613). Because this interval is built from the standard deviation rather than the standard error of the mean, it describes the range expected to contain roughly 90% of individual male salaries (assuming approximate normality); it is not a confidence interval for the mean male salary. The interval is wide because of the high standard deviation in the monthly salaries of male workers.
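The percentages in the contingency table and the 90% interval for male salaries can be reproduced from the reported counts and summary statistics. The sketch below does that in plain Python. Two parts go beyond the original report and should be read as assumptions: the Pearson chi-square statistic (a common way to test whether salary band and gender are independent) and the interval for the mean male salary based on the standard error. The variable names are illustrative.

```python
import math

# Counts from the contingency table: rows are salary bands, columns are gender.
counts = {
    "above_3000": {"female": 4, "male": 18},
    "below_3000": {"female": 16, "male": 12},
}
genders = ("female", "male")
col_totals = {g: sum(row[g] for row in counts.values()) for g in genders}
row_totals = {band: sum(row.values()) for band, row in counts.items()}
n = sum(row_totals.values())

# Share of each gender earning above S$3,000 (column percentages).
share_above = {g: counts["above_3000"][g] / col_totals[g] for g in genders}

# Assumption (not in the report): Pearson chi-square statistic for independence.
chi_square = 0.0
for band, row in counts.items():
    for g in genders:
        expected = row_totals[band] * col_totals[g] / n
        chi_square += (row[g] - expected) ** 2 / expected
CHI2_CRITICAL_5PCT_1DF = 3.841  # standard table value for 1 degree of freedom

# Reported summary statistics for male workers (S$); 30 males in the sample.
male_mean, male_sd, n_male, z90 = 4279.3, 2421.467, 30, 1.645

# Interval as computed in the report: mean +/- z * standard deviation.
spread_interval = (male_mean - z90 * male_sd, male_mean + z90 * male_sd)

# Assumption (not in the report): interval for the mean, using the standard error.
se_male = male_sd / math.sqrt(n_male)
mean_ci = (male_mean - z90 * se_male, male_mean + z90 * se_male)

print("share above S$3,000:", share_above)
print("chi-square statistic:", round(chi_square, 3),
      "exceeds 5% critical value:", chi_square > CHI2_CRITICAL_5PCT_1DF)
print("interval from standard deviation:", spread_interval)
print("interval for the mean (standard error):", mean_ci)
```

The first interval matches the figures quoted in the report; the second is much narrower because it measures uncertainty about the mean rather than the spread of individual salaries.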
To test the claim that the mean monthly salary of workers in the industry is greater than S\$3,200, a right-tailed t-test is carried out at a significance level of $\alpha = 0.05$. The hypotheses are stated as: null hypothesis $H_0: \mu \le \mu_0$; alternative hypothesis $H_1: \mu > \mu_0$. Here, $\mu_0 = 3200$ is the hypothesized mean value of the variable monthly salary. To test the claim, the t-test statistic is used: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{3546.52 - 3200}{330.2105} \approx 1.049$. Degrees of freedom $= n - 1 = 49$. Therefore, the right-tailed p-value is 0.1497. Since the p-value $= 0.1497 > 0.05 = \alpha$, we fail to reject the null hypothesis in favour of the alternative hypothesis. Hence, at the 5% level there is insufficient statistical evidence to support the claim that the mean monthly salary of workers in the industry is greater than S\$3,200.

A multiple linear regression of the response variable (monthly salary) on the three explanatory variables was fitted. In its summary output, the coefficient estimate for the slope of the variable Gender is 1301.36, with a p-value of 0.033. Since this p-value is less than the significance level of 5%, i.e. 0.05, the result is statistically significant. Stating the test formally, the hypotheses are $H_0: \beta_{\text{Gender}} = 0$ and $H_1: \beta_{\text{Gender}} \ne 0$, and the test statistic is $t = b_{\text{Gender}} / SE(b_{\text{Gender}})$, where $b_{\text{Gender}}$ is the estimated coefficient and $SE(b_{\text{Gender}})$ its standard error. For a significance level of $\alpha = 0.05$, the null hypothesis is rejected if $|t|$ exceeds the two-tailed critical value $t_{\alpha/2}$ or, equivalently, if the p-value is below $\alpha$. Outcome: since the p-value $= 0.033 < 0.05$, we reject $H_0$. Therefore, we conclude that at a significance level of 5%, there is a statistically significant relationship between the monthly salary a worker earns and the worker's gender.

The distribution of monthly salaries was also graphed with the estimated linear equations for males and females shown separately, and the estimated regression equations of monthly salaries were computed for males and for females. To estimate how much more male workers earn than female workers, the difference between these two equations is calculated. The coefficient estimates for the slopes of the variables age and level of education are 263.36 and 635.06, respectively.
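Because the intercept cancels when two workers are compared, the reported slope estimates are enough to illustrate what the coefficients mean. The following is a minimal Python sketch under stated assumptions: the model is additive with the three reported slopes, age is measured in units of 10 years (as in the descriptive statistics), and level of education enters as a single ordinal score whose coding is not given in the report. The function name `predicted_salary_difference` is hypothetical.

```python
# Slope estimates reported in the regression summary (S$ per unit change).
COEF = {
    "gender": 1301.36,           # 1 = male, 0 = female
    "age_decades": 263.36,       # age measured in units of 10 years
    "education_level": 635.06,   # assumed: one step up the ordinal education scale
}

def predicted_salary_difference(d_gender=0, d_age_decades=0.0, d_education_level=0):
    """Difference in predicted monthly salary between two workers.

    Each argument is the difference in that explanatory variable between
    the two workers; the (unreported) intercept cancels out.
    """
    return (COEF["gender"] * d_gender
            + COEF["age_decades"] * d_age_decades
            + COEF["education_level"] * d_education_level)

# Male versus female worker of the same age and education level.
print(round(predicted_salary_difference(d_gender=1), 2))

# Same gender and education, one worker 15 years older.
print(round(predicted_salary_difference(d_age_decades=1.5), 2))
```

Under these assumptions, the first call returns the gender coefficient itself, which is the model's estimate of the male and female salary gap once age and education are held fixed.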
The signs of the coefficients are positive, implying that a greater age (in units of 10 years) and a higher educational qualification are associated with a greater monthly salary, which is in line with what was expected.

For the regression model, the adjusted R-squared value is 0.2834, implying that 28.34% of the variation in the dependent variable can be explained by the regression model. Adjusted R-squared is a more useful measure than R-squared because, unlike R-squared, it accounts for the number of predictors: it allows regression models with different numbers of predictors to be compared, and it rises only when an added explanatory variable improves the model by more than would be expected by chance.

For the fitted regression model, the residual and normal probability plots suggest that the data satisfy the assumptions of linear regression: linearity, normality of errors, and homoscedasticity of errors.

Based on the regression model developed above, the predicted monthly salary of a 39-year-old, university-educated female worker is obtained by substituting gender = 0, age = 3.9 (in units of 10 years), and the code for university education into the fitted equation.

The adjusted R-squared value shows that only 28.34% of the variation in the monthly salaries of the workers is explained by the independent variables considered, namely gender, age, and level of education. This suggests that other factors, such as working hours, length of service, night shifts, and number of leaves taken, are likely to have influenced the monthly salaries of workers.
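The right-tailed t-test on the mean monthly salary reported in this analysis (hypothesized mean S\$3,200, 49 degrees of freedom) can be checked from the summary statistics alone. The following is a minimal Python sketch using only the standard library; the p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is an implementation choice made here and not necessarily how the report's p-value was computed. The helper names `t_pdf` and `upper_tail_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t, df, steps=2000):
    """P(T > t) via Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric about zero, so P(T > 0) = 0.5.
    return 0.5 - total * h / 3

# Summary statistics reported for monthly salary (S$).
sample_mean = 3546.52
standard_error = 330.2105
hypothesized_mean = 3200
df = 50 - 1
alpha = 0.05

t_stat = (sample_mean - hypothesized_mean) / standard_error
p_value = upper_tail_p(t_stat, df)
reject_null = p_value < alpha

print(f"t = {t_stat:.4f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The computed p-value should agree with the reported 0.1497 to within rounding, leading to the same decision not to reject the null hypothesis at the 5% level.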