This analysis examines the relationships between hormone and enzyme levels (Testosterone, LH, ACP, LDH, FSH) in adult male volunteers and identifies significant correlations among these markers.
-
Data Import and Cleaning
- Loaded the cleaned dataset from Week 2.
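A minimal sketch of the loading step, assuming the Week 2 output is a CSV file whose columns are named after the five markers. The function name `load_markers`, the column names, and the placeholder rows are assumptions for illustration, not the notebook's actual code or data.

```python
import io

import pandas as pd

# Assumed column names; adjust to match the real Week 2 file.
MARKERS = ["Testosterone", "LH", "ACP", "LDH", "FSH"]


def load_markers(source):
    """Read a CSV and return only the marker columns as numeric values."""
    df = pd.read_csv(source)
    missing = [col for col in MARKERS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    # Coerce any stray non-numeric entries to NaN instead of failing.
    return df[MARKERS].apply(pd.to_numeric, errors="coerce")


# Placeholder rows standing in for the real file (made-up values).
demo_csv = io.StringIO(
    "Testosterone,LH,ACP,LDH,FSH\n"
    "5.1,4.0,2.2,180,3.9\n"
    "6.3,5.2,2.9,210,4.4\n"
    "4.2,3.1,1.8,160,missing\n"
)
demo_df = load_markers(demo_csv)
print(demo_df.isna().sum())  # per-column count of values that failed to parse
```

In practice `source` would be the path to the cleaned file rather than an in-memory string.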
-
Exploratory Data Analysis (EDA)
- Plotted histograms to visualize the distribution of each marker.
- Created boxplots to identify potential outliers.
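The EDA step could look roughly like the sketch below. It uses synthetic placeholder data, assumed column names, and two hypothetical helpers (`plot_distributions`, `iqr_outliers`). The outlier helper applies the 1.5 × IQR rule, which is also matplotlib's default for boxplot whiskers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

MARKERS = ["Testosterone", "LH", "ACP", "LDH", "FSH"]  # assumed column names


def plot_distributions(df, columns=MARKERS):
    """Draw one histogram (top row) and one boxplot (bottom row) per marker."""
    fig, axes = plt.subplots(2, len(columns), figsize=(3 * len(columns), 6),
                             squeeze=False)
    for i, name in enumerate(columns):
        values = df[name].dropna().to_numpy()
        axes[0, i].hist(values, bins=15)
        axes[0, i].set_title(name)
        axes[1, i].boxplot(values)
        axes[1, i].set_xticks([])
    fig.tight_layout()
    return fig


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    values = pd.Series(values).dropna()
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]


# Synthetic, right-skewed placeholder data (not the study's measurements).
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({m: rng.lognormal(mean=1.0, sigma=0.5, size=50)
                        for m in MARKERS})
fig = plot_distributions(demo_df)
flagged = {m: len(iqr_outliers(demo_df[m])) for m in MARKERS}
print(flagged)  # number of potential outliers per marker
```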
-
Descriptive Statistics
- Generated a summary table including mean, median, standard deviation, min, max, quartiles, and skewness.
- Noted variables that were skewed to justify the use of Spearman correlation.
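One way to build such a summary table with pandas is sketched below. The helper name `describe_markers`, the example columns, and the `|skewness| > 1` cutoff for calling a variable skewed are assumptions; the notebook does not state which threshold it used.

```python
import pandas as pd


def describe_markers(df):
    """Summary table with one row per variable, including median and skewness."""
    # describe() gives count, mean, std, min, 25%, 50%, 75%, max.
    summary = df.describe().T.rename(columns={"50%": "median"})
    # pandas reports the bias-corrected sample skewness.
    summary["skewness"] = df.skew()
    return summary


# Made-up values purely to exercise the function.
demo_df = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],
    "right_skewed": [1, 1, 1, 1, 10],
})
summary = describe_markers(demo_df)

# Hypothetical rule of thumb: flag variables with |skewness| > 1.
skewed = summary.index[summary["skewness"].abs() > 1].tolist()
print(summary[["mean", "median", "std", "skewness"]])
print("Skewed variables:", skewed)
```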
-
Correlation Analysis
- Calculated Spearman correlation between Testosterone and LH: rho = 0.963, p < 0.001 (very strong positive correlation).
- Calculated Spearman correlation between LDH and ACP: rho = 0.849, p < 0.001 (strong positive correlation).
- Plotted scatterplots with trend lines for both correlations to visualize relationships.
- Testosterone vs LH: Very strong positive correlation, indicating that higher Testosterone levels are associated with higher LH levels in participants.
- LDH vs ACP: Strong positive correlation, indicating that higher LDH activity is associated with higher ACP activity.
- Overall Trends: The Spearman correlation matrix highlighted other relationships among hormones and enzymes, providing insights into their interdependence.
- Skewness in some variables justified the use of Spearman correlation, which is rank-based, does not assume normally distributed data, and is less sensitive to outliers than Pearson correlation.
- All analysis was done in Python using the libraries pandas, numpy, matplotlib, seaborn, and scipy.
- This notebook demonstrates how to perform correlation analysis on biological datasets while maintaining a professional workflow.
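The correlation step described above can be sketched as follows. The helper `spearman_pair`, the column names, and the synthetic data are assumptions for illustration, so the printed numbers will not match the rho values reported for the real dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

MARKERS = ["Testosterone", "LH", "ACP", "LDH", "FSH"]  # assumed column names


def spearman_pair(df, x, y):
    """Spearman rho and p-value for two columns, using complete pairs only."""
    pair = df[[x, y]].dropna()
    rho, p_value = stats.spearmanr(pair[x], pair[y])
    return float(rho), float(p_value)


# Synthetic placeholder data (not the study's measurements).
rng = np.random.default_rng(0)
n = 40
latent = rng.lognormal(size=n)
demo_df = pd.DataFrame({
    "Testosterone": latent,
    "LH": latent + rng.normal(scale=0.2, size=n),  # built to track Testosterone
    "ACP": rng.lognormal(size=n),
    "LDH": rng.lognormal(size=n),
    "FSH": rng.lognormal(size=n),
})

rho, p_value = spearman_pair(demo_df, "Testosterone", "LH")
print(f"Testosterone vs LH: rho = {rho:.3f}, p = {p_value:.3g}")

# Full pairwise matrix of Spearman coefficients.
corr_matrix = demo_df[MARKERS].corr(method="spearman")
print(corr_matrix.round(2))
```

For the scatterplots with trend lines, `seaborn.regplot` would be one option. Note that it fits a linear trend by default, while Spearman's rho measures any monotonic association.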