Abstract
Epidermal growth factor (EGF) is involved in alveolar epithelial repair, lung fluid clearance and inflammation, and is regulated by sex hormones. An unmatched, nested case–control study was conducted to evaluate the associations of EGF variants with acute respiratory distress syndrome (ARDS) and the role of sex in these associations.
Patients with ARDS risk factors upon intensive care unit admission were enrolled. Cases were 416 Caucasians who developed ARDS and controls were 1,052 Caucasians who did not develop ARDS. Cases were followed for clinical outcomes and 60-day mortality. One functional single nucleotide polymorphism (SNP), rs4444903, and six haplotype-tagging SNPs spanning the entire EGF gene were genotyped.
No individual SNP or haplotype was associated with ARDS risk or outcomes in all subjects. Sex-stratified analyses showed opposite effects of EGF variants on ARDS in males versus in females. SNPs rs4444903, rs2298991, rs7692976 and rs4698803, and haplotypes GGCGTC and ATCAAG were associated with ARDS risk in males. No associations were observed in females. Interaction analysis showed that rs4444903, rs2298991, rs7692976 and rs6533485 significantly interacted with sex for ARDS risk.
The present study suggests that associations of epidermal growth factor gene variants with acute respiratory distress syndrome risk are modified by sex. The current findings should be replicated in other populations.
- Acute respiratory distress syndrome
- epidermal growth factor
- genetic susceptibility
- haplotypes
- lung injury
- molecular epidemiology
Among patients with sepsis, pneumonia, trauma and other triggering conditions, only a subset will develop acute respiratory distress syndrome (ARDS), and only ∼60% of those developing ARDS will survive, suggesting that genetics may influence the susceptibility to and recovery from this syndrome. ARDS is characterised by diffuse damage to the alveolar barrier, which leads to increased permeability and influx of protein-rich oedema into the interstitial airspace. In addition to endothelial injury, epithelial damage also plays an important role in the development of and recovery from this disorder 1, 2. The loss of epithelial integrity contributes to the formation of alveolar oedema, while the repair of epithelial injury and the restoration of alveolar epithelial fluid transport function facilitate the resolution of pulmonary oedema 2, 3.
Epidermal growth factor (EGF) is a key growth factor among the ligands of EGF receptors (EGFRs). The family of EGFs and receptors is important in regulating cell growth, maturation, function and maintenance in epithelial tissues 4. It is suggested that acute lung injury elicits growth factor responses that trigger repair mechanisms to restore lung integrity 5. Studies show that EGF regulates bronchial and alveolar epithelial repair after lung injury 6, 7. Moreover, EGF decreases alveolar epithelial junctional permeability, upregulates alveolar epithelial Na+-K+-ATPase and increases lung fluid clearance 8–10. Conversely, EGF has also been increasingly regarded as a pro-inflammatory mediator. In asthma, the EGFR pathway is involved not only in the bronchial epithelial repair 11 but also in lung inflammation 12. EGF increases the production of interleukin-8, modulates the inflammatory effects of tumour necrosis factor-α, and enhances the neutrophil-mediated immunity 13–15. Taken together, EGF might play a critical role in the pathogenesis of ARDS and, therefore, EGF gene polymorphisms could be potential risks for ARDS.
The EGF gene is located on chromosome 4 (4q25), spans ∼99 kb and has 24 exons and 23 introns. Genetic variation within the EGF gene has been studied in relation to the EGF phenotype and disease susceptibility 16. However, most of these studies have focused on the single nucleotide polymorphism (SNP) rs4444903, the GG genotype of which is associated with a higher secretion of EGF protein than the AA genotype 17. Recently, it has been recognised that a haplotype-based tagging SNP approach can comprehensively scan the common variation of an entire gene and provide greater power than single-marker tests for genetic disease association 18. In addition, it is evident that the EGFR signalling pathways are regulated by sex hormones 19–21. Studies have shown sex differences for salivary and tear EGF levels 22, 23, as well as the EGF effects in ulcer healing 24. Moreover, sex-specific associations of EGF polymorphisms with phenotypes have been found in schizophrenia 25, 26. In the current study, it was hypothesised that common genetic variation of EGF is associated with the risk and outcomes of ARDS, and that such an association is modified by sex. A hospital-based, unmatched, modified, nested case–control study of patients at risk for ARDS was conducted, and a haplotype-tagging SNP approach was used to test the hypotheses.
METHODS
Study design and subjects
The current study is part of an ongoing molecular epidemiology project investigating the influences of genetic factors on the development and outcomes of ARDS. Details of the study have been described previously 27. Briefly, study subjects were selected from patients admitted to the intensive care units (ICUs) at Massachusetts General Hospital (Boston, MA, USA) from September 1999 to November 2006. Patients with clinical risk factors for ARDS such as sepsis, septic shock, trauma, pneumonia, aspiration or multiple transfusions were eligible for inclusion (see supplementary material table 1). Exclusion criteria included age <18 yrs, diffuse alveolar haemorrhage, chronic lung diseases other than chronic obstructive pulmonary disease or asthma, directive to withhold intubation, immunosuppression not secondary to corticosteroids, and treatment with granulocyte colony-stimulating factor. Baseline characteristics and Acute Physiology and Chronic Health Evaluation (APACHE) III scores were recorded on ICU admission. The enrolled patients who fulfilled the American–European Consensus Committee (AECC) criteria for ARDS upon ICU admission or during the daily follow-up were considered as ARDS cases, whereas at-risk patients who did not meet the criteria for ARDS were considered as controls. All enrolled patients with ARDS were then followed for clinical outcomes and all-cause 60-day mortality after the development of ARDS. To reduce the potential confounding from ethnic backgrounds, only Caucasian (non-Hispanic white) patients were analysed. A schematic of study design and patient selection is illustrated in figure 1. The present study was approved by the Human Subjects Committees of the Massachusetts General Hospital and the Harvard School of Public Health (Boston). Written informed consent was obtained from all subjects or surrogates.
SNP selection
The haplotype-tagging SNPs in the EGF gene were selected based on the HapMap data 28 for the CEU population (Utah residents with ancestry from northern and western Europe). The multimarker tagging algorithm was used with criteria of r2>0.8 and minor allele frequency ≥0.1. The entire EGF gene was covered, including 5 kb on each side of the gene encompassing the promoter and 3′ untranslated region (UTR). In addition, a functional SNP rs4444903 (+61 A>G), which has not been included in the HapMap data, was also used.
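The selection criteria above (r2>0.8, minor allele frequency ≥0.1) can be illustrated with a minimal Python sketch. It uses a simple greedy pairwise tagging rule, which is a simplification and not the multimarker algorithm applied in the present study. The function name, the SNP labels and the r2 values are hypothetical and serve only to show how the two thresholds act.

```python
def greedy_pairwise_tags(maf, r2, r2_min=0.8, maf_min=0.1):
    """Pick tag SNPs greedily (pairwise tagging; a simplification).

    maf: dict mapping SNP label -> minor allele frequency.
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r2.
    A SNP is 'captured' by a tag if it is the tag itself or has r2 > r2_min with it.
    """
    # Keep only common SNPs (minor allele frequency >= maf_min).
    candidates = {snp for snp, freq in maf.items() if freq >= maf_min}

    def captured_by(tag, pool):
        return {s for s in pool
                if s == tag or r2.get(frozenset((tag, s)), 0.0) > r2_min}

    untagged = set(candidates)
    tags = []
    while untagged:
        # Choose the candidate that captures the most still-untagged SNPs;
        # ties are broken alphabetically because candidates are sorted.
        best = max(sorted(candidates), key=lambda s: len(captured_by(s, untagged)))
        tags.append(best)
        untagged -= captured_by(best, untagged)
    return tags


# Hypothetical input: four SNPs, one of them too rare to be considered.
example_maf = {"snpA": 0.30, "snpB": 0.30, "snpC": 0.20, "snpD": 0.05}
example_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.50,
}
example_tags = greedy_pairwise_tags(example_maf, example_r2)
```

In this hypothetical input, a single SNP is in high LD with both other common SNPs and is therefore sufficient as a tag; the rare SNP is dropped by the frequency filter before tagging.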
Genotyping
Genomic DNA was extracted from whole blood using the Puregene DNA Isolation Kit (Gentra Systems, Minneapolis, MN, USA). The selected SNPs of EGF were genotyped using TaqMan® SNP Genotyping Assay (Applied Biosystems, Foster City, CA, USA). Primers and probes were ordered from Applied Biosystems. All PCR amplifications were performed in a 384-well format on GeneAmp® PCR Systems 9700 (Applied Biosystems). The fluorescence of PCR products was detected using the ABI Prism® 7900HT Sequence Detection System (Applied Biosystems). Genotyping was performed by laboratory personnel blinded to case–control status. A random 10% of samples were inserted in different 384-well plates as duplicates for quality-control purposes. Two investigators reviewed all genotyping results independently. The concordance rate for the duplicate samples was >99% and the overall genotyping success rate was 98.8%. Samples not yielding the genotypes of all SNPs were excluded from analysis.
Statistical analyses
The homogeneity of baseline characteristics between the two groups was tested by Fisher's exact test for categorical variables and by an unpaired t-test for continuous variables. The differences of genotype distributions between groups were compared using a Chi-squared test. SAS/Genetics was used to calculate the allele frequencies, test the deviation from the Hardy–Weinberg equilibrium (HWE), and estimate pairwise D′ and r2 values for linkage disequilibrium (LD).
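For readers unfamiliar with these quantities, the following Python sketch shows one common way to compute them: a one-degree-of-freedom Pearson Chi-squared test for deviation from HWE, and D′ and r2 for a pair of biallelic loci. It is an illustration only. The analyses in the present study were run in SAS/Genetics, whose exact test variants are not described here, and the function names and inputs below are assumptions.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson Chi-squared test (1 df) for deviation from HWE.

    Inputs are the observed genotype counts (AA, AB, BB).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the Chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def ld_measures(p_ab, p_a, p_b):
    """D' and r2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2.
    p_a, p_b: frequencies of alleles A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2
```

The haplotype frequency passed to `ld_measures` is not observed directly in unphased data and has to be estimated first, for example with an expectation maximisation procedure.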
Haplotype frequencies were estimated from the unphased genotype data in the combined population (cases and controls), using the expectation maximisation algorithm as implemented in SAS/Genetics (SAS Institute Inc., Cary, NC, USA). The associations between EGF haplotypes and the risk and survival of ARDS were analysed using the expectation-substitution approach as implemented in the SAS macro Haplotype Scoring for Generalized Linear Modeling and Haplotype-Disease Association Tests (HAPPY) 29, 30. This approach treats subject-specific expected haplotype indicators, calculated by an additive mode, as observed covariates for regression models. Haplotypes with a frequency ≥5% in the total population were considered to be common and the most common haplotype was used as the referent in the regression model to assess the haplotype-specific risk for ARDS. All other haplotypes were pooled into a separate rare haplotypes category.
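The idea behind the expectation maximisation step can be illustrated with a deliberately reduced Python sketch for two biallelic SNPs, where only double heterozygotes have ambiguous phase. This is not the SAS/Genetics or HAPPY implementation, which handled six SNPs in the present study; the genotype coding (0, 1 or 2 copies of the variant allele) and all names are assumptions made for the example.

```python
from collections import Counter


def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2) tuples, each value 0, 1 or 2 (variant allele count).
    Returns a dict mapping haplotype (allele1, allele2) -> estimated frequency.
    """
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    if n_chrom == 0:
        raise ValueError("no genotypes supplied")

    # Haplotype counts from subjects whose phase is unambiguous
    # (at most one heterozygous locus).
    base = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue
        a1, a2 = alleles[g1], alleles[g2]
        base[(a1[0], a2[0])] += n
        base[(a1[1], a2[1])] += n
    n_double_het = counts.get((1, 1), 0)

    freq = {h: 0.25 for h in base}  # uniform starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11
        # rather than 01/10, given the current frequencies.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        # M-step: add expected haplotype counts and renormalise.
        new = dict(base)
        new[(0, 0)] += n_double_het * w
        new[(1, 1)] += n_double_het * w
        new[(0, 1)] += n_double_het * (1 - w)
        new[(1, 0)] += n_double_het * (1 - w)
        new = {h: c / n_chrom for h, c in new.items()}
        delta = max(abs(new[h] - freq[h]) for h in freq)
        freq = new
        if delta < tol:
            break
    return freq
```

In the expectation-substitution approach, the same posterior phase probabilities (here `w`) would be used to compute each subject's expected count of every haplotype, and those expected counts would then enter the regression model as covariates.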
A multivariate logistic regression was used to estimate the genotype- and haplotype-specific odds ratio (OR) and 95% confidence interval (CI) for ARDS risk. To evaluate the associations of individual genotypes and haplotypes with ARDS survival, the Cox proportional hazard model was used to estimate the hazard ratio (HR) and 95% CI. Covariates for the logistic regression and Cox models included age, sex (not included in sex-stratified analyses), APACHE III score and the potential risks for ARDS development and mortality based on univariate analysis. Global tests for the associations between haplotypes and ARDS risk and survival were carried out using the likelihood ratio test. For statistically significant associations, adjusted p-values were calculated to correct for multiple comparisons, using the false discovery rate (FDR) procedure of Benjamini and Hochberg 31. The gene–sex interactions were examined by sex-stratification and their strength was evaluated in multivariate logistic regression models including an interaction term. A two-sided p-value <0.05 was considered to be statistically significant.
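The Benjamini and Hochberg step-up adjustment mentioned above can be written in a few lines of Python. The sketch below is a generic implementation of that procedure, not the code used in the present study, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four tests.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

An adjusted p-value below the chosen FDR level (for example 0.05) indicates that the corresponding test remains significant after correction.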
RESULTS
Study population
The flow diagram of patient selection in the present study is illustrated in figure 1. The baseline characteristics of the current study population, including 416 ARDS cases and 1,052 at-risk controls, are shown in table 1. Patients who developed ARDS were younger and had higher APACHE III scores than those who did not develop ARDS. Among patients with ARDS, the survivors were younger and had lower APACHE III scores than the nonsurvivors. There were no differences in sex distributions between ARDS patients and controls, or between ARDS survivors and nonsurvivors. The comparisons of ARDS risk factors and comorbidities between ARDS patients and controls, and between ARDS survivors and nonsurvivors, are also shown in table 1. The comparisons of clinical characteristics between males and females are provided in the online supplementary material table 2.
SNP selection and genotype frequencies
Seven SNPs of the EGF gene were selected for the current study. In the order of 5′ to 3′, they were: the functional SNP rs4444903 and six tagging SNPs, rs2298991, rs11568993, rs6850557, rs7692976, rs4698803 and rs6533485 (table 2). All SNPs conformed to HWE, with the exception of rs6850557 (FDR adjusted p<0.05), which was then excluded from further analyses. Pairwise LD analysis revealed that the selected SNPs were in high LD with one another (online supplementary material figure 1).
The alleles, locations, chromosome positions and minor allele frequencies of these seven SNPs are presented in table 2. When all subjects were considered, the minor allele frequencies did not differ between ARDS cases and controls for any SNP; in males, however, they differed significantly for rs7692976 and rs4698803 (p = 0.030 and p = 0.024, respectively). The genotype distributions are shown in the online supplementary material table 3. No differences in genotype frequencies were found between ARDS cases and controls in all subjects, between males and females, or between ARDS survivors and nonsurvivors. However, the sex-stratified analyses showed that the genotype frequencies of rs4444903, rs7692976 and rs4698803 were significantly different between ARDS cases and controls in males (p = 0.029, p = 0.037 and p = 0.032, respectively; online supplementary material table 4).
Associations between EGF variants and ARDS risk
Considering all subjects as a whole, none of the individual SNPs were significantly associated with ARDS risk (table 3). Upon stratification by sex, the variant genotypes of rs4444903 (OR 1.64, 95% CI 1.17–2.31; p = 0.005), rs2298991 (OR 1.50, 95% CI 1.07–2.11; p = 0.019) and rs7692976 (OR 1.64, 95% CI 1.17–2.31; p = 0.005) were found to be associated with an increased risk, while the variant genotypes of rs4698803 (OR 0.67, 95% CI 0.48–0.96; p = 0.025) were associated with a reduced risk of developing ARDS in males. In females, none of the associations between EGF variants and ARDS risk were significant. However, the effects of variant genotypes on ARDS risk were observed to be opposite to those in males for five of the six SNPs.
There were five common haplotypes (frequencies 30.6, 29.0, 14.0, 8.8 and 7.5%) inferred from the six polymorphisms analysed. Similar to the results from the genotype analyses, a significant association between haplotypes and ARDS risk was observed only in the male subgroup, in which the global test for association was significant (likelihood ratio test p = 0.005; table 3). In addition, Hap2 (GGCGTC) was associated with an increased ARDS risk (OR 1.35, 95% CI 1.00–1.81; p = 0.048), whereas Hap3 (ATCAAG) was associated with a reduced ARDS risk (OR 0.64, 95% CI 0.44–0.94; p = 0.022). The haplotype-specific associations were assessed in one regression model; thus, correction for multiple comparisons was not necessary.
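As a reading aid, the relationship between a reported OR, its 95% CI and the p-value can be sketched in Python under the assumption that the interval is a Wald interval on the log-odds scale. That assumption and the function name are not taken from the present study, and because the published figures are rounded, the recovered p-value can only be expected to agree approximately with the reported one.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value.

    Assumes CI = exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Values reported above for rs4444903 in males.
se, z, p_value = wald_from_or_ci(1.64, 1.17, 2.31)
```

For the rs4444903 estimate in males, this calculation gives a p-value of the same order as the reported p = 0.005.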
Associations between EGF variants and clinical outcomes of ARDS
The clinical outcomes of ARDS patients among different genotype groups are shown in the online supplementary material table 5. Within 28 days of ARDS diagnosis, no significant differences in ICU-free days, ventilator-free days, successful extubation rates or mortality rates were observed between genotypes for any SNP. The 60-day mortality rates also did not differ between genotypes. In analyses stratified by sex (data not shown), male ARDS patients with variant genotypes of rs4698803 had a higher successful extubation rate (68.7 versus 54.2%; p = 0.043) and more ventilator-free days (11.9±9.8 versus 8.5±9.7 days; p = 0.014) than wildtype homozygotes. There were no differences in clinical outcomes between genotypes in females.
None of the EGF genotypes or haplotypes were significantly associated with ARDS 60-day survival in all subjects, or in male and female subgroups (table 4). Only Hap3 was marginally associated with decreased mortality in male ARDS patients (HR 0.57, 95% CI 0.31–1.05; p = 0.069).
Gene–sex interaction for ARDS risk
The interaction analyses showed that the variant genotypes of rs4444903, rs2298991, rs7692976 and rs6533485 significantly interacted with sex for ARDS risk (p-values for interaction 0.010, 0.010, 0.007 and 0.021, respectively; table 5). Because sex hormone activities decrease with age, the current authors further stratified the interaction analysis by age. To maximise statistical power, the median age was used for stratification. The gene–sex interactions were found to be even stronger in the younger group (age <65 yrs), but were not significant in the older group (age ≥65 yrs), in whom sex hormone effects were expected to be weaker.
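For orientation, a logistic model with a gene–sex interaction term generally takes the form below. The notation and the coding are assumptions made for illustration (G = 1 for carriers of a variant genotype, S = 1 for males, Z the vector of adjustment covariates) and are not taken from the study's analysis code.

```latex
\operatorname{logit}\,\Pr(\mathrm{ARDS}=1)
  = \beta_0 + \beta_G\,G + \beta_S\,S + \beta_{GS}\,(G \times S)
    + \boldsymbol{\gamma}^{\top}\mathbf{Z},
\qquad
\mathrm{OR}_{\text{females}} = e^{\beta_G},
\quad
\mathrm{OR}_{\text{males}} = e^{\beta_G + \beta_{GS}}
```

Under this coding, the test for interaction is a test of the null hypothesis that β_GS = 0, and exp(β_GS) is the ratio of the genotype OR in males to that in females.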
DISCUSSION
In the present study, the associations of EGF variants with ARDS risk and outcomes, and the role of sex in the association between EGF variants and ARDS were comprehensively evaluated. The common EGF variants were found to be associated with ARDS risk in a sex-specific manner. Variant genotypes of rs4444903, rs2298991, rs7692976 and rs4698803 and haplotypes GGCGTC and ATCAAG were associated with ARDS risk in males. Although no significant associations were found in females, the current authors observed that the associations of EGF variants with ARDS development in females were mostly opposite to those observed in males. Such gene–sex interaction was further supported by the results from interaction analyses. Conversely, EGF variants did not significantly influence ARDS outcomes and survival in the present study.
EGF was first discovered in submaxillary glands of adult male mice, while human EGF (β-urogastrone) was first isolated from urine. The 6-kDa human EGF consists of 53 amino acids. It is initially synthesised as a transmembrane glycoprotein precursor (prepro-EGF) of 1,207 amino acids, and is then processed through a pro-EGF stage to a mature EGF protein 32, 33. The large prepro-EGF polypeptide contains the EGF subunit and eight additional EGF-like subunits, the biological significance of which is unknown 32. The heparin-binding 160-kDa pro-EGF, isolated from human urine, has been shown to be biologically active 33. However, there seem to be increasing reports of molecular heterogeneity in mature EGF. The biological properties and physiological significance of the EGF precursors and the heterogeneity of mature EGF remain to be elucidated 34.
The EGF gene in humans is located on chromosome 4. As for functional polymorphisms, the variant G allele of rs4444903 has been associated with a higher secretion of EGF protein than the A allele. A mechanism by which EGF levels are modulated has recently been proposed in a study of hepatocellular carcinoma risk 35. Transcripts from the G allele exhibit a longer half-life than those from the A allele; thus, EGF mRNA and EGF protein are increased in cell lines with more copies of G in the genotype. In the current study, the variant genotypes (AG/GG) of this polymorphism significantly predisposed at-risk male patients to develop ARDS but were not associated with better clinical outcomes, suggesting that higher EGF levels in the early phase might contribute to the pathogenesis of ARDS, possibly through the pro-inflammatory properties of EGF. Conversely, the proposed function of EGF to facilitate epithelial repair and alveolar fluid reabsorption might not be decisive in the recovery from ARDS. Hap2 (GGCGTC), carrying the variant alleles of rs4444903, rs2298991, rs7692976 and rs6533485, was identified as a common haplotype associated with increased ARDS risk in males. The variant genotypes of these four SNPs were all associated with increased ARDS risk (p<0.05 for the first three SNPs; p = 0.08 for rs6533485). Based on the established functional significance of rs4444903, the positions of SNPs (rs4444903 is in the 5′ UTR; the other three are in introns) and the high degree of pairwise LD within these SNPs (D′ >0.85), the present authors inferred that the associations of rs2298991 and rs7692976 with ARDS observed in genotype analysis were mediated by their LD with rs4444903.
SNP rs4698803 is a missense T>A polymorphism located in exon 19 of the EGF gene, causing an amino acid substitution (V920E). The A allele of rs4698803 was identified as a potentially protective allele against ARDS. Based on the current authors’ tagging algorithm, this SNP did not tag any other SNPs in the EGF gene. Hap3 (ATCAAG), carrying the variant allele of rs4698803 and the wildtype alleles of all other SNPs, was also identified as a potentially protective haplotype against ARDS. The present results from genotype and haplotype analyses suggest that rs4698803 is independently associated with ARDS. Future studies are needed to understand whether and how this single amino acid mutation, V920E, might alter the biochemical function of the EGF precursor and mature EGF protein.
A sex difference similar to that found in the current study has also been observed in the association studies of EGF with schizophrenia. The G allele at SNP rs4444903 is associated with the age of onset in male patients with schizophrenia, but not in females 25, 26. The sex difference in the genetic influence of EGF on ARDS might be explained by the sex- and tissue-specific regulation of EGFR signalling pathways by sex hormones. In the present study, the gene–sex interaction was not significant in the older group, in whom sex hormone effects are expected to be weaker. It is evident that there is crosstalk between sex hormone receptors and EGFR pathways. In lung development, the expression and activity of EGFR appear to be sex specific and cell specific 21. Androgen treatment has been shown to decrease EGFR density and EGF-induced autophosphorylation of EGFR in foetal rabbit lung 19. Conversely, oestrogen upregulates the expression of EGFR, whereas progesterone upregulates the expression of 133- and 71-kDa immunoreactive EGF (the prepro-EGF-like proteins) in uterine leiomyoma cells 20. Indeed, the impacts of sex and sex hormones on acute lung injury have been studied in animal models. Androgens appear to be detrimental while oestrogens tend to be protective in the pathogenesis of acute lung injury 36.
To the current authors’ knowledge, the present study is the first to use the haplotype-tagging SNP approach to investigate genetic susceptibility to ARDS. A major advantage of this approach is that it allows a cost-effective identification of common susceptibility alleles across the entire gene region. In addition, the current study is so far the largest report for genetic epidemiology of ARDS and, thus, provided more statistical power to detect genetic associations with ARDS, particularly in sex-stratified analyses. Another major strength of the present study is its study design. The AECC definition was used for ARDS diagnosis, in order to clearly define the phenotype prospectively. The at-risk critically ill patients were selected as controls, to reduce the possible confounding from associations between candidate polymorphisms and predisposing conditions for ARDS. Furthermore, the analysis was restricted to a single ethnic group, thus minimising false results due to population stratification.
One of the limitations of the current study is that neither EGF nor sex hormone levels were determined; thus, the functional significance of the EGF genotypes and haplotypes in ARDS and their interactions with sex hormones remain to be further defined. Although this is so far the largest population available for an ARDS association study, the present authors might not have had adequate power, with 416 ARDS cases, to detect an association of EGF variants with ARDS survival, particularly as any such effect was expected to be small because the resolution of lung oedema and injury contributes only partially to surviving ARDS. Since the current study included only a single cohort, the present findings need to be validated in other independent populations. Finally, the current results are based on Caucasian subjects and additional studies in other ethnic groups will be needed.
In conclusion, the present study demonstrated that genetic associations of epidermal growth factor gene variants with acute respiratory distress syndrome risk were modified by sex. The variant genotypes of rs4444903, rs2298991, rs7692976 and rs4698803, and haplotypes GGCGTC and ATCAAG were significantly associated with ARDS development in at-risk males. The current findings should be replicated in other cohorts. The present results also warrant future basic research to understand the role of epidermal growth factor, as well as interactions with sex hormones, in the pathogenesis of acute respiratory distress syndrome.
Support statement
The present study was supported in part by grants from the National Institutes of Health (grants HL60710 and ES00002; Bethesda, MD, USA) and the Flight Attendant Medical Research Institute (grant no. 062459-YCSA; Miami, FL, USA).
Statement of interest
None declared.
Acknowledgements
The current authors would like to thank W. Zhang, K. McCoy, T. McCabe, J. Shin and H. Fujii-Rios for patient recruitment, and A. Shafer and S. Sumpter for research support (all Pulmonary Critical Care Unit, Massachusetts General Hospital, Boston, MA, USA). They also thank M. Convery for laboratory expertise and J. Frelich for data management (both Department of Environmental Health, Harvard School of Public Health, Boston, MA, USA), and the patients and staff of intensive care units at Massachusetts General Hospital (Boston, MA, USA).
Footnotes
- This article has supplementary material accessible from www.erj.ersjournals.com
- Received June 16, 2008.
- Accepted October 13, 2008.
- © ERS Journals Ltd