Source of data
The data were obtained from the MEASURE DHS program at www.measuredhs.com after preparing a concept note about the project. Demographic and Health Survey (DHS) data collected between 2010 and 2020 were pooled from 32 sub-Saharan Africa (SSA) countries. The African continent consists of 54 recognized countries. Geographically, sub-Saharan Africa is the region of the continent situated south of the Sahara desert. According to the United Nations (UN), it consists of all African countries that are entirely or partially located south of the Sahara. The UN Development Programme recognizes 46 of the 54 African countries as part of sub-Saharan Africa, while the World Bank's list also includes Somalia and Sudan. For each country, the most recent DHS dataset from the specified period was extracted.
In this study, 34 countries in the sub-region met our selection criteria (sub-Saharan African countries with DHS datasets collected between 2010 and 2020 and available in the public domain). The countries were Angola, Benin, Burkina Faso, Burundi, Cameroon, Cote d’Ivoire, Comoros, Congo Brazzaville, Democratic Republic of Congo, Ethiopia, Gabon, Gambia, Ghana, Guinea, Kenya, Lesotho, Liberia, Malawi, Mali, Namibia, Niger, Nigeria, Rwanda, Senegal, Sierra Leone, Tanzania, Uganda, Zambia, and Zimbabwe.
The DHS program adopts standardized methods, with uniform questionnaires, manuals, and field procedures, to gather information that is comparable across countries. DHSs are nationally representative household surveys that provide data for a wide range of monitoring and impact evaluation indicators in the areas of population, health, and nutrition, collected through face-to-face interviews with women aged 15 to 49 years. The surveys employ a stratified, multi-stage, random sampling design. Information was obtained from eligible women aged 15 to 49 years in each country. The detailed methodology of the survey and the data collection process have been described elsewhere.
The outcome variable, early (timely) initiation of breastfeeding (EIBF), was determined by asking mothers how long after birth their babies were first put to the breast. The prevalence of EIBF was calculated as the ratio of children put to the breast within one hour of birth to the total number of children.
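The prevalence ratio described above can be sketched in a few lines. This is an illustrative example only: the function name, the 0/1 coding of the outcome, the optional weights argument, and the sample values are assumptions made for the sketch, not the authors' code or data.

```python
def prevalence_percent(initiated_within_hour, weights=None):
    """Percentage of children put to the breast within one hour of birth.

    initiated_within_hour: 0/1 flags, one per child (hypothetical coding).
    weights: optional sampling weights; equal weights are assumed if omitted.
    """
    flags = list(initiated_within_hour)
    if weights is None:
        weights = [1.0] * len(flags)
    weights = list(weights)
    if len(weights) != len(flags):
        raise ValueError("weights and flags must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    # Weighted count of early initiators divided by weighted count of children.
    numerator = sum(w * f for w, f in zip(weights, flags))
    return 100.0 * numerator / total_weight


# Made-up flags for five children, purely for illustration.
sample_flags = [1, 0, 1, 1, 0]
unweighted = prevalence_percent(sample_flags)
weighted = prevalence_percent(sample_flags, [2.0, 1.0, 1.0, 1.0, 1.0])
```

With equal weights the function reduces to the simple ratio of children breastfed within one hour to all children; supplying weights gives the weighted proportion instead.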
The explanatory variables included in this study were socio-demographic and economic variables (residence, region, maternal age, marital status, religion, maternal education, paternal education, wealth index, and maternal occupation/working status), pregnancy and pregnancy-related factors (antenatal care (ANC) visit, parity, preceding birth interval, contraceptive use, place of delivery, birth order, mode of delivery, and size of child at birth), and behavioural factors (smoking and media exposure).
The non-aggregate community-level variables were place of residence and region. Place of residence was recorded as rural or urban. Region was defined as the province in which the child lives. A second group of community-level variables was derived by aggregating individual-level data, using averages, to capture the neighbourhood effect on early initiation of breastfeeding (EIBF): community women's education, community poverty, community ANC visits, and community place of delivery.
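The aggregation step can be illustrated with a small sketch. The record layout, the key names (`cluster`, `anc`), and the sample values are hypothetical. The split into "high" and "low" communities at the median is an assumption added for illustration; the text only states that averages were used.

```python
from collections import defaultdict
from statistics import median


def community_proportions(records, cluster_key, flag_key):
    """Average an individual-level 0/1 variable within each cluster."""
    totals = defaultdict(lambda: [0, 0])  # cluster -> [sum of flags, count]
    for record in records:
        entry = totals[record[cluster_key]]
        entry[0] += record[flag_key]
        entry[1] += 1
    return {cluster: s / n for cluster, (s, n) in totals.items()}


def classify_high_low(proportions):
    """Label clusters above the median proportion 'high', the rest 'low'.

    The median cut-off is an assumption of this sketch.
    """
    cut = median(proportions.values())
    return {c: ("high" if p > cut else "low") for c, p in proportions.items()}


# Made-up records: one row per woman, with her cluster and an ANC flag.
records = [
    {"cluster": 1, "anc": 1}, {"cluster": 1, "anc": 0},
    {"cluster": 2, "anc": 1}, {"cluster": 2, "anc": 1},
    {"cluster": 3, "anc": 0}, {"cluster": 3, "anc": 0},
]
community_anc = community_proportions(records, "cluster", "anc")
community_anc_level = classify_high_low(community_anc)
```

Each woman in a cluster would then carry her cluster's aggregated value as a community-level covariate.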
Data management and analysis
The analysis for this study was performed using Stata version 15 (StataCorp, College Station, TX, USA). For the calculation of descriptive statistics such as proportions, sampling weights were used to account for the non-proportional allocation of the sample to strata. Standard regression models assume that observations are independent of one another. Nevertheless, units within the same group are rarely independent when data are organized in hierarchies. Units from the same setting (cluster) are more similar to each other, with respect to the outcome of interest, than units from different settings. This violates the independence assumption, which can lead to underestimated standard errors and inflated Type I error rates (that is, more false-positive findings). In such circumstances, multilevel modelling can account for individual- and community-level variables simultaneously and provide a more comprehensive understanding of the factors associated with early initiation of breastfeeding.
Multilevel models were therefore developed to overcome the analytical problems that arise when data are hierarchically organized and the sample is drawn in several stages from this hierarchical population, as in the DHS, where children are nested within households and households within clusters, producing intra-group correlation. To estimate both the independent (fixed) effects of the explanatory variables and the community-level random effects on early initiation of breastfeeding, a two-level mixed-effects logistic regression model was fitted. The individual (child) is the first level and the cluster (community) is the second level. In the bivariable multilevel logistic regression, individual- and community-level variables were each tested separately for association with early initiation of breastfeeding, and variables significant at a p-value < 0.20 were considered for the final individual- and community-level adjustments. In the multivariable multilevel analysis, variables with a p-value < 0.05 were declared significant determinants of early initiation of breastfeeding.
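The bivariable screening rule can be sketched as a simple filter. The variable names and p-values below are invented for illustration, and the strict inequality at 0.20 is an assumption of this sketch.

```python
def screen_candidates(bivariable_p_values, threshold=0.20):
    """Keep variables whose bivariable p-value falls below the threshold."""
    return [name for name, p in bivariable_p_values.items() if p < threshold]


# Hypothetical bivariable p-values, not results from the study.
example_p_values = {
    "maternal_education": 0.01,
    "wealth_index": 0.15,
    "religion": 0.45,
    "place_of_delivery": 0.003,
}
candidates = screen_candidates(example_p_values)
```

Only the retained candidates would then enter the multivariable multilevel model, where the stricter 0.05 level applies.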
Therefore, using the two-level multilevel model, the log odds of early initiation of breastfeeding were modelled as follows:
$$\mathrm{log}\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)=\beta_{0}+\beta_{1}X_{ij}+\beta_{2}Z_{ij}+\mu_{j}$$
where i and j index the level 1 (individual) and level 2 (community) units, respectively; X and Z denote the individual- and community-level variables, respectively; \(\pi_{ij}\) is the probability of early initiation of breastfeeding for the ith mother in the jth community; and the β’s are the fixed coefficients, so that each one-unit increase in X or Z (a set of predictor variables) has a corresponding effect on the log odds of the outcome. \(\beta_{0}\) is the intercept, that is, the log odds of early initiation of breastfeeding in the absence of the predictors; and \(\mu_{j}\) is the random effect for the jth community (the effect of the community on the mother's decision to initiate breastfeeding early). The clustered nature of the data and the within- and between-community variation were taken into account by assuming that each community has its own intercept (\(\beta_{0}+\mu_{j}\)), while the coefficients (β) are fixed across communities.
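To make the model equation concrete, the sketch below converts the linear predictor into a predicted probability for a single individual- and community-level covariate. The coefficient values are arbitrary illustrations, not estimates from the study, and the function name is hypothetical.

```python
import math


def predicted_probability(beta0, beta1, x, beta2, z, mu_j=0.0):
    """Invert the logit: pi_ij = 1 / (1 + exp(-(b0 + b1*x + b2*z + mu_j)))."""
    log_odds = beta0 + beta1 * x + beta2 * z + mu_j
    return 1.0 / (1.0 + math.exp(-log_odds))


# Arbitrary illustrative coefficients.
baseline = predicted_probability(0.0, 0.5, 0, 0.3, 0)
# Same mother in a community with a positive random effect.
higher_community = predicted_probability(0.0, 0.5, 0, 0.3, 0, mu_j=0.8)
# exp(beta1) is the odds ratio for a one-unit increase in x.
odds_ratio_x = math.exp(0.5)
```

A positive \(\mu_j\) shifts the intercept upward for everyone in community j, which is how the random intercept captures between-community differences.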
A total of four models were fitted. The first was a null model with no explanatory variables, used to determine the random effects at the community level and to assess heterogeneity between communities. Model I was the multivariable model adjusted for individual-level variables, and model II was adjusted for community-level variables. In model III, the outcome variable was fitted with candidate variables from both the individual and community levels.
Parameter estimation method
Fixed effects (measures of association) were used to estimate the relationship between the likelihood of EIBF and the explanatory variables at both the community and individual levels, and the results were expressed as odds ratios with 95% confidence intervals. Community-level variance with its standard deviation, the intracluster correlation coefficient (ICC), the proportional change in community variance (PCV), and the median odds ratio (MOR) were used as measures of heterogeneity (random effects). The MOR translates the area-level variance into the commonly used odds ratio (OR) scale, which has a consistent and intuitive interpretation. When two areas are selected at random, the MOR is defined as the median value of the odds ratio between the area at higher risk and the area at lower risk. The MOR can be conceptualized as the (median) increase in risk an individual would experience when moving to another area with a higher risk. It is determined by \(MOR=\exp\left(\sqrt{2\times V_{A}}\times 0.6745\right)\), where \(V_{A}\) is the area-level variance and 0.6745 is the 75th percentile of the standard normal distribution (mean 0, variance 1); see the detailed definition elsewhere. The proportional change in variance is determined as \(PCV=\left[(V_{A}-V_{B})/V_{A}\right]\times 100\%\), where \(V_{A}\) is the variance of the initial model and \(V_{B}\) is the variance of the model with more terms.
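The heterogeneity measures can be computed directly from the community-level variance, as in the sketch below. The MOR and PCV functions follow the formulas given in the text. The ICC formula is not stated in the text; the sketch assumes the usual latent-variable formulation for logistic models, in which the individual-level variance is fixed at \(\pi^{2}/3\). All input variances are made-up values.

```python
import math


def median_odds_ratio(area_variance):
    """MOR = exp(sqrt(2 * VA) * 0.6745)."""
    return math.exp(math.sqrt(2.0 * area_variance) * 0.6745)


def proportional_change_in_variance(v_a, v_b):
    """PCV = (VA - VB) / VA * 100, with VA from the initial model."""
    return (v_a - v_b) / v_a * 100.0


def intracluster_correlation(area_variance):
    """ICC under the latent-variable assumption (level-1 variance = pi^2 / 3)."""
    return area_variance / (area_variance + math.pi ** 2 / 3.0)


# Hypothetical community-level variances for a null and an adjusted model.
null_variance = 0.5
adjusted_variance = 0.4
mor_null = median_odds_ratio(null_variance)
pcv = proportional_change_in_variance(null_variance, adjusted_variance)
icc_null = intracluster_correlation(null_variance)
```

A MOR of 1 corresponds to no between-community variation, and a larger PCV means the added covariates explain more of the community-level variance.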