Journal Article
Model-Based Poststratification of Measurements that Imperfectly Cover the Universe Studied: The Case of Postal Delivery Times
by
Alain Bultez
and
Bert Seghers
Abstract
The goal of this paper is to work out a poststratification method for estimating in-time indicators for international end-to-end delivery processes when
- collected data cannot cover all the strata making up the logistics universe to be surveyed,
- the true weights of the strata (needed to correct biases in representativeness caused by disproportionate sampling and incomplete coverage) are unknown but can be inferred from marginal subtotals related to stratification criteria considered separately, rather than jointly, and conditional on each end of the delivery journey: outbound- versus inbound-specific.
Within this perspective, poststratification is used here to mean a statistical correction of measurements derived from incomplete stratified samples: an ex-post calibration aimed at yielding more accurate estimates based on an analysis of the data. Thus, we tackle instances where ex-ante assignment to strata is not a problem, but where surveying all strata is out of the question.
For that purpose, an econometric model is designed
- to link the discrete transport lead times, counted in days, of tested items to the specifics of their material characteristics (e.g. size/weight), as well as those of the routes they take through the distribution network (e.g. origin and destination zones),
- and provide performance predictions for each of the strata, covered as well as non-covered.
Benchmarking the multinomial cumulative logit regression against the negative binomial one reveals that delivery time is better treated as an ordinal categorical response of the system than as a ratio-scaled count.
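To make the ordinal treatment concrete, here is a minimal sketch of how a cumulative logit model turns a linear predictor into probabilities per delivery day. The parameterization (a single predictor `eta` shared across cutpoints, i.e. proportional odds), the function name, and all numeric values are assumptions made for illustration; the paper's actual specification may differ.

```python
import math

def cumulative_logit_probs(eta, cutpoints):
    """Return P(Y = k) for k = 1..K under a cumulative logit model.

    Assumed parameterization: P(Y <= k) = 1 / (1 + exp(-(cutpoints[k-1] - eta))),
    with increasing cutpoints and a single linear predictor eta per item.
    """
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cdf.append(1.0)  # the last category absorbs the remaining probability mass
    probs = []
    previous = 0.0
    for value in cdf:
        probs.append(value - previous)  # difference of adjacent cumulative probabilities
        previous = value
    return probs

# Hypothetical values: three cutpoints give four categories (day 1, 2, 3, 4 or later).
probs = cumulative_logit_probs(eta=0.5, cutpoints=[-1.0, 0.5, 2.0])
on_time_within_2_days = probs[0] + probs[1]
```

Under this parameterization, the probability of delivery within a deadline is read directly off the cumulative distribution, which is why the ordinal model fits in-time indicators naturally.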
The model-based fitted and extrapolated estimates are then used as inputs to the ex-post weighting stage, which produces robust point and interval estimates of aggregate key performance indicators (KPIs) through bootstrapping. Simple linear programs, solved for each country-to-country path, provide two extreme weighting sets: the first minimizes the KPIs' values, while the second maximizes them.
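The idea of extreme weighting sets can be sketched on a toy case. Assume only two stratification criteria with two levels each, so that the joint weights form a 2x2 table whose row and column totals are known. The feasible set then has a single free parameter, and the linear program is solved exactly by evaluating the KPI at the two endpoints of that parameter's range. The function name, the criteria, and all numbers below are hypothetical; a realistic problem would call for a general LP solver.

```python
def kpi_bounds_2x2(p, row_tot, col_tot):
    """Min and max of sum_ij w[i][j] * p[i][j] over all 2x2 weight tables with
    nonnegative cells, row sums (row_tot, 1 - row_tot) and
    column sums (col_tot, 1 - col_tot)."""
    # The top-left cell t determines the whole table; nonnegativity bounds t.
    lo_t = max(0.0, row_tot + col_tot - 1.0)
    hi_t = min(row_tot, col_tot)

    def kpi(t):
        w = [[t, row_tot - t],
             [col_tot - t, 1.0 - row_tot - col_tot + t]]
        return sum(w[i][j] * p[i][j] for i in range(2) for j in range(2))

    # The objective is linear in t, so the optimum lies at an endpoint.
    values = [kpi(lo_t), kpi(hi_t)]
    return min(values), max(values)

# Hypothetical on-time probabilities per stratum
# (rows: size class, columns: destination zone).
p = [[0.95, 0.80],
     [0.85, 0.60]]
low, high = kpi_bounds_2x2(p, row_tot=0.7, col_tot=0.6)
```

With these made-up inputs the aggregate KPI is bounded between 0.83 and 0.86, whatever the unknown joint weights are, provided they respect the marginals.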
Probabilities of delivery within deadlines summarize distributions of delivery times better than their means and standard deviations, because the logistical effort needed to cut transit time by one more day grows as transit time gets shorter. Three types of graphs are proposed to help visualize this exponential increase in the service quality required. The applicability of the methodology developed is demonstrated on the 2023 database of the International Post Corporation. In this case, the imprecision of the KPI estimates depends much more on the uncertainty caused by disturbances occurring during the first and last miles than on the imperfection of the information about the real weights of the strata.
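As a small illustration of deadline-based indicators, the sketch below computes the share of items delivered within a given number of days for two hypothetical routes. The transit times are invented and the helper name is an assumption; the point is only that two routes can share the same mean transit time while their in-time probabilities differ at every deadline.

```python
from statistics import mean

def on_time_share(days, deadline):
    """Share of items delivered within `deadline` days."""
    return sum(d <= deadline for d in days) / len(days)

# Hypothetical transit times in days; both routes have the same mean.
route_a = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
route_b = [1, 1, 1, 1, 1, 1, 1, 1, 1, 11]

same_mean = mean(route_a) == mean(route_b)

# In-time shares for deadlines of 1, 2 and 3 days.
curve_a = {d: on_time_share(route_a, d) for d in (1, 2, 3)}
curve_b = {d: on_time_share(route_b, d) for d in (1, 2, 3)}
```

Here route A delivers nothing within one day and everything within two, whereas route B delivers 90% within one day and never reaches 100% within three days.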