In last-mile delivery, platforms suggest routes to drivers to improve delivery efficiency. However, this efficiency is often compromised when drivers deviate from the suggested routes. Balancing algorithmic clairvoyance with drivers’ preferences is therefore complex yet crucial. We propose a new approach that merges the benefits of the Efficiency-Oriented Delivery Routes (EODRs) suggested by the platform with drivers’ practical insights, resulting in Integrated Delivery Routes (IDRs). We first propose the Adjusted Net Reward (ANR) metric to characterize drivers’ evaluation criteria and employ an Inverse Reinforcement Learning (IRL) framework to learn it from observed routes and evaluations. We then introduce the Sequential Pooling then Selecting (SPS) method to generate IDRs efficiently. These routes align with real-world delivery scenarios while maintaining the high efficiency standards of EODRs, making them popular among drivers. We evaluate our approach using Amazon’s real-world data and find that the ANR trained by the IRL method accurately predicts driver preferences, outperforming traditional methods (a 42.29% accuracy improvement over the Inverse Optimization method). The IDRs generated by the SPS method enhance both driver satisfaction (improving ANR from 3.13 to 6.48) and operational efficiency (reducing transit time by 7.13% and time window violations by 35.44%). Our research improves smart-city operations by integrating machine learning with operations research, demonstrating that a human-centric approach can also increase delivery efficiency. This win-win outcome underscores the value of human-centric algorithm design for enabling smart urban logistics.