Standard setting is an important phase in the development of an examination program, especially for a high-stakes test. If the cut scores are not appropriately set, the results of the assessment could come into question. For this reason, establishing cut scores for a test has been considered an important and practical aspect of standard setting. This study investigates the validity of the cut scores established for the VSTEP.3-5 listening test, the first-ever standardized test of English proficiency in Vietnam. The study adopts the current argument-based validation approach, focusing on the three main inferences that make up the validity argument: (1) test tasks and items, (2) test reliability, and (3) cut scores. The argument is that, for the cut scores of the VSTEP.3-5 listening test to be valid, the test tasks and test items first need to be designed in accordance with the characteristics laid out in the test specifications. Second, the listening test scores must be sufficiently reliable to reasonably reflect test-takers’ listening proficiency. Third, the cut scores must be reasonably established for the VSTEP.3-5 listening test. In this study, qualitative and quantitative methods are combined and structured to gather evidence for and against the assumptions underlying each of these three inferences. This study offers contributions in three areas. First, it supports the widely held notion of validity as a unitary concept and of validation as the process of building an interpretive argument and collecting evidence in support of that argument. Second, it raises awareness of the importance of evaluating the cut scores of high-stakes language tests in Vietnam so that fairness can be ensured for all test-takers. Third, it contributes to the construction of a systematic, transparent, and defensible validity argument for the VSTEP.3-5 test in general and its listening component in particular.
The results of this study provide informative feedback on the establishment of the cut scores for the VSTEP.3-5 listening test, the test specifications, and the test development process. Positive results can provide evidence that strengthens the case for the reasonableness of the cut scores, the specifications, and the quality of the VSTEP.3-5 listening test. Negative results can suggest changes or improvements to the cut scores, the specifications, and the design of the VSTEP.3-5 listening test.